date:20160229

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Juergen Gross

On 29/02/16 04:37, Bob Liu wrote:
> 1) What is this patch about?
> This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> extra request which is used to pass through SCSI commands.
> This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> It can be extended easily to transmit other per-request/bio data from frontend
> to backend e.g Data Integrity Field per bio.
> 
> 2) Why we need this?
> Currently only raw data segments are transmitted from blkfront to blkback, 
> which
> means some advanced features are lost.
>  * Guest knows nothing about features of the real backend storage.
>   For example, on bare-metal environment INQUIRY SCSI command can be used
>   to query storage device information. If it's a SSD or flash device we
>   can have the option to use the device as a fast cache.
>   But this can't happen in current domU guests, because blkfront only
>   knows it's just a normal virtual disk
> 
>  * Failover Clusters in Windows
>   Failover clusters require SCSI-3 persistent reservation target disks,
>   but now this can't work in domU.
> 
> 3) Known issues:
>  * Security issues, how to 'validate' this extra request payload.
>E.g SCSI operates on LUN bases (the whole disk) while we really just want 
> to
>operate on partitions

It's not only validation: some operations just affect the whole LUN
(e.g. Reserve/Release). And what about "multi-LUN" commands like
"report LUNs"?

>  * Can't pass SCSI commands through if the backend storage driver is bio-based
>instead of request-based.
> 
> 4) Alternative approach: Using PVSCSI instead:
>  * Doubt PVSCSI can support as many type of backend storage devices as 
> Xen-block.

pvSCSI won't need to support all types of backends. It's enough to
support those where passing through SCSI commands makes sense.

Seems to be a similar issue as the above mentioned problem with
bio-based backend storage drivers.

>  * Much longer path:
>ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
> PVSCSI-backend -> Target framework(LIO?) ->
> 
>With xen-block we only need:
>ioctl() -> blkfront -> blkback ->

I'd like to see performance numbers before making a decision.

>  * xen-block has been existed for many years, widely used and more stable.

Adding another SCSI passthrough capability wasn't accepted for pvSCSI
(that's the reason I used the Target Framework). Why do you think it
will be accepted for pvblk?

This is not my personal opinion, just a heads up from someone who had a
try already. ;-)


Juergen

> Welcome any input, thank you!
> 
> Signed-off-by: Bob Liu 
> ---
>  xen/include/public/io/blkif.h |   73 
> +
>  1 file changed, 73 insertions(+)
> 
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index 99f0326..7c10bce 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -635,6 +635,28 @@
>  #define BLKIF_OP_INDIRECT  6
>  
>  /*
> + * Recognised only if "feature-extra-request" is present in backend xenbus 
> info.
> + * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is followed
> + * in the shared ring buffer.
> + *
> + * By this way, extra data like SCSI command, DIF/DIX and other 
> per-request/bio
> + * data can be transmitted from frontend to backend.
> + *
> + * The 'wire' format is like:
> + *  Request 1: xen_blkif_request
> + * [Request 2: xen_blkif_extra_request](only if request 1 has 
> BLKIF_OP_EXTRA_FLAG)
> + *  Request 3: xen_blkif_request
> + *  Request 4: xen_blkif_request
> + * [Request 5: xen_blkif_extra_request](only if request 4 has 
> BLKIF_OP_EXTRA_FLAG)
> + *  ...
> + *  Request N: xen_blkif_request
> + *
> + * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not* 
> create the
> + * "feature-extra-request" node!
> + */
> +#define BLKIF_OP_EXTRA_FLAG (0x80)
> +
> +/*
>   * Maximum scatter/gather segments per request.
>   * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
>   * NB. This could be 12 if the ring indexes weren't stored in the same page.
> @@ -703,10 +725,61 @@ struct blkif_request_indirect {
>  };
>  typedef struct blkif_request_indirect blkif_request_indirect_t;
>  
> +enum blkif_extra_request_type {
> + BLKIF_EXTRA_TYPE_SCSI_CMD = 1,  /* Transmit SCSI command.  */
> +};
> +
> +struct scsi_cmd_req {
> + /*
> +  * Grant mapping for transmiting SCSI command to backend, and
> +  * also receive sense data from backend.
> +  * One 4KB page is enough.
> +  */
> + grant_ref_t cmd_gref;
> + /* Length of SCSI command in the grant mapped page. */
> + unsigned int cmd_len;
> +
> + /*
> +  * SCSI command may require transmiting data segment length less
> +  * than a sector(512 bytes).
> +  * Record num_sg and last segment length in extra request so that

Re: [Xen-devel] [PATCH v5 09/11] xen: add capability to load initrd outside of initial mapping

2016-02-29 Thread Juergen Gross

On 26/02/16 16:41, Daniel Kiper wrote:
> On Fri, Feb 26, 2016 at 03:28:21PM +0100, Juergen Gross wrote:
>> On 26/02/16 15:00, Daniel Kiper wrote:
>>> On Thu, Feb 25, 2016 at 04:33:46PM +0100, Juergen Gross wrote:
 On 25/02/16 13:47, Daniel Kiper wrote:
> On Thu, Feb 25, 2016 at 12:33:35PM +0100, Juergen Gross wrote:
>> Modern pvops linux kernels support an initrd not covered by the initial
>> mapping. This capability is flagged by an elf-note.
>>
>> In case the elf-note is set by the kernel don't place the initrd into
>> the initial mapping. This will allow to load larger initrds and/or
>> support domains with larger memory, as the initial mapping is limited
>> to 2GB and it is containing the p2m list.
>>
>> Signed-off-by: Juergen Gross 
>> ---
>> V5: let call grub_xen_alloc_final() all subfunctions unconditionally
>> and let them decide whether they need to do anything
>> V4: rename grub_xen_alloc_end() to grub_xen_alloc_final()
>> ---
>>  grub-core/loader/i386/xen.c| 65 
>> ++
>>  grub-core/loader/i386/xen_fileXX.c |  3 ++
>>  include/grub/xen_file.h|  1 +
>>  3 files changed, 56 insertions(+), 13 deletions(-)
>>
>> diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
>> index 2e12763..466f0c0 100644
>> --- a/grub-core/loader/i386/xen.c
>> +++ b/grub-core/loader/i386/xen.c
>> @@ -228,6 +228,9 @@ grub_xen_p2m_alloc (void)
>>grub_size_t p2msize;
>>grub_err_t err;
>>
>> +  if (xen_state.virt_mfn_list)
>> +return GRUB_ERR_NONE;
>> +
>>xen_state.state.mfn_list = xen_state.max_addr;
>>xen_state.next_start.mfn_list =
>>  xen_state.max_addr + xen_state.xen_inf.virt_base;
>> @@ -250,6 +253,9 @@ grub_xen_special_alloc (void)
>>grub_relocator_chunk_t ch;
>>grub_err_t err;
>>
>> +  if (xen_state.virt_start_info)
>> +return GRUB_ERR_NONE;
>> +
>>err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
>>   xen_state.max_addr,
>>   sizeof (xen_state.next_start));
>> @@ -281,6 +287,9 @@ grub_xen_pt_alloc (void)
>>grub_uint64_t nr_info_pages;
>>grub_uint64_t nr_pages, nr_pt_pages, nr_need_pages;
>>
>> +  if (xen_state.virt_pgtable)
>> +return GRUB_ERR_NONE;
>> +
>>xen_state.next_start.pt_base =
>>  xen_state.max_addr + xen_state.xen_inf.virt_base;
>>xen_state.state.paging_start = xen_state.max_addr >> PAGE_SHIFT;
>> @@ -320,6 +329,24 @@ grub_xen_pt_alloc (void)
>>  }
>>
>>  static grub_err_t
>> +grub_xen_alloc_final (void)
>> +{
>> +  grub_err_t err;
>> +
>> +  err = grub_xen_p2m_alloc ();
>> +  if (err)
>> +return err;
>> +  err = grub_xen_special_alloc ();
>> +  if (err)
>> +return err;
>> +  err = grub_xen_pt_alloc ();
>> +  if (err)
>> +return err;
>
> Could you move grub_xen_p2m_alloc() here? This way grub_xen_alloc_final()
> will be real final in patch 11 and you do not need an extra condition
> around grub_xen_p2m_alloc().

 No. Page tables must be allocated after p2m unless p2m is outside of
 initial kernel mapping (patch 11).
>>>
>>> OK, I was afraid of that. However, then grub_xen_alloc_final() is not final.
>>
>> Sure it is. Patch 11 just adds allocation to grub_xen_alloc_final(), not
>> after it.
> 
> Excerpt from patch 11:
> 
> @@ -474,6 +503,12 @@ grub_xen_boot (void)
>err = grub_xen_alloc_final ();
>if (err)
>  return err;
> +  if (xen_state.xen_inf.has_p2m_base)
> +{
> +  err = grub_xen_p2m_alloc ();
> +  if (err)
> +return err;
> +}
> 

Hmph. No idea what I looked at when writing my previous reply.

>>> So, maybe it should be called grub_xen_alloc_boot_data() or something like 
>>> that.
> 
> Then my previous comment is still valid.

What about naming it grub_xen_alloc_kernel_mapping() ? This is what it
really does: It is allocating all not yet allocated areas which are to
be included in the initial kernel mapping.

Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 15:48,  wrote:
> On 02/24/16 09:54, Jan Beulich wrote:
>> >>> On 24.02.16 at 16:48,  wrote:
>> > On 02/24/16 07:24, Jan Beulich wrote:
>> >> >>> On 24.02.16 at 14:28,  wrote:
>> >> > On 02/18/16 10:17, Jan Beulich wrote:
>> >> >> >>> On 01.02.16 at 06:44,  wrote:
>> >> >> > 3.3 Guest ACPI Emulation
>> >> >> > 
>> >> >> > 3.3.1 My Design
>> >> >> > 
>> >> >> >  Guest ACPI emulation is composed of two parts: building guest NFIT
>> >> >> >  and SSDT that defines ACPI namespace devices for NVDIMM, and
>> >> >> >  emulating guest _DSM.
>> >> >> > 
>> >> >> >  (1) Building Guest ACPI Tables
>> >> >> > 
>> >> >> >   This design reuses and extends hvmloader's existing mechanism that
>> >> >> >   loads passthrough ACPI tables from binary files to load NFIT and
>> >> >> >   SSDT tables built by QEMU:
>> >> >> >   1) Because the current QEMU does not building any ACPI tables when
>> >> >> >  it runs as the Xen device model, this design needs to patch QEMU
>> >> >> >  to build NFIT and SSDT (so far only NFIT and SSDT) in this case.
>> >> >> > 
>> >> >> >   2) QEMU copies NFIT and SSDT to the end of guest memory below
>> >> >> >  4G. The guest address and size of those tables are written into
>> >> >> >  xenstore 
>> >> >> > (/local/domain/domid/hvmloader/dm-acpi/{address,length}).
>> >> >> > 
>> >> >> >   3) hvmloader is patched to probe and load device model passthrough
>> >> >> >  ACPI tables from above xenstore keys. The detected ACPI tables
>> >> >> >  are then appended to the end of existing guest ACPI tables just
>> >> >> >  like what current construct_passthrough_tables() does.
>> >> >> > 
>> >> >> >   Reasons for this design are listed below:
>> >> >> >   - NFIT and SSDT in question are quite self-contained, i.e. they do
>> >> >> > not refer to other ACPI tables and not conflict with existing
>> >> >> > guest ACPI tables in Xen. Therefore, it is safe to copy them from
>> >> >> > QEMU and append to existing guest ACPI tables.
>> >> >> 
>> >> >> How is this not conflicting being guaranteed? In particular I don't
>> >> >> see how tables containing AML code and coming from different
>> >> >> sources won't possibly cause ACPI name space collisions.
>> >> >>
>> >> > 
>> >> > Really there is no effective mechanism to avoid ACPI name space
>> >> > collisions (and other kinds of conflicts) between ACPI tables loaded
>> >> > from QEMU and ACPI tables built by hvmloader. Because which ACPI tables
>> >> > are loaded is determined by developers, IMO it's developers'
>> >> > responsibility to avoid any collisions and conflicts with existing ACPI
>> >> > tables.
>> >> 
>> >> Right, but this needs to be spelled out and settled on at design
>> >> time (i.e. now), rather leaving things unspecified, awaiting the
>> >> first clash.
>> > 
>> > So that means if no collision-proof mechanism is introduced, Xen should not
>> > trust any passed-in ACPI tables and should build them by itself?
>> 
>> Basically yes, albeit collision-proof may be too much to demand.
>> Simply separating name spaces (for hvmloader and qemu to have
>> their own sub-spaces) would be sufficient imo. We should trust
>> ourselves to play by such a specification.
>>
> 
> I don't quite understand 'separating name spaces'. Do you mean, for
> example, if both hvmloader and qemu want to put a namespace device under
> \_SB, they could be put in different sub-scopes under \_SB? But it does
> not work for Linux at least.

Aiui just the leaf names matter for sufficient separation, i.e.
recurring sub-scopes should not be a problem.

> Anyway, we may avoid some conflicts between ACPI tables/objects by
> restricting which tables and objects can be passed from QEMU to Xen:
> (1) For ACPI tables, xen does not accept those built by itself,
> e.g. DSDT and SSDT.
> (2) xen does not accept ACPI tables for devices that are not attached to
> a domain, e.g. if NFIT cannot be passed if a domain does not have
> vNVDIMM.
> (3) For ACPI objects, xen only accepts namespace devices and requires
> their names does not conflict with existing ones provided by Xen.

And how do you imagine to enforce this without parsing the
handed AML? (Remember there's no AML parser in hvmloader.)

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 6/6] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 03:45,  wrote:
> On 02/29/16 10:02, Tian, Kevin wrote:
>> > From: Jan Beulich [mailto:jbeul...@suse.com]
>> > Sent: Friday, February 26, 2016 4:01 PM
>> > 
>> > >>> On 26.02.16 at 05:37,  wrote:
>> > >>  From: Zhang, Haozhong
>> > >> Sent: Tuesday, February 23, 2016 10:05 AM
>> > >>
>> > >> Signed-off-by: Haozhong Zhang 
>> > >
>> > > Reviewed-by: Kevin Tian , except:
>> > >
>> > >> +
>> > >> +Hardware TSC Scaling
>> > >> +
>> > >> +Intel VMX TSC scaling and AMD SVM TSC ratio allow the guest TSC read
>> > >> +by guest rdtsc/p increasing in a different frequency than the host
>> > >> +TSC frequency.
>> > >> +
>> > >> +If a HVM container in default TSC mode (tsc_mode=0) or PVRDTSCP mode
>> > >
>> > > 'HVM container' means something different. We usually use "HVM domain"
>> > > as you may see in other places in this doc.
>> > 
>> > But I think this is specifically meant to refer to both HVM and PVH
>> > domains.
>> > 
>> 
>> First, I have a feeling that many people today refer to containers
>> running within a VM as 'VM container', which is a bit confusing to
>> 'HVM container' purpose here. Couldn't we use 'HVM domains'
>> to cover both HVM and PVH (which is PV-HVM)? Curious whether
>> there is formal definition of those terminologies...
>>
> 
> I call it 'HVM container' because I use has_hvm_container_domain(d)
> | #define has_hvm_container_domain(d) ((d)->guest_type != guest_type_pv)
> to check whether TSC scaling can be used by a domain, which, in current
> implementation, is either a HVM domain (d->guest_type == guest_type_hvm)
> or a PVH domain (d->guest_type == guest_type_pvh).
> 
> And I also noticed another macro is_hvm_domain(d)
> | #define is_hvm_domain(d) ((d)->guest_type == guest_type_hvm)
> so I think 'HVM domain' can not be used to refer to both HVM and PVH
> domains.

Indeed.

>> Second, even when 'HVM container' can be used as you explained,
>> it's inconsistent with other places in same doc, where only 'HVM
>> domain' is used. I'd think consistency is more important in this
>> patch series, and then if 'HVM container' is really preferred which
>> should be a separate patch to update all related docs.

Yes, inconsistencies in the docs are likely a result of them not
having got updated when PVH got introduced, of PVH existence
being neglected at the time they got put in.

> Or, maybe I should make it explicit, i.e. using 'HVM and PVH domains'
> rather than 'HVM container'.

That's an option, but personally I think a worse one than the
"HVM container" term.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/3] x86/xsaves: fix overwriting between non-lazy/lazy xsave[sc]

2016-02-29 Thread Shuai Ruan

On Fri, Feb 26, 2016 at 01:42:35AM -0700, Jan Beulich wrote:
> >>> On 26.02.16 at 08:41,  wrote:
> > On Wed, Feb 24, 2016 at 02:16:38AM -0700, Jan Beulich wrote:
> >> >> The description lacks any mention of the performance impact,
> >> >> and what investigation was done to find ways to perhaps
> >> >> overcome this. For example, regardless of cpu_has_xsaves,
> >> >> do we really always need to _use_ XSAVES?
> >> >> 
> >> > Currently no supervisor xstates is enabled in xen or even in
> >> > guest os. Using xsaves is a little ahead (xsavec may used).  
> >> > xsavec may also cause overwriting problem like xsaves.
> >> > I will add performance impact in the description. 
> >> > I am still thinking of a better way to overcome the overhead xsave
> >> > (But have not get a better solution yet).
> >> 
> >> I was thinking that taking into consideration the features a guest
> >> has ever used (i.e. v->arch.xcr0_accum) to decide which variant
> >> to use may be a worthwhile avenue to explore.
> >> 
> > Oh, Thanks for your suggestion.
> > You mean when (v->arch.xcr0_accum & (XSTATE_LAZY & ~XSTATE_FP_SSE)) return 
> > false,
> > we can return XSTATE_NONLAZY in vcpu_xsave_mask when using xsave[cs]
> > otherwise return XSTATE_ALL.
> > It means we can save the area safely, if the guest has ever use 
> > XSTATE_NONLAZY | XSTATE_FP_SSE only. 
> 
> That's one step in the right direction. But the main difference
> between XSAVE/XSAVEOPT and XSAVEC/XSAVES is that the former
> can be used incrementally, while the latter can't. And I highly
> doubt the modified optimization the latter two support will kick in
> very often, since there's quite good a chance that the guest
> itself executed another one of these between two of our instances.
> Which to me means it should at least be investigated whether using
> XSAVEOPT in favor of XSAVEC wouldn't produce better performance,
> and whether XSAVES wouldn't better be used only if the guest uses
> a component which can only be saved that way.
> 
Thanks. 

Ok , I will do the performace test. And can you suggest me some 
workload/benchmark 
can be used here to the xsave related performance test ?


Other thing is from the text above, I guess that the best way to solve xsave[cs]
problem is:
1. use xsaveopt instead of xsave[cs] nowdays.
2. use xsaves whenever a component can only be saved that way( when some
   supervised components are supported in xen).
3. no xsavec support.
4. xsavec/xsaves feature will expose guest os if point 2 is ok.

If the above 4 points are ok to you, then I will write a patch to do
this.

> Jan
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Paul Durrant

> -Original Message-
> From: Bob Liu [mailto:bob@oracle.com]
> Sent: 29 February 2016 03:37
> To: xen-devel@lists.xen.org
> Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne; jgr...@suse.com;
> Paul Durrant; konrad.w...@oracle.com; Bob Liu
> Subject: [RFC PATCH] xen-block: introduces extra request to pass-through
> SCSI commands
> 
> 1) What is this patch about?
> This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> extra request which is used to pass through SCSI commands.
> This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> It can be extended easily to transmit other per-request/bio data from
> frontend
> to backend e.g Data Integrity Field per bio.
> 
> 2) Why we need this?
> Currently only raw data segments are transmitted from blkfront to blkback,
> which
> means some advanced features are lost.
>  * Guest knows nothing about features of the real backend storage.
>   For example, on bare-metal environment INQUIRY SCSI command
> can be used
>   to query storage device information. If it's a SSD or flash device we
>   can have the option to use the device as a fast cache.
>   But this can't happen in current domU guests, because blkfront only
>   knows it's just a normal virtual disk
> 

That's the sort of information that should be advertised via xenstore then. 
There already feature flags for specific things but if some form of 
throughput/latency information is meaningful to a frontend stack then perhaps 
that can be advertised too.

>  * Failover Clusters in Windows
>   Failover clusters require SCSI-3 persistent reservation target disks,
>   but now this can't work in domU.
> 

That's true but allowing arbitrary SCSI messages through is not the way forward 
IMO. Just because Windows thinks every HBA is SCSI doesn't mean other OS do so 
I think reservation/release should have dedicated messages in the blkif 
protocol if it's desirable to support clustering in the frontend.

> 3) Known issues:
>  * Security issues, how to 'validate' this extra request payload.
>E.g SCSI operates on LUN bases (the whole disk) while we really just want
> to
>operate on partitions
> 
>  * Can't pass SCSI commands through if the backend storage driver is bio-
> based
>instead of request-based.
> 
> 4) Alternative approach: Using PVSCSI instead:
>  * Doubt PVSCSI can support as many type of backend storage devices as
> Xen-block.
> 

LIO can interface to any block device in much the same way blkback does IIRC.

>  * Much longer path:
>ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> PVSCSI-
> backend -> Target framework(LIO?) ->
> 
>With xen-block we only need:
>ioctl() -> blkfront -> blkback ->
> 

...and what happens if the block device that blkback is talking to is a SCSI 
LUN?

That latter path is also not true for Windows. You've got all the SCSI 
translation logic in the frontend when using blkif so that first path would 
collapse to:

Disk driver -> (SCSI) HBA Driver -> xen-scsiback -> LIO -> backstore -> XXX

>  * xen-block has been existed for many years, widely used and more stable.
> 

It's definitely widely used, but it has had stability issues in recent times.

  Paul

> Welcome any input, thank you!
> 
> Signed-off-by: Bob Liu 
> ---
>  xen/include/public/io/blkif.h |   73
> +
>  1 file changed, 73 insertions(+)
> 
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index 99f0326..7c10bce 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -635,6 +635,28 @@
>  #define BLKIF_OP_INDIRECT  6
> 
>  /*
> + * Recognised only if "feature-extra-request" is present in backend xenbus
> info.
> + * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is
> followed
> + * in the shared ring buffer.
> + *
> + * By this way, extra data like SCSI command, DIF/DIX and other per-
> request/bio
> + * data can be transmitted from frontend to backend.
> + *
> + * The 'wire' format is like:
> + *  Request 1: xen_blkif_request
> + * [Request 2: xen_blkif_extra_request](only if request 1 has
> BLKIF_OP_EXTRA_FLAG)
> + *  Request 3: xen_blkif_request
> + *  Request 4: xen_blkif_request
> + * [Request 5: xen_blkif_extra_request](only if request 4 has
> BLKIF_OP_EXTRA_FLAG)
> + *  ...
> + *  Request N: xen_blkif_request
> + *
> + * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not*
> create the
> + * "feature-extra-request" node!
> + */
> +#define BLKIF_OP_EXTRA_FLAG (0x80)
> +
> +/*
>   * Maximum scatter/gather segments per request.
>   * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
>   * NB. This could be 12 if the ring indexes weren't stored in the same page.
> @@ -703,10 +725,61 @@ struct blkif_request_indirect {
>  };
>  typedef struct blkif_request_indirect blkif_request_in

Re: [Xen-devel] [PATCH v4 10/11] xen: modify page table construction

2016-02-29 Thread Juergen Gross

On 25/02/16 19:33, Andrei Borzenkov wrote:
> 22.02.2016 16:14, Juergen Gross пишет:
>> On 22/02/16 13:48, Daniel Kiper wrote:
>>> On Mon, Feb 22, 2016 at 01:30:30PM +0100, Juergen Gross wrote:
 On 22/02/16 13:18, Daniel Kiper wrote:
> On Mon, Feb 22, 2016 at 10:29:04AM +0100, Juergen Gross wrote:
>> On 22/02/16 10:17, Daniel Kiper wrote:
>>> On Mon, Feb 22, 2016 at 07:03:18AM +0100, Juergen Gross wrote:
 diff --git a/grub-core/lib/xen/relocator.c 
 b/grub-core/lib/xen/relocator.c
 index 8f427d3..a05b253 100644
 --- a/grub-core/lib/xen/relocator.c
 +++ b/grub-core/lib/xen/relocator.c
 @@ -29,6 +29,11 @@

  typedef grub_addr_t grub_xen_reg_t;

 +struct grub_relocator_xen_paging_area {
 +  grub_xen_reg_t start;
 +  grub_xen_reg_t size;
 +};
 +
>>>
>>> ... this should have GRUB_PACKED because compiler may
>>> add padding to align size member.
>>
>> Why would the compiler add padding to a structure containing two items
>> of the same type? I don't think the C standard would allow this.
>>
>> grub_xen_reg_t is either unsigned (32 bit) or unsigned long (64 bit).
>> There is no way this could require any padding.
>
> You are right but we should add this here just in case.

 Sorry, I don't think this makes any sense. The C standard is very clear
 in this case: a type requiring a special alignment has always a length
 being a multiple of that alignment. Otherwise arrays wouldn't work.
>>>
>>> Sorry, I am not sure what do you mean by that.
>>
>> The size of any C type (no matter whether it is an integral type like
>> "int" or a structure) has always the same alignment restriction as the
>> type itself. So a type requiring 8 byte alignment will always have a
>> size of a multiple of 8 bytes. This is mandatory for arrays to work, as
>> otherwise either the elements wouldn't be placed consecutively in memory
>> or the alignment restrictions wouldn't be obeyed for all elements.
>>
> 
> I too not follow how it is relevant to this case. We talk about internal
> padding between structure members, not between array elements.
> 
>> For our case it means that two structure elements of the same type will
>> never require a padding between them, thus the annotation with "packed"
>> can't serve any purpose.
>>
> 
> Well, I am not aware of any requirement. Compiler may add arbitrary
> padding between structure elements; it is only prohibited to add padding
> at the beginning. Sure, it would be unusual, but never say "never" ...
> also should Xen ever be ported to architecture where types are not
> self-aligned it will become an issue.

So you are telling me that _all_ interfaces between e.g. Linux, grub2,
Xen and all wire protocols not attributed with "packed" are just wrong?

Sorry, I don't think this is true.

And before starting this weird attributing I'd like to see e.g. the
structures in multiboot.h and multiboot2.h to be updated according to
this philosophy.

BTW: Linux kernel isn't using "packed" for network protocols and other
structures where the same reasoning would apply.

 Adding GRUB_PACKED would make the code less clear IMO. Finding such a
 qualifier in some code I want to modify would make me search for the
 reason for it which isn't existing.
>>>
>>> If maintainers do not object I am not going to insist on that any longer.
>>
> 
> It seems inconsistent through the code, really. But I think in this case
> it should be packed. It does not look like it is performance critical
> and it ensures we always match assembler code.

In case you still insist on it, I'll change it. But I'm far from
convinced this is a good move.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 3/4] libelf: rewrite symtab/strtab loading

2016-02-29 Thread Jan Beulich

>>> On 26.02.16 at 18:02,  wrote:
> El 26/2/16 a les 14:15, Jan Beulich ha escrit:
> On 16.02.16 at 18:37,  wrote:
>>> --- a/xen/common/libelf/libelf-loader.c
>>> +++ b/xen/common/libelf/libelf-loader.c
>>> @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
>>> uint64_t pstart)
>>>  sz = sizeof(uint32_t);
>>>  
>>>  /* Space for the elf and elf section headers */
>>> -sz += (elf_uval(elf, elf->ehdr, e_ehsize) +
>>> -   elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize));
>>> +sz += elf_uval(elf, elf->ehdr, e_ehsize) +
>>> +  3 * elf_uval(elf, elf->ehdr, e_shentsize);
>> 
>> I think a literal 3 like this either needs a #define (at once serving
>> as documentation) or a comment. Perhaps rather the former,
>> considering that the (same?) 3 appears again further down. Plus -
>> what guarantees there are 3 section headers in the image?
> 
> This is not related to the image itself, but to the metadata that libelf
> places at the end of the loaded kernel in order to stash the symtab and
> strtab for the guest to use.

I understand this, yet imo the literal 3 still should be replaced.

> The layout is as follows (I should add this to the patch itself as a
> comment, since I guess this is still quite confusing):
> 
> ++
> ||
> | KERNEL |
> ||
> ++ pend
> | ALIGNMENT  |
> ++ bsd_symtab_pstart
> ||
> | size   | bsd_symtab_pend - bsd_symtab_pstart
> ||
> ++
> ||
> | ELF header |
> ||
> ++
> ||
> | ELF section header 0   | Dummy section header
> ||
> ++
> ||
> | ELF section header 1   | SYMTAB section header
> ||
> ++
> ||
> | ELF section header 2   | STRTAB section header
> ||
> ++
> ||
> | SYMTAB |
> ||
> ++
> ||
> | STRTAB |
> ||
> ++ bsd_symtab_pend
> 
> There are always only tree sections because that's all we need, section
> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is
> used to describe the STRTAB.

All fine, but this still doesn't clarify how the kernel learns where
bsd_symtab_pstart is.

> According to the ELF spec there can only be
> one SYMTAB, so that's all we need.

Did you perhaps overlook the spec saying "... but this restriction
may be relaxed in the future", plus the fact that we're talking
about an executable file here (which commonly have two symbol
tables - .dynsym and .symtab), not an object one? (This isn't to
say that you need to make the code handle multiple ones, if you
know that *BSD will only ever have one.)

>>>  sz = elf_round_up(elf, sz);
>>>  
>>> -/* Space for the symbol and string tables. */
>>> +/* Space for the symbol and string table. */
>>>  for ( i = 0; i < elf_shdr_count(elf); i++ )
>>>  {
>>>  shdr = elf_shdr_by_index(elf, i);
>>>  if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>>  /* input has an insane section header count field */
>>>  break;
>>> -type = elf_uval(elf, shdr, sh_type);
>>> -if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) )
>>> -sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>> +
>>> +if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB )
>>> +continue;
>>> +
>>> +sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>> +shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link));
>>> +
>>> +if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>> +/* input has an insane section header count field */
>>> +break;
>>> +
>>> +if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB )
>>> +/* Invalid symtab -> strtab link */
>>> +break;
>> 
>> This is not sufficient - what if sh_link is out of bounds, but the
>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly
>> enough you have at least an SHN_UNDEF check in the second
>> loop below.)
> 
> The out-of-bounds access would be detected by elf_access_ok (if it's out
> of the scope of the ELF file, which is all we care about). In fact the
> elf_access_ok above could be removed because elf_uval already performs
> out-of-bounds checks. The result is definitely no worse that what we are
> doing ATM.

No, the out of bounds check should be more strict than just
considering the whole image: The image is broken if the link
points

Re: [Xen-devel] [PATCH 2/3] x86/xsaves: fix overwriting between non-lazy/lazy xsave[sc]

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 10:06,  wrote:
> On Fri, Feb 26, 2016 at 01:42:35AM -0700, Jan Beulich wrote:
>> >>> On 26.02.16 at 08:41,  wrote:
>> > On Wed, Feb 24, 2016 at 02:16:38AM -0700, Jan Beulich wrote:
>> >> >> The description lacks any mention of the performance impact,
>> >> >> and what investigation was done to find ways to perhaps
>> >> >> overcome this. For example, regardless of cpu_has_xsaves,
>> >> >> do we really always need to _use_ XSAVES?
>> >> >> 
>> >> > Currently no supervisor xstates is enabled in xen or even in
>> >> > guest os. Using xsaves is a little ahead (xsavec may used).  
>> >> > xsavec may also cause overwriting problem like xsaves.
>> >> > I will add performance impact in the description. 
>> >> > I am still thinking of a better way to overcome the overhead xsave
>> >> > (But have not get a better solution yet).
>> >> 
>> >> I was thinking that taking into consideration the features a guest
>> >> has ever used (i.e. v->arch.xcr0_accum) to decide which variant
>> >> to use may be a worthwhile avenue to explore.
>> >> 
>> > Oh, Thanks for your suggestion.
>> > You mean when (v->arch.xcr0_accum & (XSTATE_LAZY & ~XSTATE_FP_SSE)) return 
>> > false,
>> > we can return XSTATE_NONLAZY in vcpu_xsave_mask when using xsave[cs]
>> > otherwise return XSTATE_ALL.
>> > It means we can save the area safely, if the guest has ever use 
>> > XSTATE_NONLAZY | XSTATE_FP_SSE only. 
>> 
>> That's one step in the right direction. But the main difference
>> between XSAVE/XSAVEOPT and XSAVEC/XSAVES is that the former
>> can be used incrementally, while the latter can't. And I highly
>> doubt the modified optimization the latter two support will kick in
>> very often, since there's quite good a chance that the guest
>> itself executed another one of these between two of our instances.
>> Which to me means it should at least be investigated whether using
>> XSAVEOPT in favor of XSAVEC wouldn't produce better performance,
>> and whether XSAVES wouldn't better be used only if the guest uses
>> a component which can only be saved that way.
>> 
> Thanks. 
> 
> Ok , I will do the performace test. And can you suggest me some 
> workload/benchmark 
> can be used here to the xsave related performance test ?

Measuring just instruction execution time should be fine for the
purpose here, I think.

> Other thing is from the text above, I guess that the best way to solve 
> xsave[cs]
> problem is:
> 1. use xsaveopt instead of xsave[cs] nowdays.
> 2. use xsaves whenever a component can only be saved that way( when some
>supervised components are supported in xen).
> 3. no xsavec support.
> 4. xsavec/xsaves feature will expose guest os if point 2 is ok.

Provided this results in better performance than the alternative(s),
yes.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-02-29 Thread Haozhong Zhang

On 02/29/16 02:01, Jan Beulich wrote:
> >>> On 28.02.16 at 15:48,  wrote:
> > On 02/24/16 09:54, Jan Beulich wrote:
> >> >>> On 24.02.16 at 16:48,  wrote:
> >> > On 02/24/16 07:24, Jan Beulich wrote:
> >> >> >>> On 24.02.16 at 14:28,  wrote:
> >> >> > On 02/18/16 10:17, Jan Beulich wrote:
> >> >> >> >>> On 01.02.16 at 06:44,  wrote:
> >> >> >> > 3.3 Guest ACPI Emulation
> >> >> >> > 
> >> >> >> > 3.3.1 My Design
> >> >> >> > 
> >> >> >> >  Guest ACPI emulation is composed of two parts: building guest NFIT
> >> >> >> >  and SSDT that defines ACPI namespace devices for NVDIMM, and
> >> >> >> >  emulating guest _DSM.
> >> >> >> > 
> >> >> >> >  (1) Building Guest ACPI Tables
> >> >> >> > 
> >> >> >> >   This design reuses and extends hvmloader's existing mechanism 
> >> >> >> > that
> >> >> >> >   loads passthrough ACPI tables from binary files to load NFIT and
> >> >> >> >   SSDT tables built by QEMU:
> >> >> >> >   1) Because the current QEMU does not building any ACPI tables 
> >> >> >> > when
> >> >> >> >  it runs as the Xen device model, this design needs to patch 
> >> >> >> > QEMU
> >> >> >> >  to build NFIT and SSDT (so far only NFIT and SSDT) in this 
> >> >> >> > case.
> >> >> >> > 
> >> >> >> >   2) QEMU copies NFIT and SSDT to the end of guest memory below
> >> >> >> >  4G. The guest address and size of those tables are written 
> >> >> >> > into
> >> >> >> >  xenstore 
> >> >> >> > (/local/domain/domid/hvmloader/dm-acpi/{address,length}).
> >> >> >> > 
> >> >> >> >   3) hvmloader is patched to probe and load device model 
> >> >> >> > passthrough
> >> >> >> >  ACPI tables from above xenstore keys. The detected ACPI tables
> >> >> >> >  are then appended to the end of existing guest ACPI tables 
> >> >> >> > just
> >> >> >> >  like what current construct_passthrough_tables() does.
> >> >> >> > 
> >> >> >> >   Reasons for this design are listed below:
> >> >> >> >   - NFIT and SSDT in question are quite self-contained, i.e. they 
> >> >> >> > do
> >> >> >> > not refer to other ACPI tables and not conflict with existing
> >> >> >> > guest ACPI tables in Xen. Therefore, it is safe to copy them 
> >> >> >> > from
> >> >> >> > QEMU and append to existing guest ACPI tables.
> >> >> >> 
> >> >> >> How is this not conflicting being guaranteed? In particular I don't
> >> >> >> see how tables containing AML code and coming from different
> >> >> >> sources won't possibly cause ACPI name space collisions.
> >> >> >>
> >> >> > 
> >> >> > Really there is no effective mechanism to avoid ACPI name space
> >> >> > collisions (and other kinds of conflicts) between ACPI tables loaded
> >> >> > from QEMU and ACPI tables built by hvmloader. Because which ACPI 
> >> >> > tables
> >> >> > are loaded is determined by developers, IMO it's developers'
> >> >> > responsibility to avoid any collisions and conflicts with existing 
> >> >> > ACPI
> >> >> > tables.
> >> >> 
> >> >> Right, but this needs to be spelled out and settled on at design
> >> >> time (i.e. now), rather leaving things unspecified, awaiting the
> >> >> first clash.
> >> > 
> >> > So that means if no collision-proof mechanism is introduced, Xen should 
> >> > not
> >> > trust any passed-in ACPI tables and should build them by itself?
> >> 
> >> Basically yes, albeit collision-proof may be too much to demand.
> >> Simply separating name spaces (for hvmloader and qemu to have
> >> their own sub-spaces) would be sufficient imo. We should trust
> >> ourselves to play by such a specification.
> >>
> > 
> > I don't quite understand 'separating name spaces'. Do you mean, for
> > example, if both hvmloader and qemu want to put a namespace device under
> > \_SB, they could be put in different sub-scopes under \_SB? But it does
> > not work for Linux at least.
> 
> Aiui just the leaf names matter for sufficient separation, i.e.
> recurring sub-scopes should not be a problem.
>
> > Anyway, we may avoid some conflicts between ACPI tables/objects by
> > restricting which tables and objects can be passed from QEMU to Xen:
> > (1) For ACPI tables, xen does not accept those built by itself,
> > e.g. DSDT and SSDT.
> > (2) xen does not accept ACPI tables for devices that are not attached to
> > a domain, e.g. if NFIT cannot be passed if a domain does not have
> > vNVDIMM.
> > (3) For ACPI objects, xen only accepts namespace devices and requires
> > their names does not conflict with existing ones provided by Xen.
> 
> And how do you imagine to enforce this without parsing the
> handed AML? (Remember there's no AML parser in hvmloader.)
>

As I proposed in last reply, instead of passing an entire ACPI object,
QEMU passes the device name and the AML code under the AML device entry
separately. Because the name is explicitly given, no AML parser is
needed in hvmloader.

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] staging: libxl compile error in libxl__domain_save

2016-02-29 Thread Andrew Cooper

On 29/02/16 06:54, Olaf Hering wrote:
> On Sun, Feb 28, Wei Liu wrote:
>
>> If the current set of compiler flags is not good enough, we should
>> improve it. I'm afraid having a third set of maintainer mode flags that
>> nobody else uses is going to cause us more headache.
> There is nothing wrong with the CFLAGS. They are perfect for developers.
> But the commiters hopefully do some sort of compile test before doing a
> push. And this compile test must include -O2 to enable enough
> diagnostic to catch developer errors.

 gcc  -O1 -fno-omit-frame-pointer -m64 -g -fno-strict-aliasing
-std=gnu99 -Wall
-Wstrict-prototypes -Wdeclaration-after-statement
-Wno-unused-but-set-variable
-Wno-unused-local-typedefs   -O0 -g3
-D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MF
.libxl_dom_save.o.d -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE
-fno-optimize-sibling-calls -fmessage-length=0 -grecord-gcc-switches -O2
-Wall
-D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables
-fasynchronous-unwind-tables -g -Werror -Wno-format-zero-length
-Wmissing-declarations -Wno-declaration-after-statement
-Wformat-nonliteral -I.
-fPIC -pthread

This set of options is very messy.  We have both an -O1, an -O0 and an
-O2, as well as three different -g's

Frankly, at no point ever should -O0 be used, even for debugging.  -Og
if available or -O1 if not.

In this case, the -O2, being latest, should take priority.  However, it
would be useful to identify which flags are coming from where, and see
if we usefully reduce them.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 10:45,  wrote:
> On 02/29/16 02:01, Jan Beulich wrote:
>> >>> On 28.02.16 at 15:48,  wrote:
>> > Anyway, we may avoid some conflicts between ACPI tables/objects by
>> > restricting which tables and objects can be passed from QEMU to Xen:
>> > (1) For ACPI tables, xen does not accept those built by itself,
>> > e.g. DSDT and SSDT.
>> > (2) xen does not accept ACPI tables for devices that are not attached to
>> > a domain, e.g. if NFIT cannot be passed if a domain does not have
>> > vNVDIMM.
>> > (3) For ACPI objects, xen only accepts namespace devices and requires
>> > their names does not conflict with existing ones provided by Xen.
>> 
>> And how do you imagine to enforce this without parsing the
>> handed AML? (Remember there's no AML parser in hvmloader.)
> 
> As I proposed in last reply, instead of passing an entire ACPI object,
> QEMU passes the device name and the AML code under the AML device entry
> separately. Because the name is explicitly given, no AML parser is
> needed in hvmloader.

I must not only have missed that proposal, but I also don't see
how you mean this to work: Are you suggesting for hvmloader to
construct valid AML from the passed in blob? Or are you instead
considering to pass redundant information (name once given
explicitly and once embedded in the AML blob), allowing the two
to be out of sync?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v2 3/7] firmware: port built-in section to linker table

2016-02-29 Thread David Woodhouse

On Fri, 2016-02-19 at 05:45 -0800, Luis R. Rodriguez wrote:
> This ports built-in firmware to use linker tables,
> this replaces the custom section solution with a
> generic solution.
> 
> This also demos the use of the .rodata (SECTION_RO)
> linker tables.
> 
> Tested with 0 built-in firmware, 1 and 2 built-in
> firmwares successfully.

I think we'd do better to rip this support out entirely. It just isn't
needed; firmware can live in an initramfs and don't even need *any*
actual running userspace support to load it from there these days, do
we?

-- 
dwmw2

smime.p7s
Description: S/MIME cryptographic signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V14 4/6] libxl: add pvusb API

2016-02-29 Thread George Dunlap

On 26/02/16 12:09, Ian Jackson wrote:
> George Dunlap writes ("Re: [Xen-devel] [PATCH V14 4/6] libxl: add pvusb API"):
>> On Fri, Feb 19, 2016 at 10:39 AM, Chunyan Liu  wrote:
>>> +  [...]
>>
>> So I see below that you're calling this before removing things from
>> xenstore, so that if any of these fail, the user can still call "xl
>> usb-remove" to retry.
>>
>> Unfortunately, since you reassign interfaces to the original driver
>> before all interfaces are de-assigned from usbback, you can end up in
>> a situation where the device is partially re-plugged into the original
>> drivers, partially still assigned to pciback.
>>
>> I think this whole process should be three steps:
>>
>> 1. Unassign all interfaces from usbback, stopping and returning an
>> error as soon as one attempt fails
>>
>> 2. Remove the pvusb xenstore nodes (stopping and returning an error if it 
>> fails)
>>
>> 3. Attempt to re-assign all interfaces to the original drivers,
>> stopping and returning an error as soon as one attempt fails.
> 
> This seems like a good plan to me.
> 
> (Making 3 after 2 re-attemptable would mean that the original driver
> information needs to be saved in a different location in xenstore to
> the pvusb control nodes, but that is not a problem.)
> 
>> * If one of the re-assignments fail, then the user will have to reload
>> the drivers, reboot, or mess around with sysfs to fix things.
>> *However*, this avoids a scenario where a user is completely unable to
>> remove a device from a domain because of a buggy driver.
> 
> Right.
> 
>> I realize this falls short of the "crash-only" design IanJ suggested
>> we try to do, but I think that in this case the work required to do
>> such a design would be a lot more work than the benefit it gives us.
> 
> I think what you have above is indeed crash-only.  You can tell by all
> the "if any error occurs, stop immediately".
> 
> The only wrinkle is that the obvious implementation reads the old
> bindings from xenstore between 1 and 2, deletes the information from
> xenstore in 2, and uses that information in step 3, which is cheating
> (and leads to the sysfs-wrangling-required state).  But that is quite
> easy to avoid:
>   - record the old driver bindings somewhere in xenstore (keyed by
> the physical host device, not the guest domain)
>   - provide a libxl call and corresponding xl command which attempts
> to rebind

The re-bind information is already stored in a different location, keyed
by physical host device. :-) (Search for uses of USBBACK_INFO_PATH.)

But we would, as you say, need to add a separate function/command for
doing clean-up (simply calling xl usb-detach on the virtual device again
doesn't make much sense, since the virtual devices no longer shows up in
xl usb-list).  Since we don't have another such command to copy, we'd be
inventing a new one, which means thinking very carefully about the
design of the interface (since we'd want future such functions to follow
this precedent if possible), &c &c, so...

> But, FAOD, I do not want to block this patch until such a thing is
> implemented.  I think it is sufficient to document the existence of
> the awkward state, and the likely workarounds.

Great, thanks. :-)

 -George



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] identify a Xen PV domU to fix devmem_is_allowed

2016-02-29 Thread Olaf Hering

What is the correct way to identify a Xen PV domU in the kenrel?
devmem_is_allowed() used to disable access to pages < 256 in domU.
With pvops this check was removed, or never ported forward.

Would this change be the correct fix?

+++ b/arch/x86/mm/init.c
@@ -637,7 +637,7 @@ void __init init_mem_mapping(void)
 int devmem_is_allowed(unsigned long pagenr)
 {
if (pagenr < 256)
-   return 1;
+   return !xen_pv_domain();
if (iomem_is_exclusive(pagenr << PAGE_SHIFT))
return 0;
if (!page_is_ram(pagenr))

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] staging: libxl compile error in libxl__domain_save

2016-02-29 Thread Olaf Hering

On Mon, Feb 29, Andrew Cooper wrote:

> This set of options is very messy.  We have both an -O1, an -O0 and an
> -O2, as well as three different -g's

-O1 is a global CFLAG, -O0 is appended in tools/, and -O2 comes from
RPM_OPT_FLAGS.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [linux-3.10 baseline-only test] 44194: tolerable FAIL

2016-02-29 Thread Platform Team regression test user

This run is configured for baseline tests only.

flight 44194 linux-3.10 real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/44194/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-amd64-rumpuserxen   6 xen-buildfail blocked in 44176
 build-i386-rumpuserxen6 xen-buildfail blocked in 44176
 test-amd64-amd64-xl 19 guest-start/debian.repeat fail blocked in 44176
 test-amd64-amd64-xl-credit2 19 guest-start/debian.repeat fail blocked in 44176
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail 
blocked in 44176
 test-amd64-amd64-amd64-pvgrub 10 guest-start fail blocked in 44176
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 14 guest-saverestore.2 fail 
blocked in 44176
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stopfail blocked in 44176
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop   fail blocked in 44176
 test-amd64-amd64-xl-qemut-winxpsp3  9 windows-installfail blocked in 44176

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 linux90915bdf5d75f981251d78f45dce37d39e679ac1
baseline version:
 linux66b4554a10194a0fb4272f185b5fc6f2710741de

Last test of basis44176  2016-02-26 11:59:19 Z2 days
Testing same since44194  2016-02-27 16:55:20 Z1 days1 attempts


People who touched revisions under test:
  Alan Stern 
  Alex Thorlton 
  Andrew Banman 
  Andrew Elble 
  Andrew Gabbasov 
  Andrew Morton 
  Andy Lutomirski 
  Anton Protopopov 
  Arnaldo Carvalho de Melo 
  Arnaldo Carvalho de Melo 
  Arnd Bergmann 
  AurÃ©lien Francillon 
  Bart Van Assche 
  Benjamin LaHaise 
  Benjamin Tissoires 
  Christoph Hellwig 
  Cong Wang 
  CQ Tang 
  Dan Carpenter 
  Darren Hart 
  David S. Miller 
  David Woodhouse 
  Dmitry Torokhov 
  Dmitry Vyukov 
  Erich Schubert 
  Ewan D. Milne 
  Filipe Manana 
  Greg Kroah-Hartman 
  H. Peter Anvin 
  Hannes Reinecke 
  Herton R. Krzesinski 
  Ingo Molnar 
  Insu Yun 
  James Bottomley 
  James Bottomley 
  Jan Kara 
  Jann Horn 
  Jens Axboe 
  Jiri Slaby 
  Joe Lawrence 
  Johannes Thumshirn 
  Jonathan Cameron 
  Kees Cook 
  Ken Xue 
  Kirill A. Shutemov 
  Konstantin Khlebnikov 
  Lars-Peter Clausen 
  Laura Abbott 
  Linus Torvalds 
  Linus Walleij 
  Martijn Coenen 
  Martin K. Petersen 
  Mathias Nyman 
  Matthew Wilcox 
  Michael Hennerich 
  Michal Hocko 
  Miklos Szeredi 
  Namhyung Kim 
  Naoya Horiguchi 
  Nicholas Bellinger 
  Nicolas Pitre 
  Paul Menzel 
  Peter Hurley 
  Peter Oberparleiter 
  Peter Zijlstra (Intel) 
  Peter Zijlstra 
  Roman Gushchin 
  Russell King 
  Rusty Russell 
  Sebastian Herbszt 
  Sergey Senozhatsky 
  Sergey Senozhatsky 
  Steve French 
  Steven Rostedt 
  Sudip Mukherjee 
  Sudip Mukherjee 
  Takashi Iwai 
  Theodore Ts'o 
  Thomas Gleixner 
  Trond Myklebust 
  Vegard Nossum 
  Vladimir Zapolskiy 
  Vlastimil Babka 
  WANG Cong 
  Willy Tarreau 
  Yong Li 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  fail
 test-amd64-i386-xl

Re: [Xen-devel] [PATCH v6 03/22] arm/acpi: Add __acpi_map_table function for ARM

2016-02-29 Thread Stefano Stabellini

On Sat, 27 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Implement __acpi_map_table function for ARM. Move FIX_ACPI_PAGES to
> common place and rename it to NUM_FIXMAP_ACPI_PAGES.
> 
> Cc: Jan Beulich 
> Signed-off-by: Shannon Zhao 

Reviewed-by: Stefano Stabellini 


>  xen/arch/arm/Makefile|  1 +
>  xen/arch/arm/acpi/Makefile   |  1 +
>  xen/arch/arm/acpi/lib.c  | 52 
> 
>  xen/include/asm-arm/config.h |  2 ++
>  xen/include/asm-x86/acpi.h   |  3 ---
>  xen/include/asm-x86/fixmap.h |  4 ++--
>  xen/include/xen/acpi.h   |  5 +
>  7 files changed, 63 insertions(+), 5 deletions(-)
>  create mode 100644 xen/arch/arm/acpi/Makefile
>  create mode 100644 xen/arch/arm/acpi/lib.c
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 1783912..6d51094 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -2,6 +2,7 @@ subdir-$(CONFIG_ARM_32) += arm32
>  subdir-$(CONFIG_ARM_64) += arm64
>  subdir-y += platforms
>  subdir-$(CONFIG_ARM_64) += efi
> +subdir-$(CONFIG_HAS_ACPI) += acpi
>  
>  obj-$(EARLY_PRINTK) += early_printk.o
>  obj-y += cpu.o
> diff --git a/xen/arch/arm/acpi/Makefile b/xen/arch/arm/acpi/Makefile
> new file mode 100644
> index 000..b5be22d
> --- /dev/null
> +++ b/xen/arch/arm/acpi/Makefile
> @@ -0,0 +1 @@
> +obj-y += lib.o
> diff --git a/xen/arch/arm/acpi/lib.c b/xen/arch/arm/acpi/lib.c
> new file mode 100644
> index 000..a08e45c
> --- /dev/null
> +++ b/xen/arch/arm/acpi/lib.c
> @@ -0,0 +1,52 @@
> +/*
> + *  lib.c - Architecture-Specific Low-Level ACPI Support
> + *
> + *  Copyright (C) 2015, Shannon Zhao 
> + *
> + * ~~
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, write to the Free Software
> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> + *
> + * ~~
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +char *__acpi_map_table(paddr_t phys, unsigned long size)
> +{
> + unsigned long base, offset, mapped_size;
> + int idx;
> +
> + offset = phys & (PAGE_SIZE - 1);
> + mapped_size = PAGE_SIZE - offset;
> + set_fixmap(FIXMAP_ACPI_BEGIN, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
> + base = FIXMAP_ADDR(FIXMAP_ACPI_BEGIN);
> +
> + /*
> +  * Most cases can be covered by the below.
> +  */
> + idx = FIXMAP_ACPI_BEGIN;
> + while (mapped_size < size) {
> + if (++idx > FIXMAP_ACPI_END)
> + return NULL;/* cannot handle this */
> + phys += PAGE_SIZE;
> + set_fixmap(idx, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
> + mapped_size += PAGE_SIZE;
> + }
> +
> + return ((char *) base + offset);
> +}
> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index a1b968d..a26cb1a 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -180,6 +180,8 @@
>  #define FIXMAP_GICC14  /* Interrupt controller: CPU registers (first 
> page) */
>  #define FIXMAP_GICC25  /* Interrupt controller: CPU registers (second 
> page) */
>  #define FIXMAP_GICH 6  /* Interrupt controller: virtual interface 
> control registers */
> +#define FIXMAP_ACPI_BEGIN  7  /* Start mappings of ACPI tables */
> +#define FIXMAP_ACPI_END(FIXMAP_ACPI_BEGIN + NUM_FIXMAP_ACPI_PAGES - 1)  
> /* End mappings of ACPI tables */
>  
>  #define PAGE_SHIFT  12
>  
> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
> index d532e3d..49f7e1e 100644
> --- a/xen/include/asm-x86/acpi.h
> +++ b/xen/include/asm-x86/acpi.h
> @@ -90,9 +90,6 @@ static inline void disable_acpi(void)
>   acpi_noirq = 1;
>  }
>  
> -/* Fixmap pages to reserve for ACPI boot-time tables (see fixmap.h) */
> -#define FIX_ACPI_PAGES 4
> -
>  static inline void acpi_noirq_set(void) { acpi_noirq = 1; }
>  
>  /* routines for saving/restoring kernel state */
> diff --git a/xen/include/asm-x86/fixmap.h b/xen/include/asm-x86/fixmap.h
> index 1e24b11..dc0856f 100644
> --- a/xen/include/asm-x86/fixmap.h
> +++ b/xen/include/asm-x86/fixmap.h
> @@ -19,11 +19,11 @@
>  
>  #ifndef __ASSEMBLY__
>  
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #

[Xen-devel] 4.5.3 preparations

2016-02-29 Thread Jan Beulich

All,

it just occurred to me that 4.5.2 has been a while back, and indeed
4.5.3 would be due later this week. This may be a little too eager,
but I'd like to aim at getting this out at least some time next week.
Besides what is in the tree already, I have three more commits I
intend to backport:
0640ffb67f  x86emul: fix rIP handling
1329105943  x86/hvm: print register state upon triple fault
81d3a0b26c  x86emul: limit-check branch targets
Please indicate any further commits you think need putting in
before that stable release (Stefano, I'd in particular rely on you
to indicate any possible ARM candidates).

Thanks, Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 3/4] libelf: rewrite symtab/strtab loading

2016-02-29 Thread Roger Pau Monné

El 29/2/16 a les 10:31, Jan Beulich ha escrit:
 On 26.02.16 at 18:02,  wrote:
>> El 26/2/16 a les 14:15, Jan Beulich ha escrit:
>> On 16.02.16 at 18:37,  wrote:
 --- a/xen/common/libelf/libelf-loader.c
 +++ b/xen/common/libelf/libelf-loader.c
 @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
 uint64_t pstart)
  sz = sizeof(uint32_t);
  
  /* Space for the elf and elf section headers */
 -sz += (elf_uval(elf, elf->ehdr, e_ehsize) +
 -   elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize));
 +sz += elf_uval(elf, elf->ehdr, e_ehsize) +
 +  3 * elf_uval(elf, elf->ehdr, e_shentsize);
>>>
>>> I think a literal 3 like this either needs a #define (at once serving
>>> as documentation) or a comment. Perhaps rather the former,
>>> considering that the (same?) 3 appears again further down. Plus -
>>> what guarantees there are 3 section headers in the image?
>>
>> This is not related to the image itself, but to the metadata that libelf
>> places at the end of the loaded kernel in order to stash the symtab and
>> strtab for the guest to use.
> 
> I understand this, yet imo the literal 3 still should be replaced.

Yes, I agree.

>> The layout is as follows (I should add this to the patch itself as a
>> comment, since I guess this is still quite confusing):
>>
>> ++
>> ||
>> | KERNEL |
>> ||
>> ++ pend
>> | ALIGNMENT  |
>> ++ bsd_symtab_pstart
>> ||
>> | size   | bsd_symtab_pend - bsd_symtab_pstart
>> ||
>> ++
>> ||
>> | ELF header |
>> ||
>> ++
>> ||
>> | ELF section header 0   | Dummy section header
>> ||
>> ++
>> ||
>> | ELF section header 1   | SYMTAB section header
>> ||
>> ++
>> ||
>> | ELF section header 2   | STRTAB section header
>> ||
>> ++
>> ||
>> | SYMTAB |
>> ||
>> ++
>> ||
>> | STRTAB |
>> ||
>> ++ bsd_symtab_pend
>>
>> There are always only tree sections because that's all we need, section
>> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is
>> used to describe the STRTAB.
> 
> All fine, but this still doesn't clarify how the kernel learns where
> bsd_symtab_pstart is.

The BSDs linker scripts places an "end" symbol after all the loaded
sections:

https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870&view=co

If you execute:

$ nm -n /boot/kernel/kernel

Against a FreeBSD kernel the output is as follows:

[...]
81dc4760 B cpu_info
81dc4b60 B dpcpu
81dc4b68 B smp_tlb_pmap
81dc4b70 B drq_rman
81dc4bb8 B mem_rman
81dc4c00 B irq_rman
81dc4c48 B port_rman
81dc4c90 B tsc_is_invariant
81dc4c94 B tsc_perf_stat
81dc4c98 B tsc_freq
81dc4ca0 B smp_tsc
81dc4ca8 B HYPERVISOR_shared_info
81dc4cb0 B xen_vector_callback_enabled
81dc4cb8 B HYPERVISOR_start_info
81dc4cc0 A _end
81dc4cc0 A end

>> According to the ELF spec there can only be
>> one SYMTAB, so that's all we need.
> 
> Did you perhaps overlook the spec saying "... but this restriction
> may be relaxed in the future", plus the fact that we're talking
> about an executable file here (which commonly have two symbol
> tables - .dynsym and .symtab), not an object one? (This isn't to
> say that you need to make the code handle multiple ones, if you
> know that *BSD will only ever have one.)

DYNSYM is just a subset of SYMTAB, BSDs prefer (it's not a strict
requirement) the SYMTAB because of the in-kernel debugger. Also DYNSYM
is already loaded by default because it's covered by the program headers.

I can add support for loading multiple SYMTABs/STRTABs, but shouldn't we
wait until the spec is updated? As I read the spec ATM, an ELF file with
multiple SYMTABs is invalid. I assume that if the ELF format ever allows
more than one SYMTAB, the version is going to be bumped at least (or
some other field is going to be used to signal this change).

  sz = elf_round_up(elf, sz);
  
 -/* Space for the symbol and string tables. */
 +/* Space for the symbol and string table. */
  for ( i = 0; i < elf_shdr_count(elf); i++ )
  {
  shdr = elf_shdr_by_index(elf, i);
  if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )

[Xen-devel] [libvirt test] 84468: tolerable FAIL - PUSHED

2016-02-29 Thread osstest service owner

flight 84468 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/84468/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  33fb8ff185846a7b4974105d2c9400690a6f95cf
baseline version:
 libvirt  ee67069c73cb91e3762cc63587d2a14d9a7253af

Last test of basis84215  2016-02-27 04:21:53 Z2 days
Testing same since84468  2016-02-29 04:24:05 Z0 days1 attempts


People who touched revisions under test:
  Roman Bogorodskiy 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2   fail
 test-armhf-armhf-libvirt-raw fail
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=33fb8ff185846a7b4974105d2c9400690a6f95cf
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push libvirt 
33fb8ff185846a7b4974105d2c9400690a6f95cf
+ branch=libvirt
+ revision=33fb8ff185846a7b4974105d2c9400

Re: [Xen-devel] 4.5.3 preparations

2016-02-29 Thread Lars Kurth

We should make sure that all relevant security patches are included
Thanks for the heads-up
Lars

> On 29 Feb 2016, at 10:55, Jan Beulich  wrote:
> 
> All,
> 
> it just occurred to me that 4.5.2 has been a while back, and indeed
> 4.5.3 would be due later this week. This may be a little too eager,
> but I'd like to aim at getting this out at least some time next week.
> Besides what is in the tree already, I have three more commits I
> intend to backport:
> 0640ffb67fx86emul: fix rIP handling
> 1329105943x86/hvm: print register state upon triple fault
> 81d3a0b26cx86emul: limit-check branch targets
> Please indicate any further commits you think need putting in
> before that stable release (Stefano, I'd in particular rely on you
> to indicate any possible ARM candidates).
> 
> Thanks, Jan
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 04/24] arm/acpi: Estimate memory required for acpi/efi tables

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 12:19,  wrote:
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #if EFI_PAGE_SIZE != PAGE_SIZE
>  # error Cannot use xen/pfn.h here!
>  #endif
> @@ -1171,6 +1172,25 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
> *SystemTable)
>  for( ; ; ); /* not reached */
>  }
>  
> +#if defined (CONFIG_ACPI) && defined (CONFIG_ARM)

Are both parts really necessary? It would seem to me that the
latter should suffice.

> +/* Constant to indicate "Xen" in unicode u16 format */
> +static const u16 XEN_EFI_FW_VENDOR[] ={0x0058,0x0065,0x006E,0x};

This should be a wide string literal, with the variable type being
CHAR16. Also variable names shouldn't be all upper case.

> +int __init estimate_efi_size(int mem_nr_banks)
> +{
> +int size = 0;
> +int est_size = sizeof(EFI_SYSTEM_TABLE);
> +int ect_size = sizeof(EFI_CONFIGURATION_TABLE);
> +int emd_size = sizeof(EFI_MEMORY_DESCRIPTOR);
> +int fw_vendor_size = sizeof(XEN_EFI_FW_VENDOR);
> +
> +size += ROUNDUP((est_size + ect_size + fw_vendor_size), 8);
> +size += ROUNDUP(emd_size * (mem_nr_banks + acpi_mem.nr_banks + 1), 8);
> +
> +return size;
> +}

It would seem to me that none of the variables involved really
holds a signed quantity. This should be reflected by the types
chosen - likely you need s/int/size_t/ for the entire function,
except for the function parameter, which looks like it wants to
be unsigned int.

Also the initializer of "size" could be easily got rid of, and there's
a pair of pointless parentheses in the first ROUNDUP().

As to the configuration table - the sizeof() covers a single table
entry only afaict - is that really intended?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] Xen 4.7 Development Update

2016-02-29 Thread Wei Liu

NOTE: We are one month away from freeze. Features that wish to be in 4.7 must
be posted to xen-devel by March 18.

This email only tracks big items for xen.git tree. Please reply for items you
woulk like to see in 4.7 so that people have an idea what is going on and
prioritise accordingly.

You're welcome to provide description and use cases of the feature you're
working on.

= Timeline =

We now adopt a fixed cut-off date scheme. We will release twice a
year. The upcoming 4.7 timeline are as followed:

* Last posting date: March 18, 2016
* Hard code freeze: April 1, 2016
* RC1: TBD
* Release: June 3, 2016

Note that we don't have freeze exception scheme anymore. All patches
that wish to go into 4.7 must be posted no later than the last posting
date. All patches posted after that date will be automatically queued
into next release.

RCs will be arranged immediately after freeze.

= Projects =

== Hypervisor == 

*  Convert RTDS from time to event-driven model
  -  Meng Xu
  -  Tianyang Chen

=== x86 === 

*  Improve ioreq server performance
  -  Yu Zhang
  -  Paul Durrant

*  xsave/xrtors support
  -  Shuai Ruan

*  CPUID handling improvement
  -  Andrew Cooper

*  VMWare tools support
  -  Don Slutz

*  xSplice, hypervisor hot-patching
  -  Konrad Wilk
  -  Ross Lagerwall

*  Porting Intel PState driver
  -  Wei Wang

*  HVMlite support
  -  Roger Pau Monne

*  Posted interrupt
  -  Wu, Feng

*  VMX TSC scaling support
  -  Haozhong Zhang

*  VT-d asynchronous flush issue
  -  Quan Xu

*  Memory protection-key support
  -  Huaitong Han

=== ARM === 

*  ACPI for ARM64
  -  Shannon Zhao

*  Guest save / restore support
  -  Ian Campbell

== Toolstack == 

*  vNVDIMM support
  -  Haozhong Zhang

*  Libxl PVSCSI support
  -  Olaf Hering

*  PV USB passthrough
  -  Chunyan Liu

*  HVM USB passthrough
  -  George Dunlap

*  QEMU based PVUSB backend
  -  Juergen Gross

*  Domain snapshot API
  -  Chunyan Liu

*  Fix hotplug script machinery and remove blktap
  -  George Dunlap

*  Basic xSplice tooling (see hypervisor item)
  -  Konrad Wilk

*  Load BIOS via toolstack
  -  Anthony Perard

*  Libxl depriv QEMU
  -  Ian Jackson

*  COLO
  -  Wen Congyang

== Documentation == 

*  Feature maturity lifecycle
  -  Lars Kurth

== Completed == 

*  spinlock queued read-write locks
  -  Jennifer Herbert

*  Per-cpu reader-writer lock
  -  Malcolm Crossley

*  Make it easy to run xenstore in a domain
  -  Juergen Gross

*  Libxc support for migrating large PV domains
  -  Juergen Gross

*  Disentangle libxenctrl to stable libraries
  -  Ian Campbell

*  Use Kconfig to configure hypervisor components
  -  Doug Goldstein

*  vgic-v3
  -  Julien Grall

*  Toolstack-based soft reset
  -  Vitaly Kuznetsov

*  Libxc support for building large PV domains
  -  Juergen Gross

*  Run QEMU as non-root
  -  Stefano Stabellini

*  Intel CDP support
  -  He Chen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 14/24] arm/acpi: Prepare EFI system table for Dom0

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 12:19,  wrote:
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -1173,6 +1173,10 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
> *SystemTable)
>  }
>  
>  #if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
> +#include "../../../common/decompress.h"
> +#define XZ_EXTERN STATIC
> +#include "../../../common/xz/crc32.c"
> +
>  /* Constant to indicate "Xen" in unicode u16 format */
>  static const u16 XEN_EFI_FW_VENDOR[] ={0x0058,0x0065,0x006E,0x};
>  
> @@ -1189,6 +1193,46 @@ int __init estimate_efi_size(int mem_nr_banks)
>  
>  return size;
>  }
> +
> +void __init acpi_create_efi_system_table(paddr_t paddr, void *efi_acpi_table,
> + struct membank tbl_add[])
> +{
> +u64 table_addr, table_size, offset = 0;
> +u8 *base_ptr;
> +EFI_CONFIGURATION_TABLE *efi_conf_tbl;
> +EFI_SYSTEM_TABLE *efi_sys_tbl;
> +
> +table_addr = paddr + acpi_get_table_offset(tbl_add, TBL_EFIT);
> +table_size = sizeof(EFI_SYSTEM_TABLE) + sizeof(EFI_CONFIGURATION_TABLE)
> + + sizeof(XEN_EFI_FW_VENDOR);
> +base_ptr = efi_acpi_table + acpi_get_table_offset(tbl_add, TBL_EFIT);
> +efi_sys_tbl = (EFI_SYSTEM_TABLE *)base_ptr;
> +
> +efi_sys_tbl->Hdr.Signature = EFI_SYSTEM_TABLE_SIGNATURE;
> +/* Specify the revision as 2.5 */
> +efi_sys_tbl->Hdr.Revision = (2 << 16 | 50);
> +efi_sys_tbl->Hdr.HeaderSize = table_size;
> +
> +efi_sys_tbl->FirmwareRevision = 1;
> +efi_sys_tbl->NumberOfTableEntries = 1;
> +offset += sizeof(EFI_SYSTEM_TABLE);
> +memcpy((u16 *)(base_ptr + offset), XEN_EFI_FW_VENDOR,
> +   sizeof(XEN_EFI_FW_VENDOR));
> +efi_sys_tbl->FirmwareVendor = (CHAR16 *)(table_addr + offset);
> +
> +offset += sizeof(XEN_EFI_FW_VENDOR);
> +efi_conf_tbl = (EFI_CONFIGURATION_TABLE *)(base_ptr + offset);
> +efi_conf_tbl->VendorGuid = (EFI_GUID)ACPI_20_TABLE_GUID;
> +efi_conf_tbl->VendorTable = (VOID *)tbl_add[TBL_RSDP].start;
> +efi_sys_tbl->ConfigurationTable = (EFI_CONFIGURATION_TABLE *)(table_addr
> +  + offset);
> +xz_crc32_init();
> +efi_sys_tbl->Hdr.CRC32 = xz_crc32((uint8_t *)efi_sys_tbl,
> +  efi_sys_tbl->Hdr.HeaderSize, 0);
> +
> +tbl_add[TBL_EFIT].start = table_addr;
> +tbl_add[TBL_EFIT].size = table_size;
> +}
>  #endif
>  
>  #ifndef CONFIG_ARM /* TODO - runtime service support */

While the previous addition here was probably fine on its own, here
it becomes clear that these additions all belong into arch/arm/efi/.

Also it doesn't look very nice to me to (ab)use xz's CRC32 code
here; I don't know who has suggested doing so.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 21/24] hvm/params: Add a new dilivery type for event-channel in HVM_PARAM_CALLBACK_IRQ

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 12:19,  wrote:
> From: Shannon Zhao 
> 
> Add a new dilivery type:
> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI.
> To the flag, bit 0 stands the interrupt mode is edge(1) or level(0) and
> bit 1 stands the interrupt polarity is active low(1) or high(0).
> 
> Cc: Jan Beulich 
> Signed-off-by: Shannon Zhao 
> ---
> v4: rebase on master
> ---
>  xen/include/public/hvm/params.h | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
> index 81f9451..382a79d 100644
> --- a/xen/include/public/hvm/params.h
> +++ b/xen/include/public/hvm/params.h
> @@ -55,6 +55,15 @@
>   * if this delivery method is available.
>   */
>  
> +#define HVM_PARAM_CALLBACK_TYPE_EVENT3
> +/*
> + * val[15:8] is flag of event-channel interrupt:
> + *  bit 0: interrupt is edge(1) or level(0) triggered
> + *  bit 1: interrupt is active low(1) or high(0)

Wouldn't it be better to name these bit 8 and bit 9 respectively,
to avoid confusion with the full value's bits 0 and 1?

Also please state explicitly that bits 16..63 need to be zero (and
make sure you check this in the code consuming the input).

Jan

> + * val[7:0] is PPI number used by event-channel.
> + * This is only used by ARM/ARM64.
> + */
> +
>  /*
>   * These are not used by Xen. They are here for convenience of HVM-guest
>   * xenbus implementations.
> -- 
> 2.0.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [BUG] gdbsx crashes Xen

2016-02-29 Thread Andrew Cooper

On 27/02/16 03:26, Carl Patenaude Poulin wrote:

Please don't top post.

> Hi all,
>
> Sorry for the delay. I've managed to get some serial console output.
>
> https://gist.github.com/lilred/50285e1f33ab1c881ea0
>
> I am not on Broadwell hardware. I'm using a PowerEdge R300 machine,
> with a Xeon x5460 CPU.
>
> Best

That is unfortunate.

Can you run `addr2line -e xen-syms 82d080217ae2` to find out exactly
which line the crash is from?

What kind of domain are you running?  It looks like an HVM domain.  If
so, can you possibly also, from dom0, run `xen-hvmctx $DOMID` before
attaching debugging?

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-unstable test] 84436: regressions - FAIL

2016-02-29 Thread osstest service owner

flight 84436 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/84436/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt-raw  9 debian-di-install fail REGR. vs. 84347

Regressions which are regarded as allowable (not blocking):
 build-i386-rumpuserxen6 xen-buildfail   like 84347
 build-amd64-rumpuserxen   6 xen-buildfail   like 84347
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 84347
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 84347
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 84347
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail   like 84347

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  42391c613d42248d82f1b04c523d48bf141b75dc
baseline version:
 xen  3dd926a25d866364ce6d46c21f9ac05a82fa7ffb

Last test of basis84347  2016-02-28 07:59:56 Z1 days
Testing same since84436  2016-02-29 00:14:10 Z0 days1 attempts


People who touched revisions under test:
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvops

Re: [Xen-devel] [PATCH v4 23/24] xen/arm: Add a hypercall for device mmio mapping

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 12:19,  wrote:
> It needs to map platform or amba device mmio to Dom0 on ARM. But when
> booting with ACPI, it can't get the mmio region in Xen due to lack of
> AML interpreter to parse DSDT table. Therefore, let Dom0 call a
> hypercall to map mmio region when it adds the devices.

Leaving aside the fact that XEN_DOMCTL_memory_mapping is not
meant to be called by other than the tools - is there anything here
which that hypercall wouldn't provide? If not, I think the commit
message should reference that hypercall too while rationalizing
the need for this new map space.

> ---
>  xen/arch/arm/mm.c   |  4 
>  xen/arch/arm/p2m.c  | 23 +++
>  xen/common/memory.c | 14 ++
>  xen/include/asm-arm/p2m.h   |  5 +
>  xen/include/public/memory.h |  1 +
>  5 files changed, 47 insertions(+)

This touches code with no maintainers Cc-ed.

> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1138,6 +1138,10 @@ int xenmem_add_to_physmap_one(
>  rcu_unlock_domain(od);
>  break;
>  }
> +case XENMAPSPACE_dev_mmio:
> +rc = map_dev_mmio_region(d, gpfn, 1, idx);
> +return rc;
> +break;

Dead code.

> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -980,6 +980,13 @@ long do_memory_op(unsigned long cmd, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>  if ( d == NULL )
>  return -ESRCH;
>  
> +/* XENMAPSPACE_dev_mmio mapping is only supported for hardware Domain
> + * to map this kind of space to itself.
> + */

Coding style.

> +if ( (xatp.space == XENMAPSPACE_dev_mmio) &&
> + (!is_hardware_domain(current->domain) || (d != 
> current->domain)) )
> +return -EOPNOTSUPP;

More like -EPERM or -EACCES.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 24/24] xen/arm64: Add ACPI support

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 12:19,  wrote:
> ---
>  xen/arch/arm/Kconfig | 1 +
>  xen/common/efi/runtime.c | 8 ++--
>  xen/include/asm-arm/config.h | 5 +
>  3 files changed, 12 insertions(+), 2 deletions(-)

Again - please Cc maintainers of all code modified.

> --- a/xen/common/efi/runtime.c
> +++ b/xen/common/efi/runtime.c
> @@ -10,8 +10,12 @@ DEFINE_XEN_GUEST_HANDLE(CHAR16);
>  
>  #ifndef COMPAT
>  
> -#ifdef CONFIG_ARM  /* Disabled until runtime services implemented */
> -const bool_t efi_enabled = 0;
> +#ifdef CONFIG_ARM
> +/* Currently it doesn't implement runtime services on ARM, but to boot Dom0 
> with
> + * ACPI it needs to assign efi_enabled with 1 to get acpi_os_get_root_pointer
> + * work.
> + */
> +const bool_t efi_enabled = 1;

This is now the same as x86's - the two should thus be folded.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 03/22] arm/acpi: Add __acpi_map_table function for ARM

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 11:54,  wrote:
> On Sat, 27 Feb 2016, Shannon Zhao wrote:
>> From: Shannon Zhao 
>> 
>> Implement __acpi_map_table function for ARM. Move FIX_ACPI_PAGES to
>> common place and rename it to NUM_FIXMAP_ACPI_PAGES.
>> 
>> Cc: Jan Beulich 
>> Signed-off-by: Shannon Zhao 
> 
> Reviewed-by: Stefano Stabellini 

Are you sure, particularly with ...

>> --- /dev/null
>> +++ b/xen/arch/arm/acpi/lib.c
>> @@ -0,0 +1,52 @@
>> +/*
>> + *  lib.c - Architecture-Specific Low-Level ACPI Support
>> + *
>> + *  Copyright (C) 2015, Shannon Zhao 
>> + *
>> + * 
>> ~~
>> + *
>> + *  This program is free software; you can redistribute it and/or modify
>> + *  it under the terms of the GNU General Public License as published by
>> + *  the Free Software Foundation; either version 2 of the License, or
>> + *  (at your option) any later version.
>> + *
>> + *  This program is distributed in the hope that it will be useful,
>> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + *  GNU General Public License for more details.
>> + *
>> + *  You should have received a copy of the GNU General Public License
>> + *  along with this program; if not, write to the Free Software
>> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  
>> USA
>> + *
>> + * 
>> ~~
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +char *__acpi_map_table(paddr_t phys, unsigned long size)
>> +{
>> +unsigned long base, offset, mapped_size;
>> +int idx;
>> +
>> +offset = phys & (PAGE_SIZE - 1);
>> +mapped_size = PAGE_SIZE - offset;
>> +set_fixmap(FIXMAP_ACPI_BEGIN, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
>> +base = FIXMAP_ADDR(FIXMAP_ACPI_BEGIN);
>> +
>> +/*
>> + * Most cases can be covered by the below.
>> + */
>> +idx = FIXMAP_ACPI_BEGIN;
>> +while (mapped_size < size) {
>> +if (++idx > FIXMAP_ACPI_END)
>> +return NULL;/* cannot handle this */
>> +phys += PAGE_SIZE;
>> +set_fixmap(idx, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
>> +mapped_size += PAGE_SIZE;
>> +}
>> +
>> +return ((char *) base + offset);
>> +}

... this new file using Linux instead of Xen coding style?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 03/22] arm/acpi: Add __acpi_map_table function for ARM

2016-02-29 Thread Stefano Stabellini

On Mon, 29 Feb 2016, Jan Beulich wrote:
> >>> On 29.02.16 at 11:54,  wrote:
> > On Sat, 27 Feb 2016, Shannon Zhao wrote:
> >> From: Shannon Zhao 
> >> 
> >> Implement __acpi_map_table function for ARM. Move FIX_ACPI_PAGES to
> >> common place and rename it to NUM_FIXMAP_ACPI_PAGES.
> >> 
> >> Cc: Jan Beulich 
> >> Signed-off-by: Shannon Zhao 
> > 
> > Reviewed-by: Stefano Stabellini 
> 
> Are you sure, particularly with ...
> 
> >> --- /dev/null
> >> +++ b/xen/arch/arm/acpi/lib.c
> >> @@ -0,0 +1,52 @@
> >> +/*
> >> + *  lib.c - Architecture-Specific Low-Level ACPI Support
> >> + *
> >> + *  Copyright (C) 2015, Shannon Zhao 
> >> + *
> >> + * 
> >> ~~
> >> + *
> >> + *  This program is free software; you can redistribute it and/or modify
> >> + *  it under the terms of the GNU General Public License as published by
> >> + *  the Free Software Foundation; either version 2 of the License, or
> >> + *  (at your option) any later version.
> >> + *
> >> + *  This program is distributed in the hope that it will be useful,
> >> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + *  GNU General Public License for more details.
> >> + *
> >> + *  You should have received a copy of the GNU General Public License
> >> + *  along with this program; if not, write to the Free Software
> >> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  
> >> USA
> >> + *
> >> + * 
> >> ~~
> >> + */
> >> +
> >> +#include 
> >> +#include 
> >> +#include 
> >> +
> >> +char *__acpi_map_table(paddr_t phys, unsigned long size)
> >> +{
> >> +  unsigned long base, offset, mapped_size;
> >> +  int idx;
> >> +
> >> +  offset = phys & (PAGE_SIZE - 1);
> >> +  mapped_size = PAGE_SIZE - offset;
> >> +  set_fixmap(FIXMAP_ACPI_BEGIN, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
> >> +  base = FIXMAP_ADDR(FIXMAP_ACPI_BEGIN);
> >> +
> >> +  /*
> >> +   * Most cases can be covered by the below.
> >> +   */
> >> +  idx = FIXMAP_ACPI_BEGIN;
> >> +  while (mapped_size < size) {
> >> +  if (++idx > FIXMAP_ACPI_END)
> >> +  return NULL;/* cannot handle this */
> >> +  phys += PAGE_SIZE;
> >> +  set_fixmap(idx, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
> >> +  mapped_size += PAGE_SIZE;
> >> +  }
> >> +
> >> +  return ((char *) base + offset);
> >> +}
> 
> ... this new file using Linux instead of Xen coding style?

That's true, but it is clearly taken from
xen/arch/x86/acpi/lib.c:__acpi_map_table, which is also Linux coding
style.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Xen 4.7 Development Update

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 12:17,  wrote:
> *  Guest save / restore support
>   -  Ian Campbell

Not very likely I would say.

Also, looking at the amount of outstanding items, it seems unlikely
that we'll get even half of them in within the next 2.5 weeks.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Xen 4.7 Development Update

2016-02-29 Thread Wei Liu

On Mon, Feb 29, 2016 at 04:44:10AM -0700, Jan Beulich wrote:
> >>> On 29.02.16 at 12:17,  wrote:
> > *  Guest save / restore support
> >   -  Ian Campbell
> 
> Not very likely I would say.
> 
> Also, looking at the amount of outstanding items, it seems unlikely
> that we'll get even half of them in within the next 2.5 weeks.
> 

That is fine. Software engineers are often bad at estimating workload.
The ones that miss this cycle will be automatically moved to next cycle.

Wei.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-02-29 Thread Haozhong Zhang

On 02/29/16 03:12, Jan Beulich wrote:
> >>> On 29.02.16 at 10:45,  wrote:
> > On 02/29/16 02:01, Jan Beulich wrote:
> >> >>> On 28.02.16 at 15:48,  wrote:
> >> > Anyway, we may avoid some conflicts between ACPI tables/objects by
> >> > restricting which tables and objects can be passed from QEMU to Xen:
> >> > (1) For ACPI tables, xen does not accept those built by itself,
> >> > e.g. DSDT and SSDT.
> >> > (2) xen does not accept ACPI tables for devices that are not attached to
> >> > a domain, e.g. if NFIT cannot be passed if a domain does not have
> >> > vNVDIMM.
> >> > (3) For ACPI objects, xen only accepts namespace devices and requires
> >> > their names does not conflict with existing ones provided by Xen.
> >> 
> >> And how do you imagine to enforce this without parsing the
> >> handed AML? (Remember there's no AML parser in hvmloader.)
> > 
> > As I proposed in last reply, instead of passing an entire ACPI object,
> > QEMU passes the device name and the AML code under the AML device entry
> > separately. Because the name is explicitly given, no AML parser is
> > needed in hvmloader.
> 
> I must not only have missed that proposal, but I also don't see
> how you mean this to work: Are you suggesting for hvmloader to
> construct valid AML from the passed in blob? Or are you instead
> considering to pass redundant information (name once given
> explicitly and once embedded in the AML blob), allowing the two
> to be out of sync?
>

I mean the former one.

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 03/22] arm/acpi: Add __acpi_map_table function for ARM

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 12:43,  wrote:
> On Mon, 29 Feb 2016, Jan Beulich wrote:
>> >>> On 29.02.16 at 11:54,  wrote:
>> > On Sat, 27 Feb 2016, Shannon Zhao wrote:
>> >> From: Shannon Zhao 
>> >> 
>> >> Implement __acpi_map_table function for ARM. Move FIX_ACPI_PAGES to
>> >> common place and rename it to NUM_FIXMAP_ACPI_PAGES.
>> >> 
>> >> Cc: Jan Beulich 
>> >> Signed-off-by: Shannon Zhao 
>> > 
>> > Reviewed-by: Stefano Stabellini 
>> 
>> Are you sure, particularly with ...
>> 
>> >> --- /dev/null
>> >> +++ b/xen/arch/arm/acpi/lib.c
>> >> @@ -0,0 +1,52 @@
>> >> +/*
>> >> + *  lib.c - Architecture-Specific Low-Level ACPI Support
>> >> + *
>> >> + *  Copyright (C) 2015, Shannon Zhao 
>> >> + *
>> >> + * 
> ~~
>> >> + *
>> >> + *  This program is free software; you can redistribute it and/or modify
>> >> + *  it under the terms of the GNU General Public License as published by
>> >> + *  the Free Software Foundation; either version 2 of the License, or
>> >> + *  (at your option) any later version.
>> >> + *
>> >> + *  This program is distributed in the hope that it will be useful,
>> >> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
>> >> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> >> + *  GNU General Public License for more details.
>> >> + *
>> >> + *  You should have received a copy of the GNU General Public License
>> >> + *  along with this program; if not, write to the Free Software
>> >> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 
>> >>  
> USA
>> >> + *
>> >> + * 
> ~~
>> >> + */
>> >> +
>> >> +#include 
>> >> +#include 
>> >> +#include 
>> >> +
>> >> +char *__acpi_map_table(paddr_t phys, unsigned long size)
>> >> +{
>> >> + unsigned long base, offset, mapped_size;
>> >> + int idx;
>> >> +
>> >> + offset = phys & (PAGE_SIZE - 1);
>> >> + mapped_size = PAGE_SIZE - offset;
>> >> + set_fixmap(FIXMAP_ACPI_BEGIN, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
>> >> + base = FIXMAP_ADDR(FIXMAP_ACPI_BEGIN);
>> >> +
>> >> + /*
>> >> +  * Most cases can be covered by the below.
>> >> +  */
>> >> + idx = FIXMAP_ACPI_BEGIN;
>> >> + while (mapped_size < size) {
>> >> + if (++idx > FIXMAP_ACPI_END)
>> >> + return NULL;/* cannot handle this */
>> >> + phys += PAGE_SIZE;
>> >> + set_fixmap(idx, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
>> >> + mapped_size += PAGE_SIZE;
>> >> + }
>> >> +
>> >> + return ((char *) base + offset);
>> >> +}
>> 
>> ... this new file using Linux instead of Xen coding style?
> 
> That's true, but it is clearly taken from
> xen/arch/x86/acpi/lib.c:__acpi_map_table, which is also Linux coding
> style.

The main difference being that the x86 file is a clone of the Linux
original, whereas the way the new ARM one gets introduced it
doesn't look to be quite that way.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 03/22] arm/acpi: Add __acpi_map_table function for ARM

2016-02-29 Thread Stefano Stabellini

On Mon, 29 Feb 2016, Jan Beulich wrote:
> >>> On 29.02.16 at 12:43,  wrote:
> > On Mon, 29 Feb 2016, Jan Beulich wrote:
> >> >>> On 29.02.16 at 11:54,  wrote:
> >> > On Sat, 27 Feb 2016, Shannon Zhao wrote:
> >> >> From: Shannon Zhao 
> >> >> 
> >> >> Implement __acpi_map_table function for ARM. Move FIX_ACPI_PAGES to
> >> >> common place and rename it to NUM_FIXMAP_ACPI_PAGES.
> >> >> 
> >> >> Cc: Jan Beulich 
> >> >> Signed-off-by: Shannon Zhao 
> >> > 
> >> > Reviewed-by: Stefano Stabellini 
> >> 
> >> Are you sure, particularly with ...
> >> 
> >> >> --- /dev/null
> >> >> +++ b/xen/arch/arm/acpi/lib.c
> >> >> @@ -0,0 +1,52 @@
> >> >> +/*
> >> >> + *  lib.c - Architecture-Specific Low-Level ACPI Support
> >> >> + *
> >> >> + *  Copyright (C) 2015, Shannon Zhao 
> >> >> + *
> >> >> + * 
> > ~~
> >> >> + *
> >> >> + *  This program is free software; you can redistribute it and/or 
> >> >> modify
> >> >> + *  it under the terms of the GNU General Public License as published 
> >> >> by
> >> >> + *  the Free Software Foundation; either version 2 of the License, or
> >> >> + *  (at your option) any later version.
> >> >> + *
> >> >> + *  This program is distributed in the hope that it will be useful,
> >> >> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> >> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> >> + *  GNU General Public License for more details.
> >> >> + *
> >> >> + *  You should have received a copy of the GNU General Public License
> >> >> + *  along with this program; if not, write to the Free Software
> >> >> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  
> >> >> 02111-1307  
> > USA
> >> >> + *
> >> >> + * 
> > ~~
> >> >> + */
> >> >> +
> >> >> +#include 
> >> >> +#include 
> >> >> +#include 
> >> >> +
> >> >> +char *__acpi_map_table(paddr_t phys, unsigned long size)
> >> >> +{
> >> >> +   unsigned long base, offset, mapped_size;
> >> >> +   int idx;
> >> >> +
> >> >> +   offset = phys & (PAGE_SIZE - 1);
> >> >> +   mapped_size = PAGE_SIZE - offset;
> >> >> +   set_fixmap(FIXMAP_ACPI_BEGIN, phys >> PAGE_SHIFT, 
> >> >> PAGE_HYPERVISOR);
> >> >> +   base = FIXMAP_ADDR(FIXMAP_ACPI_BEGIN);
> >> >> +
> >> >> +   /*
> >> >> +* Most cases can be covered by the below.
> >> >> +*/
> >> >> +   idx = FIXMAP_ACPI_BEGIN;
> >> >> +   while (mapped_size < size) {
> >> >> +   if (++idx > FIXMAP_ACPI_END)
> >> >> +   return NULL;/* cannot handle this */
> >> >> +   phys += PAGE_SIZE;
> >> >> +   set_fixmap(idx, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
> >> >> +   mapped_size += PAGE_SIZE;
> >> >> +   }
> >> >> +
> >> >> +   return ((char *) base + offset);
> >> >> +}
> >> 
> >> ... this new file using Linux instead of Xen coding style?
> > 
> > That's true, but it is clearly taken from
> > xen/arch/x86/acpi/lib.c:__acpi_map_table, which is also Linux coding
> > style.
> 
> The main difference being that the x86 file is a clone of the Linux
> original, whereas the way the new ARM one gets introduced it
> doesn't look to be quite that way.

Fair enough, you are right, it would be nice if __acpi_map_table was
following the Xen coding style.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 12:52,  wrote:
> On 02/29/16 03:12, Jan Beulich wrote:
>> >>> On 29.02.16 at 10:45,  wrote:
>> > On 02/29/16 02:01, Jan Beulich wrote:
>> >> >>> On 28.02.16 at 15:48,  wrote:
>> >> > Anyway, we may avoid some conflicts between ACPI tables/objects by
>> >> > restricting which tables and objects can be passed from QEMU to Xen:
>> >> > (1) For ACPI tables, xen does not accept those built by itself,
>> >> > e.g. DSDT and SSDT.
>> >> > (2) xen does not accept ACPI tables for devices that are not attached to
>> >> > a domain, e.g. if NFIT cannot be passed if a domain does not have
>> >> > vNVDIMM.
>> >> > (3) For ACPI objects, xen only accepts namespace devices and requires
>> >> > their names does not conflict with existing ones provided by Xen.
>> >> 
>> >> And how do you imagine to enforce this without parsing the
>> >> handed AML? (Remember there's no AML parser in hvmloader.)
>> > 
>> > As I proposed in last reply, instead of passing an entire ACPI object,
>> > QEMU passes the device name and the AML code under the AML device entry
>> > separately. Because the name is explicitly given, no AML parser is
>> > needed in hvmloader.
>> 
>> I must not only have missed that proposal, but I also don't see
>> how you mean this to work: Are you suggesting for hvmloader to
>> construct valid AML from the passed in blob? Or are you instead
>> considering to pass redundant information (name once given
>> explicitly and once embedded in the AML blob), allowing the two
>> to be out of sync?
> 
> I mean the former one.

Which will involve adding how much new code to it?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 03/22] arm/acpi: Add __acpi_map_table function for ARM

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 12:56,  wrote:
> On Mon, 29 Feb 2016, Jan Beulich wrote:
>> >>> On 29.02.16 at 12:43,  wrote:
>> > On Mon, 29 Feb 2016, Jan Beulich wrote:
>> >> >>> On 29.02.16 at 11:54,  wrote:
>> >> > On Sat, 27 Feb 2016, Shannon Zhao wrote:
>> >> >> From: Shannon Zhao 
>> >> >> 
>> >> >> Implement __acpi_map_table function for ARM. Move FIX_ACPI_PAGES to
>> >> >> common place and rename it to NUM_FIXMAP_ACPI_PAGES.
>> >> >> 
>> >> >> Cc: Jan Beulich 
>> >> >> Signed-off-by: Shannon Zhao 
>> >> > 
>> >> > Reviewed-by: Stefano Stabellini 
>> >> 
>> >> Are you sure, particularly with ...
>> >> 
>> >> >> --- /dev/null
>> >> >> +++ b/xen/arch/arm/acpi/lib.c
>> >> >> @@ -0,0 +1,52 @@
>> >> >> +/*
>> >> >> + *  lib.c - Architecture-Specific Low-Level ACPI Support
>> >> >> + *
>> >> >> + *  Copyright (C) 2015, Shannon Zhao 
>> >> >> + *
>> >> >> + * 
>> > ~~
>> >> >> + *
>> >> >> + *  This program is free software; you can redistribute it and/or 
>> >> >> modify
>> >> >> + *  it under the terms of the GNU General Public License as published 
>> >> >> by
>> >> >> + *  the Free Software Foundation; either version 2 of the License, or
>> >> >> + *  (at your option) any later version.
>> >> >> + *
>> >> >> + *  This program is distributed in the hope that it will be useful,
>> >> >> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
>> >> >> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> >> >> + *  GNU General Public License for more details.
>> >> >> + *
>> >> >> + *  You should have received a copy of the GNU General Public License
>> >> >> + *  along with this program; if not, write to the Free Software
>> >> >> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  
>> >> >> 02111-1307  
>> > USA
>> >> >> + *
>> >> >> + * 
>> > ~~
>> >> >> + */
>> >> >> +
>> >> >> +#include 
>> >> >> +#include 
>> >> >> +#include 
>> >> >> +
>> >> >> +char *__acpi_map_table(paddr_t phys, unsigned long size)
>> >> >> +{
>> >> >> +  unsigned long base, offset, mapped_size;
>> >> >> +  int idx;
>> >> >> +
>> >> >> +  offset = phys & (PAGE_SIZE - 1);
>> >> >> +  mapped_size = PAGE_SIZE - offset;
>> >> >> +  set_fixmap(FIXMAP_ACPI_BEGIN, phys >> PAGE_SHIFT, 
>> >> >> PAGE_HYPERVISOR);
>> >> >> +  base = FIXMAP_ADDR(FIXMAP_ACPI_BEGIN);
>> >> >> +
>> >> >> +  /*
>> >> >> +   * Most cases can be covered by the below.
>> >> >> +   */
>> >> >> +  idx = FIXMAP_ACPI_BEGIN;
>> >> >> +  while (mapped_size < size) {
>> >> >> +  if (++idx > FIXMAP_ACPI_END)
>> >> >> +  return NULL;/* cannot handle this */
>> >> >> +  phys += PAGE_SIZE;
>> >> >> +  set_fixmap(idx, phys >> PAGE_SHIFT, PAGE_HYPERVISOR);
>> >> >> +  mapped_size += PAGE_SIZE;
>> >> >> +  }
>> >> >> +
>> >> >> +  return ((char *) base + offset);
>> >> >> +}
>> >> 
>> >> ... this new file using Linux instead of Xen coding style?
>> > 
>> > That's true, but it is clearly taken from
>> > xen/arch/x86/acpi/lib.c:__acpi_map_table, which is also Linux coding
>> > style.
>> 
>> The main difference being that the x86 file is a clone of the Linux
>> original, whereas the way the new ARM one gets introduced it
>> doesn't look to be quite that way.
> 
> Fair enough, you are right, it would be nice if __acpi_map_table was
> following the Xen coding style.

In which case, Shannon, please also fix the comment style in
xen/include/xen/acpi.h!

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [BUG] documentation of xs.transaction_end is inconsistent with the implementation

2016-02-29 Thread Sergei Lebedev

Hi list,

The documentation of ``xs.transaction_end`` (from xen.lowlevel xs) doesn’t 
mention that the method accepts transaction handle 

#define xspy_transaction_end_doc "\n"   \
"End the current transaction.\n"\
"Attempts to commit the transaction unless abort is true.\n"\
" abort [int]: abort flag (default 0).\n"   \
"\n"\
"Returns True on success, False if you need to try again.\n"\
"Raises xen.lowlevel.xs.Error on error.\n"  
\
"\n"

while the implementation clearly expects it [*]

int abort = 0;
char *thstr;

if (!PyArg_ParseTupleAndKeywords(args, kwds, arg_spec, kwd_spec,
 &thstr, &abort))
return NULL;


Regards,
Sergei

[*]: 
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob_plain;f=tools/python/xen/lowlevel/xs/xs.c;hb=HEAD
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [BUG] documentation of xs.transaction_end is inconsistent with the implementation

2016-02-29 Thread Sergei Lebedev

The same is true of ``xs.mkdir``.

#define xspy_mkdir_doc "\n" \
"Make a directory.\n"   \
" path [string]: path to directory to create.\n"\
"\n"\
"Returns None on success.\n"\
"Raises xen.lowlevel.xs.Error on error.\n"  \
"\n"

Sergei

> On 29 Feb 2016, at 14:55, Sergei Lebedev  wrote:
> 
> Hi list,
> 
> The documentation of ``xs.transaction_end`` (from xen.lowlevel xs) doesn’t 
> mention that the method accepts transaction handle 
> 
> #define xspy_transaction_end_doc "\n" \
>   "End the current transaction.\n"\
>   "Attempts to commit the transaction unless abort is true.\n"\
>   " abort [int]: abort flag (default 0).\n"   \
>   "\n"\
>   "Returns True on success, False if you need to try again.\n"\
>   "Raises xen.lowlevel.xs.Error on error.\n"  
> \
>   "\n"
> 
> while the implementation clearly expects it [*]
> 
>int abort = 0;
>char *thstr;
> 
>if (!PyArg_ParseTupleAndKeywords(args, kwds, arg_spec, kwd_spec,
> &thstr, &abort))
>return NULL;
> 
> 
> Regards,
> Sergei
> 
> [*]: 
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob_plain;f=tools/python/xen/lowlevel/xs/xs.c;hb=HEAD


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 04/24] arm/acpi: Estimate memory required for acpi/efi tables

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Estimate the memory required for loading acpi/efi tables in Dom0. Make
> the length of each table aligned with 64bit. Alloc the pages to store
> the new created EFI and ACPI tables and free these pages when
> destroying domain.
> 
> Cc: Jan Beulich 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 
> ---
> v4: make the length of each table aligned with 64bit
> ---
>  xen/arch/arm/domain.c   |  4 +++
>  xen/arch/arm/domain_build.c | 81 
> -
>  xen/common/efi/boot.c   | 20 +++
>  xen/include/asm-arm/setup.h |  2 ++
>  4 files changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 3d274ae..1365b4a 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -640,6 +640,10 @@ void arch_domain_destroy(struct domain *d)
>  domain_vgic_free(d);
>  domain_vuart_free(d);
>  free_xenheap_page(d->shared_info);
> +#ifdef CONFIG_ACPI
> +free_xenheap_pages(d->arch.efi_acpi_table,
> +   get_order_from_bytes(d->arch.efi_acpi_len));
> +#endif
>  }
>  
>  void arch_domain_shutdown(struct domain *d)
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 83676e4..b10a69d 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -12,6 +12,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1354,6 +1356,79 @@ static int prepare_dtb(struct domain *d, struct 
> kernel_info *kinfo)
>  return -EINVAL;
>  }
>  
> +#ifdef CONFIG_ACPI
> +static int estimate_acpi_efi_size(struct domain *d, struct kernel_info 
> *kinfo)
> +{
> +u64 efi_size, acpi_size = 0, addr;
> +u32 madt_size;
> +struct acpi_table_rsdp *rsdp_tbl;
> +struct acpi_table_header *table = NULL;
> +
> +efi_size = estimate_efi_size(kinfo->mem.nr_banks);
> +
> +acpi_size += ROUNDUP(sizeof(struct acpi_table_fadt), 8);
> +acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);
> +
> +madt_size = sizeof(struct acpi_table_madt)
> ++ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
> ++ sizeof(struct acpi_madt_generic_distributor);
> +if ( d->arch.vgic.version == GIC_V3 )
> +madt_size += sizeof(struct acpi_madt_generic_redistributor)
> + * d->arch.vgic.nr_regions;
> +acpi_size += ROUNDUP(madt_size, 8);
> +
> +addr = acpi_os_get_root_pointer();
> +if ( !addr )
> +{
> +printk("Unable to get acpi root pointer\n");
> +return -EINVAL;
> +}
> +rsdp_tbl = acpi_os_map_memory(addr, sizeof(struct acpi_table_rsdp));
> +table = acpi_os_map_memory(rsdp_tbl->xsdt_physical_address,
> +   sizeof(struct acpi_table_header));
> +/* Add place for STAO table in XSDT table */
> +acpi_size += ROUNDUP(table->length + sizeof(u64), 8);
> +acpi_os_unmap_memory(table, sizeof(struct acpi_table_header));
> +acpi_os_unmap_memory(rsdp_tbl, sizeof(struct acpi_table_rsdp));
> +
> +acpi_size += ROUNDUP(sizeof(struct acpi_table_rsdp), 8);
> +d->arch.efi_acpi_len = ROUNDUP(efi_size, 8) + ROUNDUP(acpi_size, 8);
> +
> +return 0;
> +}
> +
> +static int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
> +{
> +int rc = 0;
> +int order;
> +
> +rc = estimate_acpi_efi_size(d, kinfo);
> +if ( rc != 0 )
> +return rc;
> +
> +order = get_order_from_bytes(d->arch.efi_acpi_len);
> +d->arch.efi_acpi_table = alloc_xenheap_pages(order, 0);
> +if ( d->arch.efi_acpi_table == NULL )
> +{
> +printk("unable to allocate memory!\n");
> +return -ENOMEM;
> +}
> +memset(d->arch.efi_acpi_table, 0, d->arch.efi_acpi_len);
> +
> +/* For ACPI, Dom0 doesn't use kinfo->gnttab_start to get the grant table
> + * region. So we use it as the ACPI table mapped address. */
> +d->arch.efi_acpi_gpa = kinfo->gnttab_start;
> +
> +return 0;
> +}
> +#else
> +static int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
> +{
> +/* Only booting with ACPI will hit here */
> +BUG_ON(1);
> +return -EINVAL;
> +}
> +#endif
>  static void dtb_load(struct kernel_info *kinfo)
>  {
>  void * __user dtb_virt = (void * __user)(register_t)kinfo->dtb_paddr;
> @@ -1540,7 +1615,11 @@ int construct_dom0(struct domain *d)
>  allocate_memory(d, &kinfo);
>  find_gnttab_region(d, &kinfo);
>  
> -rc = prepare_dtb(d, &kinfo);
> +if ( acpi_disabled )
> +rc = prepare_dtb(d, &kinfo);
> +else
> +rc = prepare_acpi(d, &kinfo);
> +
>  if ( rc < 0 )
>  return rc;
>  
> diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
> index 53c7452..535c685 100644
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
>

[Xen-devel] [PATCH 1/2] build: consolidate CONFIG_HAS_VGA and CONFIG_VGA

2016-02-29 Thread Doug Goldstein

No real advantage to keeping these separate. The use case of this from
Linux is when the platform or target board has support for something but
the user wants to be given the option to disable it.

Signed-off-by: Doug Goldstein 
---
CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
---
 xen/arch/x86/Kconfig | 2 +-
 xen/drivers/video/Kconfig| 4 ++--
 xen/drivers/video/Makefile   | 4 ++--
 xen/include/asm-x86/config.h | 1 -
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index ca233b7..73f79cc 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -19,8 +19,8 @@ config X86
select HAS_PASSTHROUGH
select HAS_PCI
select HAS_PDX
-   select HAS_VGA
select NUMA
+   select VGA
 
 config ARCH_DEFCONFIG
string
diff --git a/xen/drivers/video/Kconfig b/xen/drivers/video/Kconfig
index 8b25694..03e1e18 100644
--- a/xen/drivers/video/Kconfig
+++ b/xen/drivers/video/Kconfig
@@ -3,8 +3,8 @@
 config HAS_VIDEO
bool
 
-# Select HAS_VGA if VGA is supported
-config HAS_VGA
+# Select VGA if VGA is supported
+config VGA
bool
select HAS_VIDEO
 
diff --git a/xen/drivers/video/Makefile b/xen/drivers/video/Makefile
index b9b5e23..fab7aba 100644
--- a/xen/drivers/video/Makefile
+++ b/xen/drivers/video/Makefile
@@ -1,7 +1,7 @@
-obj-$(CONFIG_HAS_VGA) := vga.o
+obj-$(CONFIG_VGA) := vga.o
 obj-$(CONFIG_HAS_VIDEO) += font_8x14.o
 obj-$(CONFIG_HAS_VIDEO) += font_8x16.o
 obj-$(CONFIG_HAS_VIDEO) += font_8x8.o
 obj-$(CONFIG_HAS_VIDEO) += lfb.o
-obj-$(CONFIG_HAS_VGA) += vesa.o
+obj-$(CONFIG_VGA) += vesa.o
 obj-$(CONFIG_HAS_ARM_HDLCD) += arm_hdlcd.o
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index 08addfb..004a7f6 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -42,7 +42,6 @@
 #define CONFIG_ACPI_SRAT 1
 #define CONFIG_ACPI_CSTATE 1
 
-#define CONFIG_VGA 1
 #define CONFIG_VIDEO 1
 
 #define CONFIG_WATCHDOG 1
-- 
2.4.10


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 2/2] build: consolidate CONFIG_HAS_VIDEO and CONFIG_VIDEO

2016-02-29 Thread Doug Goldstein

No real advantage to keeping these separate. The use case of this from
Linux is when the platform or target board has support for something but
the user wants to be given the option to disable it.

Signed-off-by: Doug Goldstein 
---
CC: Ian Campbell 
CC: Stefano Stabellini 
CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
---
 xen/arch/arm/Kconfig | 2 +-
 xen/drivers/Makefile | 2 +-
 xen/drivers/video/Kconfig| 6 +++---
 xen/drivers/video/Makefile   | 8 
 xen/include/asm-arm/config.h | 2 --
 xen/include/asm-x86/config.h | 2 --
 6 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 60e923c..cb99df5 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -22,7 +22,7 @@ config ARM
select HAS_MEM_ACCESS
select HAS_PASSTHROUGH
select HAS_PDX
-   select HAS_VIDEO
+   select VIDEO
 
 config ARCH_DEFCONFIG
string
diff --git a/xen/drivers/Makefile b/xen/drivers/Makefile
index 6270e83..1939180 100644
--- a/xen/drivers/Makefile
+++ b/xen/drivers/Makefile
@@ -3,4 +3,4 @@ subdir-$(CONFIG_HAS_CPUFREQ) += cpufreq
 subdir-$(CONFIG_HAS_PCI) += pci
 subdir-$(CONFIG_HAS_PASSTHROUGH) += passthrough
 subdir-$(CONFIG_ACPI) += acpi
-subdir-$(CONFIG_HAS_VIDEO) += video
+subdir-$(CONFIG_VIDEO) += video
diff --git a/xen/drivers/video/Kconfig b/xen/drivers/video/Kconfig
index 03e1e18..739fe6f 100644
--- a/xen/drivers/video/Kconfig
+++ b/xen/drivers/video/Kconfig
@@ -1,12 +1,12 @@
 
-# Select HAS_VIDEO if video is supported
-config HAS_VIDEO
+# Select VIDEO if video is supported
+config VIDEO
bool
 
 # Select VGA if VGA is supported
 config VGA
bool
-   select HAS_VIDEO
+   select VIDEO
 
 # Select HAS_ARM_HDLCD if ARM HDLCD is supported
 config HAS_ARM_HDLCD
diff --git a/xen/drivers/video/Makefile b/xen/drivers/video/Makefile
index fab7aba..2bb91d6 100644
--- a/xen/drivers/video/Makefile
+++ b/xen/drivers/video/Makefile
@@ -1,7 +1,7 @@
 obj-$(CONFIG_VGA) := vga.o
-obj-$(CONFIG_HAS_VIDEO) += font_8x14.o
-obj-$(CONFIG_HAS_VIDEO) += font_8x16.o
-obj-$(CONFIG_HAS_VIDEO) += font_8x8.o
-obj-$(CONFIG_HAS_VIDEO) += lfb.o
+obj-$(CONFIG_VIDEO) += font_8x14.o
+obj-$(CONFIG_VIDEO) += font_8x16.o
+obj-$(CONFIG_VIDEO) += font_8x8.o
+obj-$(CONFIG_VIDEO) += lfb.o
 obj-$(CONFIG_VGA) += vesa.o
 obj-$(CONFIG_HAS_ARM_HDLCD) += arm_hdlcd.o
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index c0ad469..b5d155e 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -33,8 +33,6 @@
 
 #define CONFIG_SMP 1
 
-#define CONFIG_VIDEO 1
-
 #define CONFIG_IRQ_HAS_MULTIPLE_ACTION 1
 
 #define CONFIG_PAGEALLOC_MAX_ORDER 18
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index 004a7f6..4527ce3 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -42,8 +42,6 @@
 #define CONFIG_ACPI_SRAT 1
 #define CONFIG_ACPI_CSTATE 1
 
-#define CONFIG_VIDEO 1
-
 #define CONFIG_WATCHDOG 1
 
 #define CONFIG_MULTIBOOT 1
-- 
2.4.10


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 0/7] libxl: add support for FreeBSD block hotplug scripts

2016-02-29 Thread George Dunlap

On Thu, Feb 25, 2016 at 7:25 PM, Roger Pau Monne  wrote:
> This series enables using hotplug scripts with the FreeBSD blkback
> implementation. Since FreeBSD blkback can use both block devices and regular
> RAW files as disks, the physical-device xenstore backend node is now
> OS-specific, Linux and NetBSD will encode the device major and minor numbers
> there, while FreeBSD simply puts an absolute path to a disk image.

Just to catch me up here -- is this an incompatible thing  that
FreeBSD *already* does, or something you're adding with this series?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 3/4] libelf: rewrite symtab/strtab loading

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 11:57,  wrote:
> El 29/2/16 a les 10:31, Jan Beulich ha escrit:
> On 26.02.16 at 18:02,  wrote:
>>> The layout is as follows (I should add this to the patch itself as a
>>> comment, since I guess this is still quite confusing):
>>>
>>> ++
>>> ||
>>> | KERNEL |
>>> ||
>>> ++ pend
>>> | ALIGNMENT  |
>>> ++ bsd_symtab_pstart
>>> ||
>>> | size   | bsd_symtab_pend - bsd_symtab_pstart
>>> ||
>>> ++
>>> ||
>>> | ELF header |
>>> ||
>>> ++
>>> ||
>>> | ELF section header 0   | Dummy section header
>>> ||
>>> ++
>>> ||
>>> | ELF section header 1   | SYMTAB section header
>>> ||
>>> ++
>>> ||
>>> | ELF section header 2   | STRTAB section header
>>> ||
>>> ++
>>> ||
>>> | SYMTAB |
>>> ||
>>> ++
>>> ||
>>> | STRTAB |
>>> ||
>>> ++ bsd_symtab_pend
>>>
>>> There are always only tree sections because that's all we need, section
>>> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is
>>> used to describe the STRTAB.
>> 
>> All fine, but this still doesn't clarify how the kernel learns where
>> bsd_symtab_pstart is.
> 
> The BSDs linker scripts places an "end" symbol after all the loaded
> sections:
> 
> https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870&; 
> view=co

That's fine. But how do they know it is legitimate to even touch what
follows, not to speak of assign meaning to the value found there?
And are there no alignment/padding considerations necessary?

>>> According to the ELF spec there can only be
>>> one SYMTAB, so that's all we need.
>> 
>> Did you perhaps overlook the spec saying "... but this restriction
>> may be relaxed in the future", plus the fact that we're talking
>> about an executable file here (which commonly have two symbol
>> tables - .dynsym and .symtab), not an object one? (This isn't to
>> say that you need to make the code handle multiple ones, if you
>> know that *BSD will only ever have one.)
> 
> DYNSYM is just a subset of SYMTAB, BSDs prefer (it's not a strict
> requirement) the SYMTAB because of the in-kernel debugger. Also DYNSYM
> is already loaded by default because it's covered by the program headers.
> 
> I can add support for loading multiple SYMTABs/STRTABs, but shouldn't we
> wait until the spec is updated? As I read the spec ATM, an ELF file with
> multiple SYMTABs is invalid. I assume that if the ELF format ever allows
> more than one SYMTAB, the version is going to be bumped at least (or
> some other field is going to be used to signal this change).

As said I don't see the need for you to add multiple symbol table
support. I only meant to point out that the potential exists.

>  sz = elf_round_up(elf, sz);
>  
> -/* Space for the symbol and string tables. */
> +/* Space for the symbol and string table. */
>  for ( i = 0; i < elf_shdr_count(elf); i++ )
>  {
>  shdr = elf_shdr_by_index(elf, i);
>  if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>  /* input has an insane section header count field */
>  break;
> -type = elf_uval(elf, shdr, sh_type);
> -if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) )
> -sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
> +
> +if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB )
> +continue;
> +
> +sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
> +shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link));
> +
> +if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
> +/* input has an insane section header count field */
> +break;
> +
> +if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB )
> +/* Invalid symtab -> strtab link */
> +break;

 This is not sufficient - what if sh_link is out of bounds, but the
 bogusly accessed field happens to hold SHT_STRTAB? (Oddly
 enough you have at least an SHN_UNDEF check in the second
 loop below.)
>>>
>>> The out-of-bounds access would be detected by elf_access_ok (if it's out
>>> of the scope of the ELF file, which is all we care about). In fact the
>>> elf_access_ok above could be removed because elf_uva

Re: [Xen-devel] [PATCH v4 05/24] arm/acpi: Add a helper function to get the acpi table offset

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> These tables are aligned with 64bit.
> 
> Signed-off-by: Shannon Zhao 
> ---
> v4: aligned with 64bit
> ---
>  xen/arch/arm/acpi/lib.c| 14 ++
>  xen/include/asm-arm/acpi.h |  2 ++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/xen/arch/arm/acpi/lib.c b/xen/arch/arm/acpi/lib.c
> index 5698245..d5ac590 100644
> --- a/xen/arch/arm/acpi/lib.c
> +++ b/xen/arch/arm/acpi/lib.c
> @@ -62,3 +62,17 @@ bool_t __init acpi_psci_hvc_present(void)
>  {
>  return acpi_gbl_FADT.arm_boot_flags & ACPI_FADT_PSCI_USE_HVC;
>  }
> +
> +unsigned int acpi_get_table_offset(struct membank tbl_add[], EFI_MEM_RES 
> index)
> +{
> +int i;
> +unsigned int offset = 0;
> +
> +for ( i = 0; i < index; i++ )
> +{
> +/* Aligned with 64bit (8 bytes) */
> +offset += ROUNDUP(tbl_add[i].size, 8);
> +}
> +
> +return offset;

offset and the return type of this function need to be of the same type
of tbl_add[i].size, which is paddr_t (u64).


> +}
> diff --git a/xen/include/asm-arm/acpi.h b/xen/include/asm-arm/acpi.h
> index 7f59761..b507dca 100644
> --- a/xen/include/asm-arm/acpi.h
> +++ b/xen/include/asm-arm/acpi.h
> @@ -25,6 +25,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #define COMPILER_DEPENDENT_INT64   long long
>  #define COMPILER_DEPENDENT_UINT64  unsigned long long
> @@ -45,6 +46,7 @@ typedef enum {
>  bool_t __init acpi_psci_present(void);
>  bool_t __init acpi_psci_hvc_present(void);
>  void __init acpi_smp_init_cpus(void);
> +unsigned int acpi_get_table_offset(struct membank tbl_add[], EFI_MEM_RES 
> index);
>  
>  #ifdef CONFIG_ACPI
>  extern bool_t acpi_disabled;
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 0/7] libxl: add support for FreeBSD block hotplug scripts

2016-02-29 Thread Roger Pau Monné

El 29/2/16 a les 13:15, George Dunlap ha escrit:
> On Thu, Feb 25, 2016 at 7:25 PM, Roger Pau Monne  wrote:
>> This series enables using hotplug scripts with the FreeBSD blkback
>> implementation. Since FreeBSD blkback can use both block devices and regular
>> RAW files as disks, the physical-device xenstore backend node is now
>> OS-specific, Linux and NetBSD will encode the device major and minor numbers
>> there, while FreeBSD simply puts an absolute path to a disk image.
> 
> Just to catch me up here -- is this an incompatible thing  that
> FreeBSD *already* does, or something you're adding with this series?

I assume you mean the usage of the "physical-device" node, right? In
which case, this is something that I'm adding with this series (and some
FreeBSD kernel changes, of course). Current FreeBSD code doesn't use
physical-device at all.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 10/11] xen: modify page table construction

2016-02-29 Thread Juergen Gross

On 29/02/16 10:13, Juergen Gross wrote:
> On 25/02/16 19:33, Andrei Borzenkov wrote:
>> 22.02.2016 16:14, Juergen Gross пишет:
>>> On 22/02/16 13:48, Daniel Kiper wrote:
 On Mon, Feb 22, 2016 at 01:30:30PM +0100, Juergen Gross wrote:
> On 22/02/16 13:18, Daniel Kiper wrote:
>> On Mon, Feb 22, 2016 at 10:29:04AM +0100, Juergen Gross wrote:
>>> On 22/02/16 10:17, Daniel Kiper wrote:
 On Mon, Feb 22, 2016 at 07:03:18AM +0100, Juergen Gross wrote:
> diff --git a/grub-core/lib/xen/relocator.c 
> b/grub-core/lib/xen/relocator.c
> index 8f427d3..a05b253 100644
> --- a/grub-core/lib/xen/relocator.c
> +++ b/grub-core/lib/xen/relocator.c
> @@ -29,6 +29,11 @@
>
>  typedef grub_addr_t grub_xen_reg_t;
>
> +struct grub_relocator_xen_paging_area {
> +  grub_xen_reg_t start;
> +  grub_xen_reg_t size;
> +};
> +

 ... this should have GRUB_PACKED because compiler may
 add padding to align size member.
>>>
>>> Why would the compiler add padding to a structure containing two items
>>> of the same type? I don't think the C standard would allow this.
>>>
>>> grub_xen_reg_t is either unsigned (32 bit) or unsigned long (64 bit).
>>> There is no way this could require any padding.
>>
>> You are right but we should add this here just in case.
>
> Sorry, I don't think this makes any sense. The C standard is very clear
> in this case: a type requiring a special alignment has always a length
> being a multiple of that alignment. Otherwise arrays wouldn't work.

 Sorry, I am not sure what do you mean by that.
>>>
>>> The size of any C type (no matter whether it is an integral type like
>>> "int" or a structure) has always the same alignment restriction as the
>>> type itself. So a type requiring 8 byte alignment will always have a
>>> size of a multiple of 8 bytes. This is mandatory for arrays to work, as
>>> otherwise either the elements wouldn't be placed consecutively in memory
>>> or the alignment restrictions wouldn't be obeyed for all elements.
>>>
>>
>> I too not follow how it is relevant to this case. We talk about internal
>> padding between structure members, not between array elements.
>>
>>> For our case it means that two structure elements of the same type will
>>> never require a padding between them, thus the annotation with "packed"
>>> can't serve any purpose.
>>>
>>
>> Well, I am not aware of any requirement. Compiler may add arbitrary
>> padding between structure elements; it is only prohibited to add padding
>> at the beginning. Sure, it would be unusual, but never say "never" ...
>> also should Xen ever be ported to architecture where types are not
>> self-aligned it will become an issue.
> 
> So you are telling me that _all_ interfaces between e.g. Linux, grub2,
> Xen and all wire protocols not attributed with "packed" are just wrong?
> 
> Sorry, I don't think this is true.

Okay, just found a reference: The x86 ABI states:

Aggregates and Unions
-
Structures and unions assume the alignment of their most strictly
aligned component. Each member is assigned to the lowest available
offset with the appropriate alignment. The size of any object is always
a multiple of the object‘s alignment.

I don't think any x86 C-compiler will violate the x86 ABI.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-02-29 Thread Haozhong Zhang

On 02/29/16 05:04, Jan Beulich wrote:
> >>> On 29.02.16 at 12:52,  wrote:
> > On 02/29/16 03:12, Jan Beulich wrote:
> >> >>> On 29.02.16 at 10:45,  wrote:
> >> > On 02/29/16 02:01, Jan Beulich wrote:
> >> >> >>> On 28.02.16 at 15:48,  wrote:
> >> >> > Anyway, we may avoid some conflicts between ACPI tables/objects by
> >> >> > restricting which tables and objects can be passed from QEMU to Xen:
> >> >> > (1) For ACPI tables, xen does not accept those built by itself,
> >> >> > e.g. DSDT and SSDT.
> >> >> > (2) xen does not accept ACPI tables for devices that are not attached 
> >> >> > to
> >> >> > a domain, e.g. if NFIT cannot be passed if a domain does not have
> >> >> > vNVDIMM.
> >> >> > (3) For ACPI objects, xen only accepts namespace devices and requires
> >> >> > their names does not conflict with existing ones provided by Xen.
> >> >> 
> >> >> And how do you imagine to enforce this without parsing the
> >> >> handed AML? (Remember there's no AML parser in hvmloader.)
> >> > 
> >> > As I proposed in last reply, instead of passing an entire ACPI object,
> >> > QEMU passes the device name and the AML code under the AML device entry
> >> > separately. Because the name is explicitly given, no AML parser is
> >> > needed in hvmloader.
> >> 
> >> I must not only have missed that proposal, but I also don't see
> >> how you mean this to work: Are you suggesting for hvmloader to
> >> construct valid AML from the passed in blob? Or are you instead
> >> considering to pass redundant information (name once given
> >> explicitly and once embedded in the AML blob), allowing the two
> >> to be out of sync?
> > 
> > I mean the former one.
> 
> Which will involve adding how much new code to it?
>

Because hvmloader only accepts AML device rather than arbitrary objects,
only code that builds the outmost part of AML device is needed. In ACPI
spec, an AML device is defined as
DefDevice := DeviceOp PkgLength NameString ObjectList
hvmloader only needs to build the first 3 terms, while the last one is
passed from qemu.

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 06/24] arm/acpi: Prepare FADT table for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Copy and modify FADT table before passing it to Dom0. Set PSCI_COMPLIANT
> and PSCI_USE_HVC.
> 
> Signed-off-by: Shannon Zhao 

Reviewed-by: Stefano Stabellini 


>  xen/arch/arm/domain_build.c | 43 +++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index b10a69d..f041a8a 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1357,6 +1357,44 @@ static int prepare_dtb(struct domain *d, struct 
> kernel_info *kinfo)
>  }
>  
>  #ifdef CONFIG_ACPI
> +
> +static int acpi_create_fadt(struct domain *d, struct membank tbl_add[])
> +{
> +struct acpi_table_header *table = NULL;
> +struct acpi_table_fadt *fadt = NULL;
> +u64 table_size;
> +acpi_status status;
> +u8 *base_ptr;
> +u8 checksum;
> +
> +status = acpi_get_table(ACPI_SIG_FADT, 0, &table);
> +
> +if ( ACPI_FAILURE(status) )
> +{
> +const char *msg = acpi_format_exception(status);
> +
> +printk("Failed to get FADT table, %s\n", msg);
> +return -EINVAL;
> +}
> +
> +table_size = table->length;
> +base_ptr = d->arch.efi_acpi_table
> +   + acpi_get_table_offset(tbl_add, TBL_FADT);
> +ACPI_MEMCPY(base_ptr, table, table_size);
> +fadt = (struct acpi_table_fadt *)base_ptr;
> +
> +/* Set PSCI_COMPLIANT and PSCI_USE_HVC */
> +fadt->arm_boot_flags |= (ACPI_FADT_PSCI_COMPLIANT | 
> ACPI_FADT_PSCI_USE_HVC);
> +checksum = acpi_tb_checksum(ACPI_CAST_PTR(u8, fadt), table_size);
> +fadt->header.checksum -= checksum;
> +
> +tbl_add[TBL_FADT].start = d->arch.efi_acpi_gpa
> +  + acpi_get_table_offset(tbl_add, TBL_FADT);
> +tbl_add[TBL_FADT].size = table_size;
> +
> +return 0;
> +}
> +
>  static int estimate_acpi_efi_size(struct domain *d, struct kernel_info 
> *kinfo)
>  {
>  u64 efi_size, acpi_size = 0, addr;
> @@ -1401,6 +1439,7 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>  {
>  int rc = 0;
>  int order;
> +struct membank tbl_add[TBL_MMAX] = {};
>  
>  rc = estimate_acpi_efi_size(d, kinfo);
>  if ( rc != 0 )
> @@ -1419,6 +1458,10 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>   * region. So we use it as the ACPI table mapped address. */
>  d->arch.efi_acpi_gpa = kinfo->gnttab_start;
>  
> +rc = acpi_create_fadt(d, tbl_add);
> +if ( rc != 0 )
> +return rc;
> +
>  return 0;
>  }
>  #else
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Domctl and physdevop for passthrough (Was: Re: Stabilising some tools only HVMOPs?)

2016-02-29 Thread Wei Liu

On Tue, Feb 23, 2016 at 02:31:30PM +, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 04:28:19AM -0700, Jan Beulich wrote:
> > >>> On 19.02.16 at 17:05,  wrote:
> > > On Wed, Feb 17, 2016 at 05:28:08PM +, Wei Liu wrote:
> > >> Hi all
> > >> 
> > >> Tools people are in the process of splitting libxenctrl into a set of
> > >> stable libraries. One of the proposed libraries is libxendevicemodel
> > >> which has a collection of APIs that can be used by device model.
> > >> 
> > >> Currently we use QEMU as reference to extract symbols and go through
> > >> them one by one. Along the way we discover QEMU is using some tools
> > >> only HVMOPs.
> > >> 
> > >> The list of tools only HVMOPs used by QEMU are:
> > >> 
> > >>   #define HVMOP_track_dirty_vram6
> > >>   #define HVMOP_modified_memory7
> > >>   #define HVMOP_set_mem_type8
> > >>   #define HVMOP_inject_msi 16
> > >>   #define HVMOP_create_ioreq_server 17
> > >>   #define HVMOP_get_ioreq_server_info 18
> > >>   #define HVMOP_map_io_range_to_ioreq_server 19
> > >>   #define HVMOP_unmap_io_range_from_ioreq_server 20
> > >>   #define HVMOP_destroy_ioreq_server 21
> > >>   #define HVMOP_set_ioreq_server_state 22
> > >> 
> > > 
> > > In the process of ploughing through QEMU symbols, there are some domctls
> > > and physdevops used to do  passthrough. To make passthrough APIs in
> > > libxendevicemodel we need to stabilise them as well. Can I use the same
> > > trick __XEN_TOOLS_STABLE__ here? If not, what would be the preferred way
> > > of doing this?
> > > 
> > > PASSTHRU
> > > `xc_domain_bind_pt_pci_irq` `XEN_DOMCTL_bind_pt_irq`
> > > `xc_domain_ioport_mapping`  `XEN_DOMCTL_ioport_mapping` 
> > > `xc_domain_memory_mapping`  `XEN_DOMCTL_memory_mapping` 
> > > `xc_domain_unbind_msi_irq`  `XEN_DOMCTL_unbind_pt_irq`  
> > > `xc_domain_unbind_pt_irq`   `XEN_DOMCTL_unbind_pt_irq`  
> > > `xc_domain_update_msi_irq`  `XEN_DOMCTL_bind_pt_irq`
> > > `xc_physdev_map_pirq`   `PHYSDEVOP_map_pirq`
> > > `xc_physdev_map_pirq_msi`   `PHYSDEVOP_map_pirq`
> > > `xc_physdev_unmap_pirq` `PHYSDEVOP_unmap_pirq`  
> > 
> > Mechanically I would say yes, but anything here which is also on
> > the XSA-77 waiver list would first need removing there (with
> > proper auditing and, if necessary, fixing).
> > 
> 
> I admit I failed to parse xsm-flask.txt and XSA-77 and its implication,
> so let's take a concrete example instead.
> 
> Say, now I need to stabilise XEN_DOMCTL_pin_mem_cacheattr, which is on

The conversation thus far has indicated stabilising this particular
hypercall is no go.

The higher order goal is actually pinning the memory cache attribute for
video ram. I was thinking to have a set of dedicated hypercalls for
video ram.

But then my reading of XSA-154 suggests that no untrusted entity should
be allowed to alter the caching attribute, so a set of restricted
hypercalls might not be feasible either. I would like to know if my
reading is correct.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Domctl and physdevop for passthrough (Was: Re: Stabilising some tools only HVMOPs?)

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 13:23,  wrote:
> On Tue, Feb 23, 2016 at 02:31:30PM +, Wei Liu wrote:
>> On Mon, Feb 22, 2016 at 04:28:19AM -0700, Jan Beulich wrote:
>> > >>> On 19.02.16 at 17:05,  wrote:
>> > > On Wed, Feb 17, 2016 at 05:28:08PM +, Wei Liu wrote:
>> > >> Hi all
>> > >> 
>> > >> Tools people are in the process of splitting libxenctrl into a set of
>> > >> stable libraries. One of the proposed libraries is libxendevicemodel
>> > >> which has a collection of APIs that can be used by device model.
>> > >> 
>> > >> Currently we use QEMU as reference to extract symbols and go through
>> > >> them one by one. Along the way we discover QEMU is using some tools
>> > >> only HVMOPs.
>> > >> 
>> > >> The list of tools only HVMOPs used by QEMU are:
>> > >> 
>> > >>   #define HVMOP_track_dirty_vram6
>> > >>   #define HVMOP_modified_memory7
>> > >>   #define HVMOP_set_mem_type8
>> > >>   #define HVMOP_inject_msi 16
>> > >>   #define HVMOP_create_ioreq_server 17
>> > >>   #define HVMOP_get_ioreq_server_info 18
>> > >>   #define HVMOP_map_io_range_to_ioreq_server 19
>> > >>   #define HVMOP_unmap_io_range_from_ioreq_server 20
>> > >>   #define HVMOP_destroy_ioreq_server 21
>> > >>   #define HVMOP_set_ioreq_server_state 22
>> > >> 
>> > > 
>> > > In the process of ploughing through QEMU symbols, there are some domctls
>> > > and physdevops used to do  passthrough. To make passthrough APIs in
>> > > libxendevicemodel we need to stabilise them as well. Can I use the same
>> > > trick __XEN_TOOLS_STABLE__ here? If not, what would be the preferred way
>> > > of doing this?
>> > > 
>> > > PASSTHRU
>> > > `xc_domain_bind_pt_pci_irq` `XEN_DOMCTL_bind_pt_irq`
>> > > `xc_domain_ioport_mapping`  `XEN_DOMCTL_ioport_mapping` 
>> > > `xc_domain_memory_mapping`  `XEN_DOMCTL_memory_mapping` 
>> > > `xc_domain_unbind_msi_irq`  `XEN_DOMCTL_unbind_pt_irq`  
>> > > `xc_domain_unbind_pt_irq`   `XEN_DOMCTL_unbind_pt_irq`  
>> > > `xc_domain_update_msi_irq`  `XEN_DOMCTL_bind_pt_irq`
>> > > `xc_physdev_map_pirq`   `PHYSDEVOP_map_pirq`
>> > > `xc_physdev_map_pirq_msi`   `PHYSDEVOP_map_pirq`
>> > > `xc_physdev_unmap_pirq` `PHYSDEVOP_unmap_pirq`  
>> > 
>> > Mechanically I would say yes, but anything here which is also on
>> > the XSA-77 waiver list would first need removing there (with
>> > proper auditing and, if necessary, fixing).
>> > 
>> 
>> I admit I failed to parse xsm-flask.txt and XSA-77 and its implication,
>> so let's take a concrete example instead.
>> 
>> Say, now I need to stabilise XEN_DOMCTL_pin_mem_cacheattr, which is on
> 
> The conversation thus far has indicated stabilising this particular
> hypercall is no go.
> 
> The higher order goal is actually pinning the memory cache attribute for
> video ram. I was thinking to have a set of dedicated hypercalls for
> video ram.
> 
> But then my reading of XSA-154 suggests that no untrusted entity should
> be allowed to alter the caching attribute, so a set of restricted
> hypercalls might not be feasible either. I would like to know if my
> reading is correct.

Yes, your reading is mostly correct: Of course this can be
permitted eventually, but only after having made such a model
safe against abuse.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 08/22] arm/acpi: Parse MADT to map logical cpu to MPIDR and get cpu_possible_map

2016-02-29 Thread Jan Beulich

>>> On 27.02.16 at 07:33,  wrote:
> From: Parth Dixit 
> 
> MADT contains the information for MPIDR which is essential for SMP
> initialization, parse the GIC cpu interface structures to get the MPIDR
> value and map it to cpu_logical_map(), and add enabled cpu with valid
> MPIDR into cpu_possible_map.
> 
> Move BAD_MADT_ENTRY to common place, parenthesize its parameters and
> drop the pointer cast.
> 
> Cc: Jan Beulich 
> Signed-off-by: Hanjun Guo 
> Signed-off-by: Tomasz Nowicki 
> Signed-off-by: Naresh Bhat 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 
> Reviewed-by: Stefano Stabellini 

Acked-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 10/22] acpi/table: Introduce acpi_table_get_entry_madt to get specified entry

2016-02-29 Thread Jan Beulich

>>> On 27.02.16 at 07:37,  wrote:
> From: Shannon Zhao 
> 
> This function could get the specified index entry of MADT table. This
> would be useful when it needs to get the contens of the entry.
> 
> Signed-off-by: Shannon Zhao 

Acked-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 07/24] arm/gic: Add a new callback for creating MADT table for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Add a new member in gic_hw_operations which is used to creat MADT table
> for Dom0.
> 
> Signed-off-by: Shannon Zhao 

Reviewed-by: Stefano Stabellini 


>  xen/arch/arm/gic-v2.c | 34 ++
>  xen/arch/arm/gic-v3.c | 47 
> +++
>  xen/arch/arm/gic.c|  5 +
>  xen/include/asm-arm/gic.h |  3 +++
>  4 files changed, 89 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> index 0fcb894..02db5f2 100644
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -685,6 +685,35 @@ static void __init gicv2_dt_init(void)
>  }
>  
>  #ifdef CONFIG_ACPI
> +static u32 gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
> +{
> +struct acpi_subtable_header *header;
> +struct acpi_madt_generic_interrupt *host_gicc, *gicc;
> +u32 i, size, table_len = 0;
> +u8 *base_ptr = d->arch.efi_acpi_table + offset;
> +
> +header = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0);
> +if ( !header )
> +panic("Can't get GICC entry");
> +host_gicc = container_of(header, struct acpi_madt_generic_interrupt,
> + header);
> +
> +size = sizeof(struct acpi_madt_generic_interrupt);
> +/* Add Generic Interrupt */
> +for ( i = 0; i < d->max_vcpus; i++ )
> +{
> +gicc = (struct acpi_madt_generic_interrupt *)(base_ptr + table_len);
> +ACPI_MEMCPY(gicc, host_gicc, size);
> +gicc->cpu_interface_number = i;
> +gicc->uid = i;
> +gicc->flags = ACPI_MADT_ENABLED;
> +gicc->arm_mpidr = vcpuid_to_vaffinity(i);
> +table_len += size;
> +}
> +
> +return table_len;
> +}
> +
>  static int __init
>  gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
>  const unsigned long end)
> @@ -776,6 +805,10 @@ static void __init gicv2_acpi_init(void)
>  }
>  #else
>  static void __init gicv2_acpi_init(void) { }
> +static u32 gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
> +{
> +return 0;
> +}
>  #endif
>  
>  static int __init gicv2_init(void)
> @@ -868,6 +901,7 @@ const static struct gic_hw_operations gicv2_ops = {
>  .read_vmcr_priority  = gicv2_read_vmcr_priority,
>  .read_apr= gicv2_read_apr,
>  .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
> +.make_hwdom_madt = gicv2_make_hwdom_madt,
>  };
>  
>  /* Set up the GIC */
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index f83fd88..d9fce4b 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1236,6 +1236,48 @@ static void __init gicv3_dt_init(void)
>  }
>  
>  #ifdef CONFIG_ACPI
> +static u32 gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
> +{
> +struct acpi_subtable_header *header;
> +struct acpi_madt_generic_interrupt *host_gicc, *gicc;
> +struct acpi_madt_generic_redistributor *gicr;
> +u8 *base_ptr = d->arch.efi_acpi_table + offset;
> +u32 i, table_len = 0, size;
> +
> +/* Add Generic Interrupt */
> +header = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0);
> +if ( !header )
> +panic("Can't get GICC entry");
> +host_gicc = container_of(header, struct acpi_madt_generic_interrupt,
> + header);
> +
> +size = sizeof(struct acpi_madt_generic_interrupt);
> +for ( i = 0; i < d->max_vcpus; i++ )
> +{
> +gicc = (struct acpi_madt_generic_interrupt *)(base_ptr + table_len);
> +ACPI_MEMCPY(gicc, host_gicc, size);
> +gicc->cpu_interface_number = i;
> +gicc->uid = i;
> +gicc->flags = ACPI_MADT_ENABLED;
> +gicc->arm_mpidr = vcpuid_to_vaffinity(i);
> +table_len += size;
> +}
> +
> +/* Add Generic Redistributor */
> +size = sizeof(struct acpi_madt_generic_redistributor);
> +for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
> +{
> +gicr = (struct acpi_madt_generic_redistributor *)(base_ptr + 
> table_len);
> +gicr->header.type = ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR;
> +gicr->header.length = size;
> +gicr->base_address = d->arch.vgic.rdist_regions[i].base;
> +gicr->length = d->arch.vgic.rdist_regions[i].size;
> +table_len += size;
> +}
> +
> +return table_len;
> +}
> +
>  static int __init
>  gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
>  const unsigned long end)
> @@ -1380,6 +1422,10 @@ static void __init gicv3_acpi_init(void)
>  }
>  #else
>  static void __init gicv3_acpi_init(void) { }
> +static u32 gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
> +{
> +return 0;
> +}
>  #endif
>  
>  /* Set up the GIC */
> @@ -1474,6 +1520,7 @@ static const struct gic_hw_operations gicv3_ops = {
>  .read_apr= gicv3_read_apr,
>  .secondary_init  = gicv3_secondary_cpu_init

Re: [Xen-devel] [PATCH v4 08/24] arm/acpi: Prepare MADT table for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Copy main MADT table contents and distributor subtable from physical
> ACPI MADT table. Make other subtables through the callback of
> gic_hw_ops.
> 
> Signed-off-by: Shannon Zhao 
> ---
> v4: use make_hwdom_madt callback to create MADT
> ---
>  xen/arch/arm/domain_build.c | 50 
> +
>  1 file changed, 50 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index f041a8a..e373165 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1357,6 +1357,52 @@ static int prepare_dtb(struct domain *d, struct 
> kernel_info *kinfo)
>  }
>  
>  #ifdef CONFIG_ACPI
> +static int acpi_create_madt(struct domain *d, struct membank tbl_add[])
> +{
> +struct acpi_table_header *table = NULL;
> +struct acpi_table_madt *madt = NULL;
> +struct acpi_subtable_header *header;
> +struct acpi_madt_generic_distributor *gicd;
> +u32 table_size = sizeof(struct acpi_table_madt);
> +u32 offset = acpi_get_table_offset(tbl_add, TBL_MADT);
> +acpi_status status;
> +u8 *base_ptr, checksum;
> +
> +status = acpi_get_table(ACPI_SIG_MADT, 0, &table);
> +
> +if ( ACPI_FAILURE(status) )
> +{
> +const char *msg = acpi_format_exception(status);
> +
> +printk("Failed to get MADT table, %s\n", msg);
> +return -EINVAL;
> +}
> +
> +base_ptr = d->arch.efi_acpi_table + offset;
> +ACPI_MEMCPY(base_ptr, table, table_size);
> +
> +/* Add Generic Distributor */
> +header = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_DISTRIBUTOR, 
> 0);
> +if ( !header )
> +panic("Can't get GICC entry");
^GICD

Aside from this:

Reviewed-by: Stefano Stabellini 


> +gicd = container_of(header, struct acpi_madt_generic_distributor, 
> header);
> +ACPI_MEMCPY(base_ptr + table_size, gicd,
> +sizeof(struct acpi_madt_generic_distributor));
> +table_size += sizeof(struct acpi_madt_generic_distributor);
> +
> +/* Add other subtables*/
> +table_size += gic_make_hwdom_madt(d, offset + table_size);
> +
> +madt = (struct acpi_table_madt *)base_ptr;
> +madt->header.length = table_size;
> +checksum = acpi_tb_checksum(ACPI_CAST_PTR(u8, madt), table_size);
> +madt->header.checksum -= checksum;
> +
> +tbl_add[TBL_MADT].start = d->arch.efi_acpi_gpa + offset;
> +tbl_add[TBL_MADT].size = table_size;
> +
> +return 0;
> +}
>  
>  static int acpi_create_fadt(struct domain *d, struct membank tbl_add[])
>  {
> @@ -1462,6 +1508,10 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>  if ( rc != 0 )
>  return rc;
>  
> +rc = acpi_create_madt(d, tbl_add);
> +if ( rc != 0 )
> +return rc;
> +
>  return 0;
>  }
>  #else
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Xen 4.7 Development Update

2016-02-29 Thread Meng Xu

On Mon, Feb 29, 2016 at 6:17 AM, Wei Liu  wrote:
> NOTE: We are one month away from freeze. Features that wish to be in 4.7 must
> be posted to xen-devel by March 18.
>
> This email only tracks big items for xen.git tree. Please reply for items you
> woulk like to see in 4.7 so that people have an idea what is going on and
> prioritise accordingly.
>
> You're welcome to provide description and use cases of the feature you're
> working on.
>
> = Timeline =
>
> We now adopt a fixed cut-off date scheme. We will release twice a
> year. The upcoming 4.7 timeline are as followed:
>
> * Last posting date: March 18, 2016
> * Hard code freeze: April 1, 2016
> * RC1: TBD
> * Release: June 3, 2016
>
> Note that we don't have freeze exception scheme anymore. All patches
> that wish to go into 4.7 must be posted no later than the last posting
> date. All patches posted after that date will be automatically queued
> into next release.
>
> RCs will be arranged immediately after freeze.
>
> = Projects =
>
> == Hypervisor ==
>
> *  Convert RTDS from time to event-driven model
>   -  Meng Xu
>   -  Tianyang Chen

We will try our best to meet this release cycle, although we are not
very sure if we can definitely meet the deadline. The patch is in a
correct direction and is in a good state, though.

Dario, what do you think? If we move faster this week in resolving the
comments, is there any chance that we could potentially finish it?

Thanks,

Meng

---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v14 1/2] vmx: VT-d posted-interrupt core logic handling

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 04:00,  wrote:
> This is the core logic handling for VT-d posted-interrupts. Basically it
> deals with how and when to update posted-interrupts during the following
> scenarios:
> - vCPU is preempted
> - vCPU is slept
> - vCPU is blocked
> 
> When vCPU is preempted/slept, we update the posted-interrupts during
> scheduling by introducing two new architecutral scheduler hooks:
> vmx_pi_switch_from() and vmx_pi_switch_to(). When vCPU is blocked, we
> introduce a new architectural hook: arch_vcpu_block() to update
> posted-interrupts descriptor.
> 
> Besides that, before VM-entry, we will make sure the 'NV' filed is set
> to 'posted_intr_vector' and the vCPU is not in any blocking lists, which
> is needed when vCPU is running in non-root mode. The reason we do this check
> is because we change the posted-interrupts descriptor in vcpu_block(),
> however, we don't change it back in vcpu_unblock() or when vcpu_block()
> directly returns due to event delivery (in fact, we don't need to do it
> in the two places, that is why we do it before VM-Entry).
> 
> When we handle the lazy context switch for the following two scenarios:
> - Preempted by a tasklet, which uses in an idle context.
> - the prev vcpu is in offline and no new available vcpus in run queue.
> We don't change the 'SN' bit in posted-interrupt descriptor, this
> may incur spurious PI notification events, but since PI notification
> event is only sent when 'ON' is clear, and once the PI notificatoin
> is sent, ON is set by hardware, hence no more notification events
> before 'ON' is clear. Besides that, spurious PI notification events are
> going to happen from time to time in Xen hypervisor, such as, when
> guests trap to Xen and PI notification event happens, there is
> nothing Xen actually needs to do about it, the interrupts will be
> delivered to guest atht the next time we do a VMENTRY.
> 
> CC: Keir Fraser 
> CC: Jan Beulich 
> CC: Andrew Cooper 
> CC: Kevin Tian 
> CC: George Dunlap 
> CC: Dario Faggioli 
> Suggested-by: Yang Zhang 
> Suggested-by: Dario Faggioli 
> Suggested-by: George Dunlap 
> Suggested-by: Jan Beulich 
> Signed-off-by: Feng Wu 
> Reviewed-by: George Dunlap 

With the comments George gave on v13 subsequent to this tag
I'm not sure it was correct to retain it. George?

Reviewed-by: Jan Beulich 
albeit in case another version is needed ...

> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -565,6 +565,18 @@ const char *hvm_efer_valid(const struct vcpu *v, 
> uint64_t value,
> signed int cr0_pg);
>  unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t 
> restore);
>  
> +/*
> + * This must be defined as a macro instead of an inline function,
> + * because it uses 'struct vcpu' and 'struct domain' which have
> + * not been defined yet.
> + */
> +#define arch_vcpu_block(v) ({ \
> +struct vcpu *v_ = (v);\
> +if ( has_hvm_container_vcpu(v_) &&\
> + (v_)->domain->arch.hvm_domain.vmx.vcpu_block )   \
> +(v_)->domain->arch.hvm_domain.vmx.vcpu_block(v_); \
> +})

... please drop the stray parentheses here (I'll try to remember to
do so while committing if this is the version to go in).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 1/5] x86/hvm: Setup TSC scaling ratio

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 13:54,  wrote:
> This patch adds a field tsc_scaling_ratio in struct hvm_domain to record
> the per-domain TSC scaling ratio, and sets it in tsc_set_info().
> 
> Before setting the per-domain TSC scaling ratio, we check its validity
> in tsc_set_info(). If an invalid ratio is given, we will leave the
> default value in tsc_scaling_ratio (i.e. ratio = 1) and setup guest TSC
> as if no TSC scaling is used:
> * For TSC_MODE_FAULT,
>   - if a user-specified TSC frequency is given, we will set the guest
> TSC frequency to it; otherwise, we set it to the host TSC frequency.
>   - if guest TSC frequency does not equal to host TSC frequency, we will
> emulate guest TSC (i.e. d->arch.vtsc is set to 1). In both cases,
> guest TSC runs in the guest TSC frequency.
> * For TSC_MODE_PVRDTSCP,
>   - we set the guest TSC frequency to the host TSC frequency.
>   - guest rdtsc is executed natively in the host TSC frequency as
> before.
>   - if rdtscp is not available to guest, it will be emulated; otherwise,
> it will be executed natively. In both cases, guest rdtscp gets TSC
> in the host TSC frequency as before.
> 
> Signed-off-by: Haozhong Zhang 

Reviewed-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 2/5] x86/hvm: Replace architecture TSC scaling by a common function

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 13:54,  wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -333,6 +333,23 @@ u64 hvm_get_tsc_scaling_ratio(u32 gtsc_khz)
>  return ratio > max_ratio ? 0 : ratio;
>  }
>  
> +u64 hvm_scale_tsc(const struct domain *d, u64 tsc)
> +{
> +u64 ratio = d->arch.hvm_domain.tsc_scaling_ratio;
> +u64 dummy;
> +
> +if ( ratio == hvm_default_tsc_scaling_ratio )
> +return tsc;
> +
> +/* tsc = (tsc * ratio) >> hvm_funcs.tsc_scaling.ratio_frac_bits */
> +asm ( "mulq %[ratio]; shrdq %[frac],%%rdx,%[tsc]"
> +  : [tsc] "+a" (tsc), "=d" (dummy)

Is mixing named and positional asm() operands supported by all
gcc versions we care about? Also strictly speaking "=d" needs
to be switched to "=&d".

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 1/5] x86/hvm: Setup TSC scaling ratio

2016-02-29 Thread Boris Ostrovsky


On 02/29/2016 08:41 AM, Jan Beulich wrote:

On 28.02.16 at 13:54,  wrote:

This patch adds a field tsc_scaling_ratio in struct hvm_domain to record
the per-domain TSC scaling ratio, and sets it in tsc_set_info().

Before setting the per-domain TSC scaling ratio, we check its validity
in tsc_set_info(). If an invalid ratio is given, we will leave the
default value in tsc_scaling_ratio (i.e. ratio = 1) and setup guest TSC
as if no TSC scaling is used:
* For TSC_MODE_FAULT,


s/TSC_MODE_FAULT/TSC_MODE_DEFAULT/

-boris


   - if a user-specified TSC frequency is given, we will set the guest
 TSC frequency to it; otherwise, we set it to the host TSC frequency.
   - if guest TSC frequency does not equal to host TSC frequency, we will
 emulate guest TSC (i.e. d->arch.vtsc is set to 1). In both cases,
 guest TSC runs in the guest TSC frequency.
* For TSC_MODE_PVRDTSCP,
   - we set the guest TSC frequency to the host TSC frequency.
   - guest rdtsc is executed natively in the host TSC frequency as
 before.
   - if rdtscp is not available to guest, it will be emulated; otherwise,
 it will be executed natively. In both cases, guest rdtscp gets TSC
 in the host TSC frequency as before.

Signed-off-by: Haozhong Zhang 

Reviewed-by: Jan Beulich 




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v14 1/2] vmx: VT-d posted-interrupt core logic handling

2016-02-29 Thread Dario Faggioli

On Mon, 2016-02-29 at 06:33 -0700, Jan Beulich wrote:
> > > > On 29.02.16 at 04:00,  wrote:
> > This is the core logic handling for VT-d posted-interrupts.
> > Basically it
> > deals with how and when to update posted-interrupts during the
> > following
> > scenarios:
> > - vCPU is preempted
> > - vCPU is slept
> > - vCPU is blocked
> > 
> [...]
> > CC: Keir Fraser 
> > CC: Jan Beulich 
> > CC: Andrew Cooper 
> > CC: Kevin Tian 
> > CC: George Dunlap 
> > CC: Dario Faggioli 
> > Suggested-by: Yang Zhang 
> > Suggested-by: Dario Faggioli 
> > Suggested-by: George Dunlap 
> > Suggested-by: Jan Beulich 
> > Signed-off-by: Feng Wu 
> > Reviewed-by: George Dunlap 
> With the comments George gave on v13 subsequent to this tag
> I'm not sure it was correct to retain it. George?
> 
> Reviewed-by: Jan Beulich 
> albeit in case another version is needed ...
> 
And, if I'm still in time (if not, no big deal :-))

Reviewed-by: Dario Faggioli 

Also, personally, I don't think that all those Suggested-by add much
useful info, and I'd be fine to see mine one removed (in any case, but
even more so, if the Rev-by: above is applied).

But, again, no big deal.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 1/5] x86/hvm: Setup TSC scaling ratio

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 14:49,  wrote:
> On 02/29/2016 08:41 AM, Jan Beulich wrote:
> On 28.02.16 at 13:54,  wrote:
>>> This patch adds a field tsc_scaling_ratio in struct hvm_domain to record
>>> the per-domain TSC scaling ratio, and sets it in tsc_set_info().
>>>
>>> Before setting the per-domain TSC scaling ratio, we check its validity
>>> in tsc_set_info(). If an invalid ratio is given, we will leave the
>>> default value in tsc_scaling_ratio (i.e. ratio = 1) and setup guest TSC
>>> as if no TSC scaling is used:
>>> * For TSC_MODE_FAULT,
> 
> s/TSC_MODE_FAULT/TSC_MODE_DEFAULT/

If that's the only issue, I would offer to fix it up while committing.
Committing, though, requires the ack on the SVM changes to be
given (again).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 1/5] x86/hvm: Setup TSC scaling ratio

2016-02-29 Thread Haozhong Zhang

On 02/29/16 08:49, Boris Ostrovsky wrote:
> On 02/29/2016 08:41 AM, Jan Beulich wrote:
> On 28.02.16 at 13:54,  wrote:
> >>This patch adds a field tsc_scaling_ratio in struct hvm_domain to record
> >>the per-domain TSC scaling ratio, and sets it in tsc_set_info().
> >>
> >>Before setting the per-domain TSC scaling ratio, we check its validity
> >>in tsc_set_info(). If an invalid ratio is given, we will leave the
> >>default value in tsc_scaling_ratio (i.e. ratio = 1) and setup guest TSC
> >>as if no TSC scaling is used:
> >>* For TSC_MODE_FAULT,
> 
> s/TSC_MODE_FAULT/TSC_MODE_DEFAULT/
>

will fix in the next version.

Thanks,
Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 02/24] arm/acpi: Add placeholder for efi and acpi load address

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> We will create EFI table, memory description table and some of acpi
> tables and we're going to map them to kinfo->gnttab_start of Dom0.
> Add placeholder for the starting address for loading in DOM0 and the
> size of new added tables. Also add a placeholder to store the new
> created tables.
> 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 

Acked-by: Stefano Stabellini 


> V4: explain why these need in commit message
> ---
>  xen/include/asm-arm/domain.h | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index aa7f283..8e1161f 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -124,6 +124,11 @@ struct arch_domain
>  } vuart;
>  
>  unsigned int evtchn_irq;
> +#ifdef CONFIG_ACPI
> +void *efi_acpi_table;
> +paddr_t efi_acpi_gpa;
> +paddr_t efi_acpi_len;
> +#endif
>  }  __cacheline_aligned;
>  
>  struct arch_vcpu
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 2/5] x86/hvm: Replace architecture TSC scaling by a common function

2016-02-29 Thread Haozhong Zhang

On 02/29/16 06:44, Jan Beulich wrote:
> >>> On 28.02.16 at 13:54,  wrote:
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -333,6 +333,23 @@ u64 hvm_get_tsc_scaling_ratio(u32 gtsc_khz)
> >  return ratio > max_ratio ? 0 : ratio;
> >  }
> >  
> > +u64 hvm_scale_tsc(const struct domain *d, u64 tsc)
> > +{
> > +u64 ratio = d->arch.hvm_domain.tsc_scaling_ratio;
> > +u64 dummy;
> > +
> > +if ( ratio == hvm_default_tsc_scaling_ratio )
> > +return tsc;
> > +
> > +/* tsc = (tsc * ratio) >> hvm_funcs.tsc_scaling.ratio_frac_bits */
> > +asm ( "mulq %[ratio]; shrdq %[frac],%%rdx,%[tsc]"
> > +  : [tsc] "+a" (tsc), "=d" (dummy)
> 
> Is mixing named and positional asm() operands supported by all
> gcc versions we care about?

I see in Config.mk the oldest version is gcc 4.1. I'll test on that
version.

> Also strictly speaking "=d" needs to be switched to "=&d".
>

I'll fix in the next version.

Thanks,
Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 01/24] arm/acpi: Define a enum for reserved tables

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> It needs to copy and change the contents of some ACPI and EFI tables for
> Dom0. Here define a enum for those tables.
> 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 

Acked-by: Stefano Stabellini 


>  xen/include/asm-arm/acpi.h | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/xen/include/asm-arm/acpi.h b/xen/include/asm-arm/acpi.h
> index a0f2f24..7f59761 100644
> --- a/xen/include/asm-arm/acpi.h
> +++ b/xen/include/asm-arm/acpi.h
> @@ -30,6 +30,18 @@
>  #define COMPILER_DEPENDENT_UINT64  unsigned long long
>  #define ACPI_MAP_MEM_ATTR  PAGE_HYPERVISOR
>  
> +/* Tables marked as reserved in efi table */
> +typedef enum {
> +TBL_FADT,
> +TBL_MADT,
> +TBL_STAO,
> +TBL_XSDT,
> +TBL_RSDP,
> +TBL_EFIT,
> +TBL_MMAP,
> +TBL_MMAX,
> +} EFI_MEM_RES;
> +
>  bool_t __init acpi_psci_present(void);
>  bool_t __init acpi_psci_hvc_present(void);
>  void __init acpi_smp_init_cpus(void);
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 09/24] arm/acpi: Prepare STAO table for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Create STAO table for Dom0. This table is used to tell Dom0 whether it
> should ignore UART defined in SPCR table or the ACPI namespace names.
> 
> Look at below url for details:
> http://wiki.xenproject.org/mediawiki/images/0/02/Status-override-table.pdf
> 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 
>
>  xen/arch/arm/domain_build.c | 41 +
>  1 file changed, 41 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index e373165..53cddb7 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1357,6 +1357,43 @@ static int prepare_dtb(struct domain *d, struct 
> kernel_info *kinfo)
>  }
>  
>  #ifdef CONFIG_ACPI
> +static int acpi_create_stao(struct domain *d, struct membank tbl_add[])
> +{
> +struct acpi_table_header *table = NULL;
> +struct acpi_table_stao *stao = NULL;
> +u32 table_size = sizeof(struct acpi_table_stao);
> +u32 offset = acpi_get_table_offset(tbl_add, TBL_STAO);
> +acpi_status status;
> +u8 *base_ptr, checksum;
> +
> +/* Copy OEM and ASL compiler fields of some table, here use MADT */

"Copy OEM and ASL compiler fields from another table, use MADT" would be
better.

Reviewed-by: Stefano Stabellini 


> +status = acpi_get_table(ACPI_SIG_MADT, 0, &table);
> +
> +if ( ACPI_FAILURE(status) )
> +{
> +const char *msg = acpi_format_exception(status);
> +
> +printk("Failed to get MADT table, %s\n", msg);
> +return -EINVAL;
> +}
> +
> +base_ptr = d->arch.efi_acpi_table + offset;
> +ACPI_MEMCPY(base_ptr, table, sizeof(struct acpi_table_header));
> +
> +stao = (struct acpi_table_stao *)base_ptr;
> +ACPI_MEMCPY(stao->header.signature, ACPI_SIG_STAO, 4);
> +stao->header.revision = 1;
> +stao->header.length = table_size;
> +stao->ignore_uart = 1;
> +checksum = acpi_tb_checksum(ACPI_CAST_PTR(u8, stao), table_size);
> +stao->header.checksum -= checksum;
> +
> +tbl_add[TBL_STAO].start = d->arch.efi_acpi_gpa + offset;
> +tbl_add[TBL_STAO].size = table_size;
> +
> +return 0;
> +}
> +
>  static int acpi_create_madt(struct domain *d, struct membank tbl_add[])
>  {
>  struct acpi_table_header *table = NULL;
> @@ -1512,6 +1549,10 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>  if ( rc != 0 )
>  return rc;
>  
> +rc = acpi_create_stao(d, tbl_add);
> +if ( rc != 0 )
> +return rc;
> +
>  return 0;
>  }
>  #else
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 1/5] x86/hvm: Setup TSC scaling ratio

2016-02-29 Thread Boris Ostrovsky


On 02/29/2016 08:55 AM, Jan Beulich wrote:

On 29.02.16 at 14:49,  wrote:

On 02/29/2016 08:41 AM, Jan Beulich wrote:

On 28.02.16 at 13:54,  wrote:

This patch adds a field tsc_scaling_ratio in struct hvm_domain to record
the per-domain TSC scaling ratio, and sets it in tsc_set_info().

Before setting the per-domain TSC scaling ratio, we check its validity
in tsc_set_info(). If an invalid ratio is given, we will leave the
default value in tsc_scaling_ratio (i.e. ratio = 1) and setup guest TSC
as if no TSC scaling is used:
* For TSC_MODE_FAULT,

s/TSC_MODE_FAULT/TSC_MODE_DEFAULT/

If that's the only issue, I would offer to fix it up while committing.
Committing, though, requires the ack on the SVM changes to be
given (again).



For SVM bits:

Acked-by: Boris Ostrovsky 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 12/24] arm/p2m: Add helper functions to map memory regions

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Parth Dixit 
> 
> Create a helper function for mapping with cached attributes and
> read-only range.
> 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 

This is good, but for clarity please rename the two functions to
map_regions_ro and unmap_regions_ro.


> v4: use p2m_access_r
> ---
>  xen/arch/arm/p2m.c| 26 ++
>  xen/include/asm-arm/p2m.h | 10 ++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index a2a9c4b..c36fdf6 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1218,6 +1218,32 @@ int p2m_populate_ram(struct domain *d,
>   d->arch.p2m.default_access);
>  }
>  
> +int map_regions(struct domain *d,
> + unsigned long start_gfn,
> + unsigned long nr,
> + unsigned long mfn)
> +{
> +return apply_p2m_changes(d, INSERT,
> + pfn_to_paddr(start_gfn),
> + pfn_to_paddr(start_gfn + nr),
> + pfn_to_paddr(mfn),
> + MATTR_MEM, 0, p2m_mmio_direct,
> + p2m_access_r);
> +}
> +
> +int unmap_regions(struct domain *d,
> +   unsigned long start_gfn,
> +   unsigned long nr,
> +   unsigned long mfn)
> +{
> +return apply_p2m_changes(d, REMOVE,
> + pfn_to_paddr(start_gfn),
> + pfn_to_paddr(start_gfn + nr),
> + pfn_to_paddr(mfn),
> + MATTR_MEM, 0, p2m_invalid,
> + p2m_access_r);
> +}
> +
>  int map_mmio_regions(struct domain *d,
>   unsigned long start_gfn,
>   unsigned long nr,
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 433952a..945b613 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -144,6 +144,16 @@ int p2m_cache_flush(struct domain *d, xen_pfn_t 
> start_mfn, xen_pfn_t end_mfn);
>  /* Setup p2m RAM mapping for domain d from start-end. */
>  int p2m_populate_ram(struct domain *d, paddr_t start, paddr_t end);
>  
> +int map_regions(struct domain *d,
> +unsigned long start_gfn,
> +unsigned long nr_mfns,
> +unsigned long mfn);
> +
> +int unmap_regions(struct domain *d,
> +unsigned long start_gfn,
> +unsigned long nr_mfns,
> +unsigned long mfn);
> +
>  int guest_physmap_add_entry(struct domain *d,
>  unsigned long gfn,
>  unsigned long mfn,

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 13/24] arm/acpi: Map all other tables for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Map all other tables to Dom0 using 1:1 mappings.
> 
> Signed-off-by: Shannon Zhao 
> ---
> v4: fix commit message
> ---
>  xen/arch/arm/domain_build.c | 26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 64e48ae..6ad420c 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1357,6 +1357,30 @@ static int prepare_dtb(struct domain *d, struct 
> kernel_info *kinfo)
>  }
>  
>  #ifdef CONFIG_ACPI
> +static void acpi_map_other_tables(struct domain *d)
> +{
> +int i;
> +unsigned long res;
> +u64 addr, size;
> +
> +/* Map all other tables to Dom0 using 1:1 mappings. */
> +for( i = 0; i < acpi_gbl_root_table_list.count; i++ )
> +{
> +addr = acpi_gbl_root_table_list.tables[i].address;
> +size = acpi_gbl_root_table_list.tables[i].length;
> +res = map_regions(d,
> +  paddr_to_pfn(addr & PAGE_MASK),
> +  DIV_ROUND_UP(size, PAGE_SIZE),
> +  paddr_to_pfn(addr & PAGE_MASK));
> +if ( res )
> +{
> + panic(XENLOG_ERR "Unable to map 0x%"PRIx64
> +   " - 0x%"PRIx64" in domain \n",
> +   addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1);
> +}
> +}
> +}

The problem with this function is that it is mapping all other tables to
the guest, including the unmodified FADT and MADT. This way dom0 is
going to find two different versions of the FADT and MADT, isn't that
right?


>  static int acpi_create_rsdp(struct domain *d, struct membank tbl_add[])
>  {
>  
> @@ -1661,6 +1685,8 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>  if ( rc != 0 )
>  return rc;
>  
> +acpi_map_other_tables(d);
> +
>  return 0;
>  }
>  #else
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 2/5] x86/hvm: Replace architecture TSC scaling by a common function

2016-02-29 Thread Boris Ostrovsky


On 02/28/2016 07:54 AM, Haozhong Zhang wrote:

This patch implements a common function hvm_scale_tsc() to scale TSC by
using TSC scaling information collected by architecture code.

Signed-off-by: Haozhong Zhang 
---
CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Boris Ostrovsky 
CC: Suravee Suthikulpanit 
CC: Aravind Gopalakrishnan 
CC: Kevin Tian 


SVM bits:

Acked-by: Boris Ostrovsky 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 0/7] libxl: add support for FreeBSD block hotplug scripts

2016-02-29 Thread George Dunlap

On 29/02/16 12:18, Roger Pau Monné wrote:
> El 29/2/16 a les 13:15, George Dunlap ha escrit:
>> On Thu, Feb 25, 2016 at 7:25 PM, Roger Pau Monne  
>> wrote:
>>> This series enables using hotplug scripts with the FreeBSD blkback
>>> implementation. Since FreeBSD blkback can use both block devices and regular
>>> RAW files as disks, the physical-device xenstore backend node is now
>>> OS-specific, Linux and NetBSD will encode the device major and minor numbers
>>> there, while FreeBSD simply puts an absolute path to a disk image.
>>
>> Just to catch me up here -- is this an incompatible thing  that
>> FreeBSD *already* does, or something you're adding with this series?
> 
> I assume you mean the usage of the "physical-device" node, right? In
> which case, this is something that I'm adding with this series (and some
> FreeBSD kernel changes, of course). Current FreeBSD code doesn't use
> physical-device at all.

So there are cases where libxl could use the actual path to the
resulting device (if available) as well.  Rather than overloading
physical-device, what about making a new node, with a path, that can be
used by all of them?

I submitted such a patch here:

<1439233885-22218-4-git-send-email-george.dun...@eu.citrix.com>

As part of a series allowing HVM domains to use hotplug scripts.

Thoughts?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 14/24] arm/acpi: Prepare EFI system table for Dom0

2016-02-29 Thread Stefano Stabellini

On Mon, 29 Feb 2016, Jan Beulich wrote:
> >>> On 28.02.16 at 12:19,  wrote:
> > --- a/xen/common/efi/boot.c
> > +++ b/xen/common/efi/boot.c
> > @@ -1173,6 +1173,10 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
> > *SystemTable)
> >  }
> >  
> >  #if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
> > +#include "../../../common/decompress.h"
> > +#define XZ_EXTERN STATIC
> > +#include "../../../common/xz/crc32.c"
> > +
> >  /* Constant to indicate "Xen" in unicode u16 format */
> >  static const u16 XEN_EFI_FW_VENDOR[] ={0x0058,0x0065,0x006E,0x};
> >  
> > @@ -1189,6 +1193,46 @@ int __init estimate_efi_size(int mem_nr_banks)
> >  
> >  return size;
> >  }
> > +
> > +void __init acpi_create_efi_system_table(paddr_t paddr, void 
> > *efi_acpi_table,
> > + struct membank tbl_add[])
> > +{
> > +u64 table_addr, table_size, offset = 0;
> > +u8 *base_ptr;
> > +EFI_CONFIGURATION_TABLE *efi_conf_tbl;
> > +EFI_SYSTEM_TABLE *efi_sys_tbl;
> > +
> > +table_addr = paddr + acpi_get_table_offset(tbl_add, TBL_EFIT);
> > +table_size = sizeof(EFI_SYSTEM_TABLE) + sizeof(EFI_CONFIGURATION_TABLE)
> > + + sizeof(XEN_EFI_FW_VENDOR);
> > +base_ptr = efi_acpi_table + acpi_get_table_offset(tbl_add, TBL_EFIT);
> > +efi_sys_tbl = (EFI_SYSTEM_TABLE *)base_ptr;
> > +
> > +efi_sys_tbl->Hdr.Signature = EFI_SYSTEM_TABLE_SIGNATURE;
> > +/* Specify the revision as 2.5 */
> > +efi_sys_tbl->Hdr.Revision = (2 << 16 | 50);
> > +efi_sys_tbl->Hdr.HeaderSize = table_size;
> > +
> > +efi_sys_tbl->FirmwareRevision = 1;
> > +efi_sys_tbl->NumberOfTableEntries = 1;
> > +offset += sizeof(EFI_SYSTEM_TABLE);
> > +memcpy((u16 *)(base_ptr + offset), XEN_EFI_FW_VENDOR,
> > +   sizeof(XEN_EFI_FW_VENDOR));
> > +efi_sys_tbl->FirmwareVendor = (CHAR16 *)(table_addr + offset);
> > +
> > +offset += sizeof(XEN_EFI_FW_VENDOR);
> > +efi_conf_tbl = (EFI_CONFIGURATION_TABLE *)(base_ptr + offset);
> > +efi_conf_tbl->VendorGuid = (EFI_GUID)ACPI_20_TABLE_GUID;
> > +efi_conf_tbl->VendorTable = (VOID *)tbl_add[TBL_RSDP].start;
> > +efi_sys_tbl->ConfigurationTable = (EFI_CONFIGURATION_TABLE 
> > *)(table_addr
> > +  + 
> > offset);
> > +xz_crc32_init();
> > +efi_sys_tbl->Hdr.CRC32 = xz_crc32((uint8_t *)efi_sys_tbl,
> > +  efi_sys_tbl->Hdr.HeaderSize, 0);
> > +
> > +tbl_add[TBL_EFIT].start = table_addr;
> > +tbl_add[TBL_EFIT].size = table_size;
> > +}
> >  #endif
> >  
> >  #ifndef CONFIG_ARM /* TODO - runtime service support */
> 
> While the previous addition here was probably fine on its own, here
> it becomes clear that these additions all belong into arch/arm/efi/.

Right, or maybe into xen/arch/arm/domain_build.c


> Also it doesn't look very nice to me to (ab)use xz's CRC32 code
> here; I don't know who has suggested doing so.
 
It was suggested by Julien.

I agree that including ../../../common/xz/crc32.c seems a bit fragile
but introducing another copy of xz_crc32 seems even worse to me (see
http://marc.info/?l=xen-devel&m=144775375427921&w=2). Maybe you have a
better suggestion?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4] xen: credit1: avoid boosting vCPUs being "just" migrated

2016-02-29 Thread George Dunlap

On 24/02/16 17:42, Dario Faggioli wrote:
> Moving a vCPU to a different pCPU means offlining it and
> then waking it up, on the new pCPU. Credit1 grants BOOST
> priority to vCPUs that wakes up, with the aim of improving
> I/O latency. The net effect of this all is that vCPUs get
> boosted when migrating, which shouldn't happen.
> 
> For instance, this causes scheduling anomalies and,
> potentially, performance problems, as reported here:
>   http://lists.xen.org/archives/html/xen-devel/2015-10/msg02851.html
> 
> This patch fixes this by noting down (by means of a flag)
> the fact that the vCPU is about to undergo a migration.
> This way we can tell, later, during a wakeup, whether the
> vCPU is migrating or unblocking, and decide whether or
> not to apply the boosting.
> 
> Note that it is important that atomic-safe bit operations
> are used when manipulating vCPUs' flags. Take the chance
> and add a comment about this.
> 
> Signed-off-by: Dario Faggioli 

Reviewed-by: George Dunlap 

Thanks, sorry for the delay. :-)

> ---
> Cc: George Dunlap 
> Cc: Jan Beulich 
> ---
> Changes from v3:
>  * use a local boolean variable to track if the vcpu is migrating,
>to make the code clearer and easier to read, as suggested during
>review.
> 
> Changes from v2:
>  * test_and_clear() is necessary when accessing svc->flags;
>  * added a comment about such need at the top, where the flags
>are defined.
> 
> Changes from v1:
>  * rewritten, following suggestion got during review: there
>are no wakeup flags any longer, and all is done in sched_credit.c
>by setting a flag in csched_cpu_pick() and testing (and
>cleating) it in csched_vcpu_wake().
> ---
>  xen/common/sched_credit.c |   28 
>  1 file changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 0af5f69..305889a 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -63,9 +63,14 @@
>  
>  /*
>   * Flags
> + *
> + * Note that svc->flags (where these flags live) is protected by an
> + * inconsistent set of locks. Therefore atomic-safe bit operations must
> + * be used for accessing it.
>   */
>  #define CSCHED_FLAG_VCPU_PARKED0x0  /* VCPU over capped credits */
>  #define CSCHED_FLAG_VCPU_YIELD 0x1  /* VCPU yielding */
> +#define CSCHED_FLAG_VCPU_MIGRATING 0x2  /* VCPU may have moved to a new pcpu 
> */
>  
>  
>  /*
> @@ -786,6 +791,16 @@ _csched_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc, bool_t commit)
>  static int
>  csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>  {
> +struct csched_vcpu *svc = CSCHED_VCPU(vc);
> +
> +/*
> + * We have been called by vcpu_migrate() (in schedule.c), as part
> + * of the process of seeing if vc can be migrated to another pcpu.
> + * We make a note about this in svc->flags so that later, in
> + * csched_vcpu_wake() (still called from vcpu_migrate()) we won't
> + * get boosted, which we don't deserve as we are "only" migrating.
> + */
> +set_bit(CSCHED_FLAG_VCPU_MIGRATING, &svc->flags);
>  return _csched_cpu_pick(ops, vc, 1);
>  }
>  
> @@ -985,6 +1000,7 @@ static void
>  csched_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
>  {
>  struct csched_vcpu * const svc = CSCHED_VCPU(vc);
> +bool_t migrating;
>  
>  BUG_ON( is_idle_vcpu(vc) );
>  
> @@ -1020,11 +1036,15 @@ csched_vcpu_wake(const struct scheduler *ops, struct 
> vcpu *vc)
>   * more CPU resource intensive VCPUs without impacting overall 
>   * system fairness.
>   *
> - * The one exception is for VCPUs of capped domains unpausing
> - * after earning credits they had overspent. We don't boost
> - * those.
> + * There are two cases, when we don't want to boost:
> + *  - VCPUs that are waking up after a migration, rather than
> + *after having block;
> + *  - VCPUs of capped domains unpausing after earning credits
> + *they had overspent.
>   */
> -if ( svc->pri == CSCHED_PRI_TS_UNDER &&
> +migrating = test_and_clear_bit(CSCHED_FLAG_VCPU_MIGRATING, &svc->flags);
> +
> +if ( !migrating && svc->pri == CSCHED_PRI_TS_UNDER &&
>   !test_bit(CSCHED_FLAG_VCPU_PARKED, &svc->flags) )
>  {
>  TRACE_2D(TRC_CSCHED_BOOST_START, vc->domain->domain_id, vc->vcpu_id);
> 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 14/24] arm/acpi: Prepare EFI system table for Dom0

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 15:25,  wrote:
> On Mon, 29 Feb 2016, Jan Beulich wrote:
>> Also it doesn't look very nice to me to (ab)use xz's CRC32 code
>> here; I don't know who has suggested doing so.
>  
> It was suggested by Julien.
> 
> I agree that including ../../../common/xz/crc32.c seems a bit fragile
> but introducing another copy of xz_crc32 seems even worse to me (see
> http://marc.info/?l=xen-devel&m=144775375427921&w=2). Maybe you have a
> better suggestion?

Well, at some point there was talk of ARM not wanting to
ExitBootServices() as early as x86 does. In that case there
would be a CRC32 function among the various boot services
ones.

The other option would be to have a generic crc32() function,
and maybe make xz use it.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 15/24] arm/acpi: Prepare EFI memory descriptor for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Create a few EFI memory descriptors to tell Dom0 the RAM region
> information, ACPI table regions and EFI tables reserved resions.
> 
> Cc: Jan Beulich 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 
> ---
> v4: use a single descriptor for new created tables
> ---
>  xen/arch/arm/domain_build.c |  2 ++
>  xen/common/efi/boot.c   | 38 ++
>  xen/include/asm-arm/setup.h |  5 +
>  3 files changed, 45 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 09f9770..1ec6271 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1688,6 +1688,8 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>  acpi_map_other_tables(d);
>  acpi_create_efi_system_table(d->arch.efi_acpi_gpa, 
> d->arch.efi_acpi_table,
>   tbl_add);
> +acpi_create_efi_mmap_table(d->arch.efi_acpi_gpa, d->arch.efi_acpi_len,
> +   d->arch.efi_acpi_table, &kinfo->mem, tbl_add);
>  
>  return 0;
>  }
> diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
> index 238c5fd..b8d7409 100644
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -1233,6 +1233,44 @@ void __init acpi_create_efi_system_table(paddr_t 
> paddr, void *efi_acpi_table,
>  tbl_add[TBL_EFIT].start = table_addr;
>  tbl_add[TBL_EFIT].size = table_size;
>  }
> +
> +void __init acpi_create_efi_mmap_table(paddr_t paddr, paddr_t size,
> +   void *efi_acpi_table,
> +   const struct meminfo *mem,
> +   struct membank tbl_add[])

This function probably belongs to xen/arch/arm/domain_build.c or
arch/arm/efi


> +{
> +EFI_MEMORY_DESCRIPTOR *memory_map;
> +int i, offset;
> +u8 *base_ptr;
> +
> +tbl_add[TBL_MMAP].start = paddr + acpi_get_table_offset(tbl_add, 
> TBL_MMAP);
> +tbl_add[TBL_MMAP].size = sizeof(EFI_MEMORY_DESCRIPTOR)
> + * (mem->nr_banks + acpi_mem.nr_banks + 1);

I don't like the fact that we are getting the number of memory banks
from a parameter and the number of acpi banks from a global variable.
Can we pass in acpi_mem.nr_banks too?


> +base_ptr = efi_acpi_table + acpi_get_table_offset(tbl_add, TBL_MMAP);
> +memory_map = (EFI_MEMORY_DESCRIPTOR *)(base_ptr);
> +
> +offset = 0;
> +for( i = 0; i < mem->nr_banks; i++, offset++ )
> +{
> +memory_map[offset].Type = EfiConventionalMemory;
> +memory_map[offset].PhysicalStart = mem->bank[i].start;
> +memory_map[offset].NumberOfPages = PFN_UP(mem->bank[i].size);
> +memory_map[offset].Attribute = EFI_MEMORY_WB;
> +}
> +
> +for( i = 0; i < acpi_mem.nr_banks; i++, offset++ )
> +{
> +memory_map[offset].Type = EfiACPIReclaimMemory;
> +memory_map[offset].PhysicalStart = acpi_mem.bank[i].start;
> +memory_map[offset].NumberOfPages = PFN_UP(acpi_mem.bank[i].size);
> +memory_map[offset].Attribute = EFI_MEMORY_WB;
> +}
> +
> +memory_map[offset].Type = EfiACPIReclaimMemory;
> +memory_map[offset].PhysicalStart = paddr;
> +memory_map[offset].NumberOfPages = PFN_UP(size);
> +memory_map[offset].Attribute = EFI_MEMORY_WB;
> +}
>  #endif
>  
>  #ifndef CONFIG_ARM /* TODO - runtime service support */
> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
> index 2d65796..af5a038 100644
> --- a/xen/include/asm-arm/setup.h
> +++ b/xen/include/asm-arm/setup.h
> @@ -56,6 +56,11 @@ int estimate_efi_size(int mem_nr_banks);
>  void acpi_create_efi_system_table(paddr_t paddr, void *efi_acpi_table,
>struct membank tbl_add[]);
>  
> +void acpi_create_efi_mmap_table(paddr_t paddr, paddr_t size,
> +void *efi_acpi_table,
> +const struct meminfo *mem,
> +struct membank tbl_add[]);
> +
>  int construct_dom0(struct domain *d);
>  
>  void discard_initial_modules(void);

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 14/24] arm/acpi: Prepare EFI system table for Dom0

2016-02-29 Thread Stefano Stabellini

On Mon, 29 Feb 2016, Jan Beulich wrote:
> >>> On 29.02.16 at 15:25,  wrote:
> > On Mon, 29 Feb 2016, Jan Beulich wrote:
> >> Also it doesn't look very nice to me to (ab)use xz's CRC32 code
> >> here; I don't know who has suggested doing so.
> >  
> > It was suggested by Julien.
> > 
> > I agree that including ../../../common/xz/crc32.c seems a bit fragile
> > but introducing another copy of xz_crc32 seems even worse to me (see
> > http://marc.info/?l=xen-devel&m=144775375427921&w=2). Maybe you have a
> > better suggestion?
> 
> Well, at some point there was talk of ARM not wanting to
> ExitBootServices() as early as x86 does. In that case there
> would be a CRC32 function among the various boot services
> ones.
> 
> The other option would be to have a generic crc32() function,
> and maybe make xz use it.

Either way seems good to me.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 4/4] arm64: update the introduction of xen boot commands in docs/grub.texi

2016-02-29 Thread Konrad Rzeszutek Wilk

On Sun, Feb 28, 2016 at 08:10:33AM +0300, Andrei Borzenkov wrote:
> 27.02.2016 23:33, Konrad Rzeszutek Wilk пишет:
> > On Fri, Feb 26, 2016 at 07:15:52PM +0800, Fu Wei wrote:
> >> Hi Andrei,
> >>
> >> On 26 February 2016 at 18:50, Andrei Borzenkov  wrote:
> >>> On Fri, Feb 26, 2016 at 8:59 AM, Fu Wei  wrote:
>  +@subsection xen_module
> 
>  -@deffn Command xen_linux file [arguments]
>  -Load a dom0 kernel image for xen hypervisor at the booting process 
>  of xen.
>  +@deffn Command xen_module [--nounzip] file [arguments]
>  +Load a module for xen hypervisor at the booting process of xen.
>  The rest of the line is passed verbatim as the module command line.
>  +Each module will be identified by the order in which the modules 
>  are added.
>  +The 1st module: dom0 kernel image
>  +The 2nd module: dom0 ramdisk
>  +All subsequent modules: UNKNOW
>  @end deffn
> >>>
> >>> Hmm ... from previous discussion I gathered that Xen can detect module
> >>> type. What if there is no initrd for dom0? How can subsequent modules 
> >>> be
> >>
> >> Now , Xen detect module type by the order. (at least on ARM64).
> >> I think i386 is using Multiboot(2) protocol, so maybe this order is
> >> nothing to do with i386.
> >>
> >
> > Then we have obvious problem with your XSM patch 
> > (http://savannah.gnu.org/bugs/?43420) - XSM may land as the first 
> > module. That's actually something to solve on Xen side I think. It's 
> > just that so far we had just kernel and initrd, so that was non issue.
> 
>  Oh, did you mean Wei Liu's patch?
> 
>  I guess XSM may land as the third module (or the module after linux
>  kernel, if you don't have initrd)
> 
>  Yes, agree. (That's actually something to solve on Xen side)
> 
>  I guess xen can get xsm from a special initrd. so for now there is not
>  big problem on xsm.
> 
>  Please correct me if I misunderstand something. :-)
> 
>  Thanks!
> 
>  Back to this patch, is that OK for you, or any suggestion?  Thanks !
> 
> >>>
> >>> Yes, as this is dedicated Xen loader we should document this mandatory
> >>> order - first module must be kernel image, second module must be
> >>> initrd. I do not think we need to mention possibility to load more
> >>> than two modules until there is clear understanding how it can be done
> >>> without initrd.
> >>
> >> Great thanks for your review, I have updated and sent the v3 patchset,
> >> Hope I understood your suggestion correctly, Please check.  :-)
> > 
> > What if the initrd is catted to the kernel image (which you can
> > do on x86)? And then the 1st module is your XSM?
> > 
> 
> On x86 Xen can detect microcode and xsm modules; the first unknown

'can'. If you use 'ucode=scan' on the Xen command line. Otherwise it has
no clue.

It actually would be just much nicer if Multiboot2 had some tag describing
this and Xen (or any other OS) could just parse it like that. But that is
just hand-waving, nice-to-do.


> module after that is assumed to be initrd (dom0 kernel always must be
> the very first module provided).

That is true.
> 
> On arm there is no detection - module type is taken from FDT; if no
> module type is provided, the first unknown module is assumed to be
> kernel, the second - initrd.
> 
> See also http://lists.gnu.org/archive/html/grub-devel/2016-02/msg00333.html
> 
> > Is this .. order dependency written somewhere in a document? In the
> > Xen code-base that is?
> > 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 16/24] arm/acpi: Map the new created EFI and ACPI tables to Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Map the UEFI and ACPI tables which we created to non-RAM space in Dom0.
> 
> Signed-off-by: Shannon Zhao 

Reviewed-by: Stefano Stabellini 


>  xen/arch/arm/domain_build.c | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 1ec6271..083ddd5 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1691,6 +1691,21 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>  acpi_create_efi_mmap_table(d->arch.efi_acpi_gpa, d->arch.efi_acpi_len,
> d->arch.efi_acpi_table, &kinfo->mem, tbl_add);
>  
> +/* Map the EFI and ACPI tables to Dom0 */
> +rc = map_regions(d,
> + paddr_to_pfn(d->arch.efi_acpi_gpa),
> + PFN_UP(d->arch.efi_acpi_len),
> + paddr_to_pfn(virt_to_maddr(d->arch.efi_acpi_table)));
> +if ( rc != 0 )
> +{
> +printk(XENLOG_ERR "Unable to map 0x%"PRIx64
> +   " - 0x%"PRIx64" in domain %d\n",
> +   d->arch.efi_acpi_gpa & PAGE_MASK,
> +   PAGE_ALIGN(d->arch.efi_acpi_gpa + d->arch.efi_acpi_len) - 1,
> +   d->domain_id);
> +return rc;
> +}
> +
>  return 0;
>  }
>  #else
> -- 
> 2.0.4
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 03/24] arm/acpi: Read acpi memory info from uefi

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 12:18,  wrote:
> From: Parth Dixit 
> 
> ACPI memory is seperate from conventional memory and should be marked
> as reserved while passing to DOM0. Create a new meminfo structure to
> store all the acpi tables listed in uefi.
> 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 
> Acked-by: Stefano Stabellini 

With this I'll commit as is, but I'd like to note that ...

> @@ -129,6 +132,9 @@ static EFI_STATUS __init 
> efi_process_memory_map_bootinfo(EFI_MEMORY_DESCRIPTOR *
>  {
>  int Index;
>  int i = 0;
> +#ifdef CONFIG_ACPI
> +int j = 0;
> +#endif
>  EFI_MEMORY_DESCRIPTOR *desc_ptr = map;
>  
>  for ( Index = 0; Index < (mmap_size / desc_size); Index++ )
> @@ -148,10 +154,27 @@ static EFI_STATUS __init 
> efi_process_memory_map_bootinfo(EFI_MEMORY_DESCRIPTOR *
>  bootinfo.mem.bank[i].size = desc_ptr->NumberOfPages * 
> EFI_PAGE_SIZE;
>  ++i;
>  }
> +#if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
> +else if ( desc_ptr->Type == EfiACPIReclaimMemory )
> +{
> +if ( j >= NR_MEM_BANKS )
> +{
> +PrintStr(L"Error: All " __stringify(NR_MEM_BANKS)
> +  " acpi meminfo mem banks exhausted.\r\n");
> +return EFI_LOAD_ERROR;
> +}
> +acpi_mem.bank[j].start = desc_ptr->PhysicalStart;
> +acpi_mem.bank[j].size  = desc_ptr->NumberOfPages * EFI_PAGE_SIZE;
> +++j;
> +}
> +#endif
>  desc_ptr = NextMemoryDescriptor(desc_ptr, desc_size);
>  }
>  
>  bootinfo.mem.nr_banks = i;
> +#if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
> +acpi_mem.nr_banks = j;
> +#endif
>  return EFI_SUCCESS;
>  }

... the three #ifdef-s here aren't consistent.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Konrad Rzeszutek Wilk

On Mon, Feb 29, 2016 at 09:13:41AM +, Paul Durrant wrote:
> > -Original Message-
> > From: Bob Liu [mailto:bob@oracle.com]
> > Sent: 29 February 2016 03:37
> > To: xen-devel@lists.xen.org
> > Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne; jgr...@suse.com;
> > Paul Durrant; konrad.w...@oracle.com; Bob Liu
> > Subject: [RFC PATCH] xen-block: introduces extra request to pass-through
> > SCSI commands
> > 
> > 1) What is this patch about?
> > This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> > A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> > extra request which is used to pass through SCSI commands.
> > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > It can be extended easily to transmit other per-request/bio data from
> > frontend
> > to backend e.g Data Integrity Field per bio.
> > 
> > 2) Why we need this?
> > Currently only raw data segments are transmitted from blkfront to blkback,
> > which
> > means some advanced features are lost.
> >  * Guest knows nothing about features of the real backend storage.
> > For example, on bare-metal environment INQUIRY SCSI command
> > can be used
> > to query storage device information. If it's a SSD or flash device we
> > can have the option to use the device as a fast cache.
> > But this can't happen in current domU guests, because blkfront only
> > knows it's just a normal virtual disk
> > 
> 
> That's the sort of information that should be advertised via xenstore then. 
> There already feature flags for specific things but if some form of 
> throughput/latency information is meaningful to a frontend stack then perhaps 
> that can be advertised too.

Certainly could be put on the XenStore. Do you envision this being done
pre guest creation (so toolstack does it), or the backend finds this
and populates the XenStore keys?

Or that the frontend writes an XenStore key 'scsi-inq=vpd80' and the backend
responds by populating an 'scsi-inq-vpd80=' '? If so can
the XenStore accept binary payloads? Can it be more than 4K?


> 
> >  * Failover Clusters in Windows
> > Failover clusters require SCSI-3 persistent reservation target disks,
> > but now this can't work in domU.
> > 
> 
> That's true but allowing arbitrary SCSI messages through is not the way 
> forward IMO. Just because Windows thinks every HBA is SCSI doesn't mean other 
> OS do so I think reservation/release should have dedicated messages in the 
> blkif protocol if it's desirable to support clustering in the frontend.

Could you expand a bit on the 'dedicated message' you have in mind please?

> 
> > 3) Known issues:
> >  * Security issues, how to 'validate' this extra request payload.
> >E.g SCSI operates on LUN bases (the whole disk) while we really just want
> > to
> >operate on partitions
> > 
> >  * Can't pass SCSI commands through if the backend storage driver is bio-
> > based
> >instead of request-based.
> > 
> > 4) Alternative approach: Using PVSCSI instead:
> >  * Doubt PVSCSI can support as many type of backend storage devices as
> > Xen-block.
> > 
> 
> LIO can interface to any block device in much the same way blkback does IIRC.

But it can't do multipath or LVMs - which is an integral component.

Anyhow that is more of a implementation specific quirk.
> 
> >  * Much longer path:
> >ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> PVSCSI-
> > backend -> Target framework(LIO?) ->
> > 
> >With xen-block we only need:
> >ioctl() -> blkfront -> blkback ->
> > 
> 
> ...and what happens if the block device that blkback is talking to is a SCSI 
> LUN?
> 
> That latter path is also not true for Windows. You've got all the SCSI 
> translation logic in the frontend when using blkif so that first path would 
> collapse to:
> 
> Disk driver -> (SCSI) HBA Driver -> xen-scsiback -> LIO -> backstore -> XXX

I don't know if it matters on the length of the path for say SCSI INQ. It isn't 
like
that is performance specific. Neither are the clustering SCSI commands.

> 
> >  * xen-block has been existed for many years, widely used and more stable.
> > 
> 
> It's definitely widely used, but it has had stability issues in recent times.

Oh? Could you send the bug-reports to me and Roger, CC xen-devel and LKML 
please ?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 17/24] arm/acpi: Create min DT stub for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Create a DT for Dom0 for ACPI-case only. DT contains minimal required
> informations such as Dom0 bootargs, initrd, efi description table and
> address of uefi memory table.
> 
> Also port the document of this device tree bindings from Linux.
> 
> Cc: Jan Beulich 
> Signed-off-by: Naresh Bhat 
> Signed-off-by: Parth Dixit 
> Signed-off-by: Shannon Zhao 
> ---
> v4: port the document of this device tree bindings from Linux
> ---
>  docs/misc/arm/device-tree/xen.txt |  58 
>  xen/arch/arm/domain_build.c   | 143 
> ++
>  xen/common/efi/boot.c |  47 +
>  xen/include/asm-arm/setup.h   |   2 +
>  4 files changed, 250 insertions(+)
>  create mode 100644 docs/misc/arm/device-tree/xen.txt
> 
> diff --git a/docs/misc/arm/device-tree/xen.txt 
> b/docs/misc/arm/device-tree/xen.txt
> new file mode 100644
> index 000..6f83f76
> --- /dev/null
> +++ b/docs/misc/arm/device-tree/xen.txt

Calling this file xen.txt inside the Xen repository is not very
meaningful, maybe we should rename it to uefi.txt

Otherwise you can add this info to the existing booting.txt


> @@ -0,0 +1,58 @@
> +* Xen hypervisor device tree bindings
> +
> +Xen ARM virtual platforms shall have a top-level "hypervisor" node with
> +the following properties:
> +
> +- compatible:
> + compatible = "xen,xen-", "xen,xen";
> +  where  is the version of the Xen ABI of the platform.
> +
> +- reg: specifies the base physical address and size of a region in
> +  memory where the grant table should be mapped to, using an
> +  HYPERVISOR_memory_op hypercall. The memory region is large enough to map
> +  the whole grant table (it is larger or equal to gnttab_max_grant_frames()).
> +
> +- interrupts: the interrupt used by Xen to inject event notifications.
> +  A GIC node is also required.
> +
> +To support UEFI on Xen ARM virtual platforms, Xen populates the FDT "uefi" 
> node
> +under /hypervisor with following parameters:
> +
> +
> +Name  | Size   | Description
> +
> +xen,uefi-system-table | 64-bit | Guest physical address of the UEFI 
> System
> +   || Table.
> +
> +xen,uefi-mmap-start   | 64-bit | Guest physical address of the UEFI 
> memory
> +   || map.
> +
> +xen,uefi-mmap-size| 32-bit | Size in bytes of the UEFI memory map
> +  || pointed to in previous entry.
> +
> +xen,uefi-mmap-desc-size   | 32-bit | Size in bytes of each entry in the UEFI
> +  || memory map.
> +
> +xen,uefi-mmap-desc-ver| 32-bit | Version of the mmap descriptor format.
> +
> +
> +Example (assuming #address-cells = <2> and #size-cells = <2>):
> +
> +hypervisor {
> + compatible = "xen,xen-4.3", "xen,xen";
> + reg = <0 0xb000 0 0x2>;
> + interrupts = <1 15 0xf08>;
> + uefi {
> + xen,uefi-system-table = <0x>;
> + xen,uefi-mmap-start = <0x>;
> + xen,uefi-mmap-size = <0x>;
> + xen,uefi-mmap-desc-size = <0x>;
> + xen,uefi-mmap-desc-ver = <0x>;
> +};
> +};
> +
> +The format and meaning of the "xen,uefi-*" parameters are similar to those in
> +Documentation/arm/uefi.txt, which are provided by the regular UEFI stub. 
> However
 ^ in Linux ^ Linux


> +they differ because they are provided by the Xen hypervisor, together with a 
> set
> +of UEFI runtime services implemented via hypercalls, see
> +http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,platform.h.html.

I think that here is fine to use a simple path within the Xen repo:

xen/include/public/platform.h


> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 083ddd5..4b1f387 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1357,6 +1357,145 @@ static int prepare_dtb(struct domain *d, struct 
> kernel_info *kinfo)
>  }
>  
>  #ifdef CONFIG_ACPI
> +#define ACPI_DOM0_FDT_MIN_SIZE 4096
> +
> +static int make_chosen_node(const struct kernel_info *kinfo,
> +struct membank tbl_add[])
> +{
> +int res;
> +const char *bootargs = NULL;
> +const struct bootmodule *mod = kinfo->kernel_bootmodul

Re: [Xen-devel] identify a Xen PV domU to fix devmem_is_allowed

2016-02-29 Thread Konrad Rzeszutek Wilk

On Mon, Feb 29, 2016 at 11:28:49AM +0100, Olaf Hering wrote:
> What is the correct way to identify a Xen PV domU in the kenrel?
> devmem_is_allowed() used to disable access to pages < 256 in domU.
> With pvops this check was removed, or never ported forward.

CC-ing Boris and Daniel.

Why is this needed? The first 640KB of memory in a guest are
RAM pages with no BIOS data in it.

> 
> Would this change be the correct fix?

.. A fix for what issue?
> 
> +++ b/arch/x86/mm/init.c
> @@ -637,7 +637,7 @@ void __init init_mem_mapping(void)
>  int devmem_is_allowed(unsigned long pagenr)
>  {
> if (pagenr < 256)
> -   return 1;
> +   return !xen_pv_domain();
> if (iomem_is_exclusive(pagenr << PAGE_SHIFT))
> return 0;
> if (!page_is_ram(pagenr))
> 
> Olaf
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Konrad Rzeszutek Wilk

On Mon, Feb 29, 2016 at 09:12:30AM +0100, Juergen Gross wrote:
> On 29/02/16 04:37, Bob Liu wrote:
> > 1) What is this patch about?
> > This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> > A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> > extra request which is used to pass through SCSI commands.
> > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > It can be extended easily to transmit other per-request/bio data from 
> > frontend
> > to backend e.g Data Integrity Field per bio.
> > 
> > 2) Why we need this?
> > Currently only raw data segments are transmitted from blkfront to blkback, 
> > which
> > means some advanced features are lost.
> >  * Guest knows nothing about features of the real backend storage.
> > For example, on bare-metal environment INQUIRY SCSI command can be used
> > to query storage device information. If it's a SSD or flash device we
> > can have the option to use the device as a fast cache.
> > But this can't happen in current domU guests, because blkfront only
> > knows it's just a normal virtual disk
> > 
> >  * Failover Clusters in Windows
> > Failover clusters require SCSI-3 persistent reservation target disks,
> > but now this can't work in domU.
> > 
> > 3) Known issues:
> >  * Security issues, how to 'validate' this extra request payload.
> >E.g SCSI operates on LUN bases (the whole disk) while we really just 
> > want to
> >operate on partitions
> 
> It's not only validation: some operations just affect the whole LUN
> (e.g. Reserve/Release). And what about "multi-LUN" commands like
> "report LUNs"?

Don't expose them. Bob and I want to get an idea of what would be a good
compromise to allow some SCSI specific (or perhaps ATA specific or DIF/DIX?) 
type of
commands go through the PV driver.

Would it be better if it was through XenBus? But that may not work for some
that are tied closely to requests, such as DIF/DIX.

However the 'DISCARD' for example worked out - it is an umbrella for both
SCSI UNMAP and ATA DISCARD operation and hides the complexity of the low level
protocol. Could there be an 'INQ' ? Since the SCSI VPD is the most exhaustive
in terms of details it may make sense to base it on that..?

> 
> >  * Can't pass SCSI commands through if the backend storage driver is 
> > bio-based
> >instead of request-based.
> > 
> > 4) Alternative approach: Using PVSCSI instead:
> >  * Doubt PVSCSI can support as many type of backend storage devices as 
> > Xen-block.
> 
> pvSCSI won't need to support all types of backends. It's enough to
> support those where passing through SCSI commands makes sense.
> 
> Seems to be a similar issue as the above mentioned problem with
> bio-based backend storage drivers.

In particular the ones we care about are:
 - Multipath over FibreChannel devices.
 - Linear mapping (LVM) over the multipath.
 - And then potentially an filesystem on top of that
 - .. and a raw file on the filesystem.

Having SCSI VPD 0x83 page sent to frontend for that would be good.

Not sure about SCSI reservations. Perhaps those are more of .. unique
in that the implementation would have to make sure that the guest
owns the whole LUN. But that is implementation question.

This is about the design - how would you envision to to cram in 
SCSI commands or DIF/DIX commands or ATA commands via PV block layer?

> 
> >  * Much longer path:
> >ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
> > PVSCSI-backend -> Target framework(LIO?) ->
> > 
> >With xen-block we only need:
> >ioctl() -> blkfront -> blkback ->
> 
> I'd like to see performance numbers before making a decision.

For SCSI INQ? 

Or are you talking about raw READ/WRITE?
> 
> >  * xen-block has been existed for many years, widely used and more stable.
> 
> Adding another SCSI passthrough capability wasn't accepted for pvSCSI
> (that's the reason I used the Target Framework). Why do you think it
> will be accepted for pvblk?
> 
> This is not my personal opinion, just a heads up from someone who had a
> try already. ;-)

Right. So SCSI passthrough out. What about mediated access for
specific SCSI, or specific ATA, or DIF/DIX ones? And how would you do it
knowing the SCSI maintainers much better than we do?


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 18/24] arm/acpi: Permit access all Xen unused SPIs for Dom0

2016-02-29 Thread Stefano Stabellini

On Sun, 28 Feb 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Permit access all Xen unused SPIs for Dom0 except the interrupts that
> Xen uses. Then when Dom0 configures the interrupt, it could set the
> interrupt type and route it to Dom0.
> 
> Signed-off-by: Shannon Zhao 

Reviewed-by: Stefano Stabellini 


> v4: only permmit access all Xen unused SPIs for Dom0 and don't set type
> ---
>  xen/arch/arm/domain_build.c | 31 +++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 4b1f387..4aaffae 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1359,6 +1359,33 @@ static int prepare_dtb(struct domain *d, struct 
> kernel_info *kinfo)
>  #ifdef CONFIG_ACPI
>  #define ACPI_DOM0_FDT_MIN_SIZE 4096
>  
> +static int acpi_permit_spi_access(struct domain *d)
> +{
> +int i, res;
> +struct irq_desc *desc;
> +
> +/* Here just permit Dom0 to access the SPIs which Xen doesn't use. Then 
> when
> + * Dom0 configures the interrupt, set the interrupt type and route it to
> + * Dom0.
> + */
> +for( i = NR_LOCAL_IRQS; i < vgic_num_irqs(d); i++ )
> +{
> +desc = irq_to_desc(i);
> +if( desc->action != NULL)
> +continue;
> +
> +res = irq_permit_access(d, i);
> +if ( res )
> +{
> +printk(XENLOG_ERR "Unable to permit to dom%u access to IRQ %u\n",
> +   d->domain_id, i);
> +return res;
> +}
> +}
> +
> +return 0;
> +}
> +
>  static int make_chosen_node(const struct kernel_info *kinfo,
>  struct membank tbl_add[])
>  {
> @@ -1849,6 +1876,10 @@ static int prepare_acpi(struct domain *d, struct 
> kernel_info *kinfo)
>  if ( rc != 0 )
>  return rc;
>  
> +rc = acpi_permit_spi_access(d);
> +if ( rc != 0 )
> +return rc;
> +
>  return 0;
>  }
>  #else
> -- 
> 2.0.4
> 
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 03/24] arm/acpi: Read acpi memory info from uefi

2016-02-29 Thread Stefano Stabellini

On Mon, 29 Feb 2016, Jan Beulich wrote:
> >>> On 28.02.16 at 12:18,  wrote:
> > From: Parth Dixit 
> > 
> > ACPI memory is seperate from conventional memory and should be marked
> > as reserved while passing to DOM0. Create a new meminfo structure to
> > store all the acpi tables listed in uefi.
> > 
> > Signed-off-by: Parth Dixit 
> > Signed-off-by: Shannon Zhao 
> > Acked-by: Stefano Stabellini 
> 
> With this I'll commit as is, but I'd like to note that ...
> 
> > @@ -129,6 +132,9 @@ static EFI_STATUS __init 
> > efi_process_memory_map_bootinfo(EFI_MEMORY_DESCRIPTOR *
> >  {
> >  int Index;
> >  int i = 0;
> > +#ifdef CONFIG_ACPI
> > +int j = 0;
> > +#endif
> >  EFI_MEMORY_DESCRIPTOR *desc_ptr = map;
> >  
> >  for ( Index = 0; Index < (mmap_size / desc_size); Index++ )
> > @@ -148,10 +154,27 @@ static EFI_STATUS __init 
> > efi_process_memory_map_bootinfo(EFI_MEMORY_DESCRIPTOR *
> >  bootinfo.mem.bank[i].size = desc_ptr->NumberOfPages * 
> > EFI_PAGE_SIZE;
> >  ++i;
> >  }
> > +#if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
> > +else if ( desc_ptr->Type == EfiACPIReclaimMemory )
> > +{
> > +if ( j >= NR_MEM_BANKS )
> > +{
> > +PrintStr(L"Error: All " __stringify(NR_MEM_BANKS)
> > +  " acpi meminfo mem banks exhausted.\r\n");
> > +return EFI_LOAD_ERROR;
> > +}
> > +acpi_mem.bank[j].start = desc_ptr->PhysicalStart;
> > +acpi_mem.bank[j].size  = desc_ptr->NumberOfPages * 
> > EFI_PAGE_SIZE;
> > +++j;
> > +}
> > +#endif
> >  desc_ptr = NextMemoryDescriptor(desc_ptr, desc_size);
> >  }
> >  
> >  bootinfo.mem.nr_banks = i;
> > +#if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
> > +acpi_mem.nr_banks = j;
> > +#endif
> >  return EFI_SUCCESS;
> >  }
> 
> ... the three #ifdef-s here aren't consistent.
 
Well spotted, the CONFIG_ARM is redundant, could you please remove it
while committing?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 01/24] arm/acpi: Define a enum for reserved tables

2016-02-29 Thread Jan Beulich

>>> On 28.02.16 at 12:18,  wrote:
> --- a/xen/include/asm-arm/acpi.h
> +++ b/xen/include/asm-arm/acpi.h
> @@ -30,6 +30,18 @@
>  #define COMPILER_DEPENDENT_UINT64  unsigned long long
>  #define ACPI_MAP_MEM_ATTR  PAGE_HYPERVISOR
>  
> +/* Tables marked as reserved in efi table */
> +typedef enum {
> +TBL_FADT,
> +TBL_MADT,
> +TBL_STAO,
> +TBL_XSDT,
> +TBL_RSDP,
> +TBL_EFIT,
> +TBL_MMAP,
> +TBL_MMAX,
> +} EFI_MEM_RES;
> +
>  bool_t __init acpi_psci_present(void);
>  bool_t __init acpi_psci_hvc_present(void);
>  void __init acpi_smp_init_cpus(void);

This patch apparently expected the other series to be fully applied
first, which afaics hasn't been stated anywhere. I've fixed up the
context, but please avoid the committer needing to do such by
explicitly stating interdependencies between individually submitted
series, or by keeping the later series RFC (making clear it is not
expected to go in yet).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 03/24] arm/acpi: Read acpi memory info from uefi

2016-02-29 Thread Jan Beulich

>>> On 29.02.16 at 16:07,  wrote:
> On Mon, 29 Feb 2016, Jan Beulich wrote:
>> >>> On 28.02.16 at 12:18,  wrote:
>> > From: Parth Dixit 
>> > 
>> > ACPI memory is seperate from conventional memory and should be marked
>> > as reserved while passing to DOM0. Create a new meminfo structure to
>> > store all the acpi tables listed in uefi.
>> > 
>> > Signed-off-by: Parth Dixit 
>> > Signed-off-by: Shannon Zhao 
>> > Acked-by: Stefano Stabellini 
>> 
>> With this I'll commit as is, but I'd like to note that ...
>> 
>> > @@ -129,6 +132,9 @@ static EFI_STATUS __init 
>> > efi_process_memory_map_bootinfo(EFI_MEMORY_DESCRIPTOR *
>> >  {
>> >  int Index;
>> >  int i = 0;
>> > +#ifdef CONFIG_ACPI
>> > +int j = 0;
>> > +#endif
>> >  EFI_MEMORY_DESCRIPTOR *desc_ptr = map;
>> >  
>> >  for ( Index = 0; Index < (mmap_size / desc_size); Index++ )
>> > @@ -148,10 +154,27 @@ static EFI_STATUS __init 
>> > efi_process_memory_map_bootinfo(EFI_MEMORY_DESCRIPTOR *
>> >  bootinfo.mem.bank[i].size = desc_ptr->NumberOfPages * 
>> > EFI_PAGE_SIZE;
>> >  ++i;
>> >  }
>> > +#if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
>> > +else if ( desc_ptr->Type == EfiACPIReclaimMemory )
>> > +{
>> > +if ( j >= NR_MEM_BANKS )
>> > +{
>> > +PrintStr(L"Error: All " __stringify(NR_MEM_BANKS)
>> > +  " acpi meminfo mem banks exhausted.\r\n");
>> > +return EFI_LOAD_ERROR;
>> > +}
>> > +acpi_mem.bank[j].start = desc_ptr->PhysicalStart;
>> > +acpi_mem.bank[j].size  = desc_ptr->NumberOfPages * 
>> > EFI_PAGE_SIZE;
>> > +++j;
>> > +}
>> > +#endif
>> >  desc_ptr = NextMemoryDescriptor(desc_ptr, desc_size);
>> >  }
>> >  
>> >  bootinfo.mem.nr_banks = i;
>> > +#if defined (CONFIG_ACPI) && defined (CONFIG_ARM)
>> > +acpi_mem.nr_banks = j;
>> > +#endif
>> >  return EFI_SUCCESS;
>> >  }
>> 
>> ... the three #ifdef-s here aren't consistent.
>  
> Well spotted, the CONFIG_ARM is redundant, could you please remove it
> while committing?

Too late, will require a fixup patch now.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [linux-mingo-tip-master test] 84467: regressions - FAIL

2016-02-29 Thread osstest service owner

flight 84467 linux-mingo-tip-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/84467/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-rumpuserxen6 xen-build fail REGR. vs. 60684
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 60684
 test-amd64-amd64-xl-multivcpu 17 guest-localmigrate/x10   fail REGR. vs. 60684
 test-amd64-amd64-libvirt 15 guest-saverestore.2   fail REGR. vs. 60684
 test-amd64-amd64-xl-xsm  15 guest-localmigratefail REGR. vs. 60684
 test-amd64-amd64-xl  15 guest-localmigratefail REGR. vs. 60684
 test-amd64-amd64-libvirt-xsm 15 guest-saverestore.2   fail REGR. vs. 60684
 test-amd64-amd64-pair 22 guest-migrate/dst_host/src_host fail in 84335 REGR. 
vs. 60684
 test-amd64-amd64-xl-credit2 15 guest-localmigrate fail in 84335 REGR. vs. 60684

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-multivcpu 15 guest-localmigrate fail in 84335 pass in 84467
 test-amd64-amd64-xl-rtds 14 guest-saverestore  fail in 84335 pass in 84467
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail 
in 84335 pass in 84467
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail in 
84335 pass in 84467
 test-amd64-amd64-pair21 guest-migrate/src_host/dst_host fail pass in 84335
 test-amd64-i386-qemut-rhel6hvm-amd 11 guest-start/redhat.repeat fail pass in 
84335
 test-amd64-amd64-xl-credit2  14 guest-saverestore   fail pass in 84335

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 17 guest-localmigrate/x10fail REGR. vs. 60684
 test-amd64-i386-libvirt-xsm  15 guest-saverestore.2  fail blocked in 60684
 test-amd64-i386-libvirt  15 guest-saverestore.2  fail blocked in 60684
 test-amd64-i386-xl   15 guest-localmigrate   fail blocked in 60684
 test-amd64-i386-xl-xsm   15 guest-localmigrate   fail blocked in 60684
 test-amd64-amd64-libvirt-pair 22 guest-migrate/dst_host/src_host fail blocked 
in 60684
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop   fail blocked in 60684
 test-amd64-i386-pair  22 guest-migrate/dst_host/src_host fail blocked in 60684
 test-amd64-i386-libvirt-pair 22 guest-migrate/dst_host/src_host fail blocked 
in 60684
 test-amd64-amd64-qemuu-nested-intel 9 debian-hvm-install fail in 84335 
baseline untested
 test-amd64-i386-libvirt-pair 21 guest-migrate/src_host/dst_host fail in 84335 
like 60684
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 60684
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 60684

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestorefail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass

version targeted for testing:
 linux47fd654e79dbd2ff224c45dcbeace36b4b92a887
baseline version:
 linux69f75ebe3b1d1e636c4ce0a0ee248edacc69cbe0

Last test of basis60684  2015-08-13 04:21:46 Z  200 days
Failing since 60712  2015-08-15 18:33:48 Z  197 days  143 attempts
Testing same since84335  2016-02-28 05:54:49 Z1 days2 attempts

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl

Re: [Xen-devel] identify a Xen PV domU to fix devmem_is_allowed

2016-02-29 Thread Olaf Hering

On Mon, Feb 29, Konrad Rzeszutek Wilk wrote:

> On Mon, Feb 29, 2016 at 11:28:49AM +0100, Olaf Hering wrote:
> > Would this change be the correct fix?
> .. A fix for what issue?

mmap returns some pointer, but appearently that memory can not be used.

https://bugzilla.suse.com/show_bug.cgi?id=964342

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Juergen Gross

On 29/02/16 16:05, Konrad Rzeszutek Wilk wrote:
> On Mon, Feb 29, 2016 at 09:12:30AM +0100, Juergen Gross wrote:
>> On 29/02/16 04:37, Bob Liu wrote:
>>> 1) What is this patch about?
>>> This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
>>> A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
>>> extra request which is used to pass through SCSI commands.
>>> This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
>>> It can be extended easily to transmit other per-request/bio data from 
>>> frontend
>>> to backend e.g Data Integrity Field per bio.
>>>
>>> 2) Why we need this?
>>> Currently only raw data segments are transmitted from blkfront to blkback, 
>>> which
>>> means some advanced features are lost.
>>>  * Guest knows nothing about features of the real backend storage.
>>> For example, on bare-metal environment INQUIRY SCSI command can be used
>>> to query storage device information. If it's a SSD or flash device we
>>> can have the option to use the device as a fast cache.
>>> But this can't happen in current domU guests, because blkfront only
>>> knows it's just a normal virtual disk
>>>
>>>  * Failover Clusters in Windows
>>> Failover clusters require SCSI-3 persistent reservation target disks,
>>> but now this can't work in domU.
>>>
>>> 3) Known issues:
>>>  * Security issues, how to 'validate' this extra request payload.
>>>E.g SCSI operates on LUN bases (the whole disk) while we really just 
>>> want to
>>>operate on partitions
>>
>> It's not only validation: some operations just affect the whole LUN
>> (e.g. Reserve/Release). And what about "multi-LUN" commands like
>> "report LUNs"?
> 
> Don't expose them. Bob and I want to get an idea of what would be a good
> compromise to allow some SCSI specific (or perhaps ATA specific or DIF/DIX?) 
> type of
> commands go through the PV driver.
> 
> Would it be better if it was through XenBus? But that may not work for some
> that are tied closely to requests, such as DIF/DIX.
> 
> However the 'DISCARD' for example worked out - it is an umbrella for both
> SCSI UNMAP and ATA DISCARD operation and hides the complexity of the low level
> protocol. Could there be an 'INQ' ? Since the SCSI VPD is the most exhaustive
> in terms of details it may make sense to base it on that..?
> 
>>
>>>  * Can't pass SCSI commands through if the backend storage driver is 
>>> bio-based
>>>instead of request-based.
>>>
>>> 4) Alternative approach: Using PVSCSI instead:
>>>  * Doubt PVSCSI can support as many type of backend storage devices as 
>>> Xen-block.
>>
>> pvSCSI won't need to support all types of backends. It's enough to
>> support those where passing through SCSI commands makes sense.
>>
>> Seems to be a similar issue as the above mentioned problem with
>> bio-based backend storage drivers.
> 
> In particular the ones we care about are:
>  - Multipath over FibreChannel devices.
>  - Linear mapping (LVM) over the multipath.
>  - And then potentially an filesystem on top of that
>  - .. and a raw file on the filesystem.
> 
> Having SCSI VPD 0x83 page sent to frontend for that would be good.
> 
> Not sure about SCSI reservations. Perhaps those are more of .. unique
> in that the implementation would have to make sure that the guest
> owns the whole LUN. But that is implementation question.
> 
> This is about the design - how would you envision to to cram in 
> SCSI commands or DIF/DIX commands or ATA commands via PV block layer?

Have some kind of abstraction which can be mapped to SCSI commands
easily, but don't stick to the SCSI definitions. Use your own
structures, commands etc. and build SCSI commands from those in the
backend. This way you avoid having to emulate SCSI. Instead of
naming it "SCSI passthrough" call it "special commands". :-)
I would add only those operations you really need. Add an inquiry
operation which returns the supported "special commands". The
availability of the inquiry can be reported via Xenstore.

> 
>>
>>>  * Much longer path:
>>>ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
>>> PVSCSI-backend -> Target framework(LIO?) ->
>>>
>>>With xen-block we only need:
>>>ioctl() -> blkfront -> blkback ->
>>
>> I'd like to see performance numbers before making a decision.
> 
> For SCSI INQ? 
> 
> Or are you talking about raw READ/WRITE?

READ/WRITE. I'm quite sure pvSCSI will be slower than pvblk, but I don't
know how much slower.

>>
>>>  * xen-block has been existed for many years, widely used and more stable.
>>
>> Adding another SCSI passthrough capability wasn't accepted for pvSCSI
>> (that's the reason I used the Target Framework). Why do you think it
>> will be accepted for pvblk?
>>
>> This is not my personal opinion, just a heads up from someone who had a
>> try already. ;-)
> 
> Right. So SCSI passthrough out. What about mediated access for
> specific SCSI, or specific ATA, or DIF/DIX ones? And how would you do it
>

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Roger Pau Monné

El 29/2/16 a les 16:28, Paul Durrant ha escrit:
>> -Original Message-
>> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
>> Sent: 29 February 2016 14:56
>> To: Paul Durrant
>> Cc: Bob Liu; xen-devel@lists.xen.org; Ian Jackson; jbeul...@suse.com; Roger
>> Pau Monne; jgr...@suse.com
>> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
>> through SCSI commands
>>
>> On Mon, Feb 29, 2016 at 09:13:41AM +, Paul Durrant wrote:
 -Original Message-
 From: Bob Liu [mailto:bob@oracle.com]
 Sent: 29 February 2016 03:37
 To: xen-devel@lists.xen.org
 Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne;
>> jgr...@suse.com;
 Paul Durrant; konrad.w...@oracle.com; Bob Liu
 Subject: [RFC PATCH] xen-block: introduces extra request to pass-
  * xen-block has been existed for many years, widely used and more
>> stable.

>>>
>>> It's definitely widely used, but it has had stability issues in recent 
>>> times.
>>
>> Oh? Could you send the bug-reports to me and Roger, CC xen-devel and
>> LKML please ?
> 
> I was casting my mind back to incompatibilities that crept in with persistent 
> grants. TBH I haven't used blkback much since then; I tend to use qemu qdisk 
> as my backend these days.

FWIW, QEMU Qdisk also has persistent grants support... and was
implemented more or less at the same time as blkback persistent grants.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 09/11] xen: add capability to load initrd outside of initial mapping

2016-02-29 Thread Daniel Kiper

On Mon, Feb 29, 2016 at 09:27:42AM +0100, Juergen Gross wrote:
> On 26/02/16 16:41, Daniel Kiper wrote:
> > On Fri, Feb 26, 2016 at 03:28:21PM +0100, Juergen Gross wrote:
> >> On 26/02/16 15:00, Daniel Kiper wrote:
> >>> On Thu, Feb 25, 2016 at 04:33:46PM +0100, Juergen Gross wrote:
>  On 25/02/16 13:47, Daniel Kiper wrote:
> > On Thu, Feb 25, 2016 at 12:33:35PM +0100, Juergen Gross wrote:
> >> Modern pvops linux kernels support an initrd not covered by the initial
> >> mapping. This capability is flagged by an elf-note.
> >>
> >> In case the elf-note is set by the kernel don't place the initrd into
> >> the initial mapping. This will allow to load larger initrds and/or
> >> support domains with larger memory, as the initial mapping is limited
> >> to 2GB and it is containing the p2m list.
> >>
> >> Signed-off-by: Juergen Gross 
> >> ---
> >> V5: let call grub_xen_alloc_final() all subfunctions unconditionally
> >> and let them decide whether they need to do anything
> >> V4: rename grub_xen_alloc_end() to grub_xen_alloc_final()
> >> ---
> >>  grub-core/loader/i386/xen.c| 65 
> >> ++
> >>  grub-core/loader/i386/xen_fileXX.c |  3 ++
> >>  include/grub/xen_file.h|  1 +
> >>  3 files changed, 56 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
> >> index 2e12763..466f0c0 100644
> >> --- a/grub-core/loader/i386/xen.c
> >> +++ b/grub-core/loader/i386/xen.c
> >> @@ -228,6 +228,9 @@ grub_xen_p2m_alloc (void)
> >>grub_size_t p2msize;
> >>grub_err_t err;
> >>
> >> +  if (xen_state.virt_mfn_list)
> >> +return GRUB_ERR_NONE;
> >> +
> >>xen_state.state.mfn_list = xen_state.max_addr;
> >>xen_state.next_start.mfn_list =
> >>  xen_state.max_addr + xen_state.xen_inf.virt_base;
> >> @@ -250,6 +253,9 @@ grub_xen_special_alloc (void)
> >>grub_relocator_chunk_t ch;
> >>grub_err_t err;
> >>
> >> +  if (xen_state.virt_start_info)
> >> +return GRUB_ERR_NONE;
> >> +
> >>err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
> >> xen_state.max_addr,
> >> sizeof (xen_state.next_start));
> >> @@ -281,6 +287,9 @@ grub_xen_pt_alloc (void)
> >>grub_uint64_t nr_info_pages;
> >>grub_uint64_t nr_pages, nr_pt_pages, nr_need_pages;
> >>
> >> +  if (xen_state.virt_pgtable)
> >> +return GRUB_ERR_NONE;
> >> +
> >>xen_state.next_start.pt_base =
> >>  xen_state.max_addr + xen_state.xen_inf.virt_base;
> >>xen_state.state.paging_start = xen_state.max_addr >> PAGE_SHIFT;
> >> @@ -320,6 +329,24 @@ grub_xen_pt_alloc (void)
> >>  }
> >>
> >>  static grub_err_t
> >> +grub_xen_alloc_final (void)
> >> +{
> >> +  grub_err_t err;
> >> +
> >> +  err = grub_xen_p2m_alloc ();
> >> +  if (err)
> >> +return err;
> >> +  err = grub_xen_special_alloc ();
> >> +  if (err)
> >> +return err;
> >> +  err = grub_xen_pt_alloc ();
> >> +  if (err)
> >> +return err;
> >
> > Could you move grub_xen_p2m_alloc() here? This way 
> > grub_xen_alloc_final()
> > will be real final in patch 11 and you do not need an extra condition
> > around grub_xen_p2m_alloc().
> 
>  No. Page tables must be allocated after p2m unless p2m is outside of
>  initial kernel mapping (patch 11).
> >>>
> >>> OK, I was afraid of that. However, then grub_xen_alloc_final() is not 
> >>> final.
> >>
> >> Sure it is. Patch 11 just adds allocation to grub_xen_alloc_final(), not
> >> after it.
> >
> > Excerpt from patch 11:
> >
> > @@ -474,6 +503,12 @@ grub_xen_boot (void)
> >err = grub_xen_alloc_final ();
> >if (err)
> >  return err;
> > +  if (xen_state.xen_inf.has_p2m_base)
> > +{
> > +  err = grub_xen_p2m_alloc ();
> > +  if (err)
> > +return err;
> > +}
> >
>
> Hmph. No idea what I looked at when writing my previous reply.
>
> >>> So, maybe it should be called grub_xen_alloc_boot_data() or something 
> >>> like that.
> >
> > Then my previous comment is still valid.
>
> What about naming it grub_xen_alloc_kernel_mapping() ? This is what it
> really does: It is allocating all not yet allocated areas which are to
> be included in the initial kernel mapping.

This function does three things:
  - allocate memory for p2m array,
  - allocate special pages (e.g. console),
  - allocate memory for page tables.

Three of them are part of boot data described by struct start_info.
Well, there are also other required stuff to properly boot PV guest.
So, I can agree that grub_xen_alloc_boot_data() is not perfect but
I cannot find anything better. Additionally, IMO grub_xen_alloc_k

Re: [Xen-devel] [PATCH v5 09/11] xen: add capability to load initrd outside of initial mapping

2016-02-29 Thread Juergen Gross

On 29/02/16 16:43, Daniel Kiper wrote:
> On Mon, Feb 29, 2016 at 09:27:42AM +0100, Juergen Gross wrote:
>> On 26/02/16 16:41, Daniel Kiper wrote:
>>> On Fri, Feb 26, 2016 at 03:28:21PM +0100, Juergen Gross wrote:
 On 26/02/16 15:00, Daniel Kiper wrote:
> On Thu, Feb 25, 2016 at 04:33:46PM +0100, Juergen Gross wrote:
>> On 25/02/16 13:47, Daniel Kiper wrote:
>>> On Thu, Feb 25, 2016 at 12:33:35PM +0100, Juergen Gross wrote:
 Modern pvops linux kernels support an initrd not covered by the initial
 mapping. This capability is flagged by an elf-note.

 In case the elf-note is set by the kernel don't place the initrd into
 the initial mapping. This will allow to load larger initrds and/or
 support domains with larger memory, as the initial mapping is limited
 to 2GB and it is containing the p2m list.

 Signed-off-by: Juergen Gross 
 ---
 V5: let call grub_xen_alloc_final() all subfunctions unconditionally
 and let them decide whether they need to do anything
 V4: rename grub_xen_alloc_end() to grub_xen_alloc_final()
 ---
  grub-core/loader/i386/xen.c| 65 
 ++
  grub-core/loader/i386/xen_fileXX.c |  3 ++
  include/grub/xen_file.h|  1 +
  3 files changed, 56 insertions(+), 13 deletions(-)

 diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
 index 2e12763..466f0c0 100644
 --- a/grub-core/loader/i386/xen.c
 +++ b/grub-core/loader/i386/xen.c
 @@ -228,6 +228,9 @@ grub_xen_p2m_alloc (void)
grub_size_t p2msize;
grub_err_t err;

 +  if (xen_state.virt_mfn_list)
 +return GRUB_ERR_NONE;
 +
xen_state.state.mfn_list = xen_state.max_addr;
xen_state.next_start.mfn_list =
  xen_state.max_addr + xen_state.xen_inf.virt_base;
 @@ -250,6 +253,9 @@ grub_xen_special_alloc (void)
grub_relocator_chunk_t ch;
grub_err_t err;

 +  if (xen_state.virt_start_info)
 +return GRUB_ERR_NONE;
 +
err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
 xen_state.max_addr,
 sizeof (xen_state.next_start));
 @@ -281,6 +287,9 @@ grub_xen_pt_alloc (void)
grub_uint64_t nr_info_pages;
grub_uint64_t nr_pages, nr_pt_pages, nr_need_pages;

 +  if (xen_state.virt_pgtable)
 +return GRUB_ERR_NONE;
 +
xen_state.next_start.pt_base =
  xen_state.max_addr + xen_state.xen_inf.virt_base;
xen_state.state.paging_start = xen_state.max_addr >> PAGE_SHIFT;
 @@ -320,6 +329,24 @@ grub_xen_pt_alloc (void)
  }

  static grub_err_t
 +grub_xen_alloc_final (void)
 +{
 +  grub_err_t err;
 +
 +  err = grub_xen_p2m_alloc ();
 +  if (err)
 +return err;
 +  err = grub_xen_special_alloc ();
 +  if (err)
 +return err;
 +  err = grub_xen_pt_alloc ();
 +  if (err)
 +return err;
>>>
>>> Could you move grub_xen_p2m_alloc() here? This way 
>>> grub_xen_alloc_final()
>>> will be real final in patch 11 and you do not need an extra condition
>>> around grub_xen_p2m_alloc().
>>
>> No. Page tables must be allocated after p2m unless p2m is outside of
>> initial kernel mapping (patch 11).
>
> OK, I was afraid of that. However, then grub_xen_alloc_final() is not 
> final.

 Sure it is. Patch 11 just adds allocation to grub_xen_alloc_final(), not
 after it.
>>>
>>> Excerpt from patch 11:
>>>
>>> @@ -474,6 +503,12 @@ grub_xen_boot (void)
>>>err = grub_xen_alloc_final ();
>>>if (err)
>>>  return err;
>>> +  if (xen_state.xen_inf.has_p2m_base)
>>> +{
>>> +  err = grub_xen_p2m_alloc ();
>>> +  if (err)
>>> +return err;
>>> +}
>>>
>>
>> Hmph. No idea what I looked at when writing my previous reply.
>>
> So, maybe it should be called grub_xen_alloc_boot_data() or something 
> like that.
>>>
>>> Then my previous comment is still valid.
>>
>> What about naming it grub_xen_alloc_kernel_mapping() ? This is what it
>> really does: It is allocating all not yet allocated areas which are to
>> be included in the initial kernel mapping.
> 
> This function does three things:
>   - allocate memory for p2m array,
>   - allocate special pages (e.g. console),
>   - allocate memory for page tables.
> 
> Three of them are part of boot data described by struct start_info.
> Well, there are also other required stuff to properly boot PV guest.
> So, I can agree that grub_xen_alloc_boot_data() is not perfect bu

Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-02-29 Thread Paul Durrant

> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: 29 February 2016 14:56
> To: Paul Durrant
> Cc: Bob Liu; xen-devel@lists.xen.org; Ian Jackson; jbeul...@suse.com; Roger
> Pau Monne; jgr...@suse.com
> Subject: Re: [RFC PATCH] xen-block: introduces extra request to pass-
> through SCSI commands
> 
> On Mon, Feb 29, 2016 at 09:13:41AM +, Paul Durrant wrote:
> > > -Original Message-
> > > From: Bob Liu [mailto:bob@oracle.com]
> > > Sent: 29 February 2016 03:37
> > > To: xen-devel@lists.xen.org
> > > Cc: Ian Jackson; jbeul...@suse.com; Roger Pau Monne;
> jgr...@suse.com;
> > > Paul Durrant; konrad.w...@oracle.com; Bob Liu
> > > Subject: [RFC PATCH] xen-block: introduces extra request to pass-
> through
> > > SCSI commands
> > >
> > > 1) What is this patch about?
> > > This patch introduces an new block operation (BLKIF_OP_EXTRA_FLAG).
> > > A request with BLKIF_OP_EXTRA_FLAG set means the following request
> is an
> > > extra request which is used to pass through SCSI commands.
> > > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > > It can be extended easily to transmit other per-request/bio data from
> > > frontend
> > > to backend e.g Data Integrity Field per bio.
> > >
> > > 2) Why we need this?
> > > Currently only raw data segments are transmitted from blkfront to
> blkback,
> > > which
> > > means some advanced features are lost.
> > >  * Guest knows nothing about features of the real backend storage.
> > >   For example, on bare-metal environment INQUIRY SCSI command
> > > can be used
> > >   to query storage device information. If it's a SSD or flash device we
> > >   can have the option to use the device as a fast cache.
> > >   But this can't happen in current domU guests, because blkfront only
> > >   knows it's just a normal virtual disk
> > >
> >
> > That's the sort of information that should be advertised via xenstore then.
> There already feature flags for specific things but if some form of
> throughput/latency information is meaningful to a frontend stack then
> perhaps that can be advertised too.
> 
> Certainly could be put on the XenStore. Do you envision this being done
> pre guest creation (so toolstack does it), or the backend finds this
> and populates the XenStore keys?
> 
> Or that the frontend writes an XenStore key 'scsi-inq=vpd80' and the
> backend
> responds by populating an 'scsi-inq-vpd80=' '? If so can
> the XenStore accept binary payloads? Can it be more than 4K?
> 

I was thinking more along the lines of blkback creating xenstore keys with any 
relevant information. We have sector size and number of sectors already but I 
don't see any harm in having values for other quantities that may be useful to 
a frontend. Bouncing SCSI inquiries via xenstore was certainly not what I was 
thinking.

> 
> >
> > >  * Failover Clusters in Windows
> > >   Failover clusters require SCSI-3 persistent reservation target disks,
> > >   but now this can't work in domU.
> > >
> >
> > That's true but allowing arbitrary SCSI messages through is not the way
> forward IMO. Just because Windows thinks every HBA is SCSI doesn't mean
> other OS do so I think reservation/release should have dedicated messages
> in the blkif protocol if it's desirable to support clustering in the frontend.
> 
> Could you expand a bit on the 'dedicated message' you have in mind please?
> 

As in we create message types to reserve/release whatever backend is being used 
and the backend uses whatever mechanism is appropriate to deal with that. E.g. 
if it were qdisk talking to a raw file then that might just be an flock.

> >
> > > 3) Known issues:
> > >  * Security issues, how to 'validate' this extra request payload.
> > >E.g SCSI operates on LUN bases (the whole disk) while we really just
> want
> > > to
> > >operate on partitions
> > >
> > >  * Can't pass SCSI commands through if the backend storage driver is bio-
> > > based
> > >instead of request-based.
> > >
> > > 4) Alternative approach: Using PVSCSI instead:
> > >  * Doubt PVSCSI can support as many type of backend storage devices as
> > > Xen-block.
> > >
> >
> > LIO can interface to any block device in much the same way blkback does
> IIRC.
> 
> But it can't do multipath or LVMs - which is an integral component.
> 

Really? I was not aware of that limitation and it surprises me since AFAIK LIO 
can also use a raw file as a backstore which seems like it would be above 
either of those.

> Anyhow that is more of a implementation specific quirk.
> >
> > >  * Much longer path:
> > >ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend -> 
> > > PVSCSI-
> > > backend -> Target framework(LIO?) ->
> > >
> > >With xen-block we only need:
> > >ioctl() -> blkfront -> blkback ->
> > >
> >
> > ...and what happens if the block device that blkback is talking to is a SCSI
> LUN?
> >
> > That latter path is also not true for Windows. You've got all the SCSI
> t

[Xen-devel] Call for tools backport requests for 4.5.x (Re: 4.5.3 preparations)

2016-02-29 Thread Ian Jackson

Jan Beulich writes ("4.5.3 preparations"):
> it just occurred to me that 4.5.2 has been a while back, and indeed
> 4.5.3 would be due later this week. This may be a little too eager,
> but I'd like to aim at getting this out at least some time next week.
> Besides what is in the tree already, I have three more commits I
> intend to backport:

I have cleared out my current backport list.

For reference, I don't do a sweep of the git log of master to look for
things to backport.

I rely on someone (which might be the maintainer or submitter, or
anyone else) noticing that a fix ought to be considered for backport
and bringing it to my attention.

My backport queue was surprisingly empty.  I encourage people to reply
to this message with backport requets.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH] tools/python/xs: fix two comments

2016-02-29 Thread Wei Liu

These two functions require transaction handle as the first argument.

Reported-by: Sergei Lebedev 
Signed-off-by: Wei Liu 
---
Cc: Sergei Lebedev 
Cc: Ian Jackson 

Compile test only
---
 tools/python/xen/lowlevel/xs/xs.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/python/xen/lowlevel/xs/xs.c 
b/tools/python/xen/lowlevel/xs/xs.c
index 76ae3ac..a86edbe 100644
--- a/tools/python/xen/lowlevel/xs/xs.c
+++ b/tools/python/xen/lowlevel/xs/xs.c
@@ -191,7 +191,8 @@ static PyObject *xspy_ls(XsHandle *self, PyObject *args)
 
 #define xspy_mkdir_doc "\n"\
"Make a directory.\n"   \
-   " path [string]: path to directory to create.\n"\
+   " transaction [string]: transaction handle.\n"  \
+   " path [string]   : path to directory to create.\n" \
"\n"\
"Returns None on success.\n"\
"Raises xen.lowlevel.xs.Error on error.\n"  \
@@ -555,7 +556,8 @@ static PyObject *xspy_transaction_start(XsHandle *self)
 #define xspy_transaction_end_doc "\n"  \
"End the current transaction.\n"\
"Attempts to commit the transaction unless abort is true.\n"\
-   " abort [int]: abort flag (default 0).\n"   \
+   " transaction [string] : transaction handle.\n" \
+   " abort [int]  : abort flag (default 0).\n" \
"\n"\
"Returns True on success, False if you need to try again.\n"\
"Raises xen.lowlevel.xs.Error on error.\n"  
\
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 0/7] libxl: add support for FreeBSD block hotplug scripts

2016-02-29 Thread Roger Pau Monné

El 29/2/16 a les 15:26, George Dunlap ha escrit:
> On 29/02/16 12:18, Roger Pau Monné wrote:
>> El 29/2/16 a les 13:15, George Dunlap ha escrit:
>>> On Thu, Feb 25, 2016 at 7:25 PM, Roger Pau Monne  
>>> wrote:
 This series enables using hotplug scripts with the FreeBSD blkback
 implementation. Since FreeBSD blkback can use both block devices and 
 regular
 RAW files as disks, the physical-device xenstore backend node is now
 OS-specific, Linux and NetBSD will encode the device major and minor 
 numbers
 there, while FreeBSD simply puts an absolute path to a disk image.
>>>
>>> Just to catch me up here -- is this an incompatible thing  that
>>> FreeBSD *already* does, or something you're adding with this series?
>>
>> I assume you mean the usage of the "physical-device" node, right? In
>> which case, this is something that I'm adding with this series (and some
>> FreeBSD kernel changes, of course). Current FreeBSD code doesn't use
>> physical-device at all.
> 
> So there are cases where libxl could use the actual path to the
> resulting device (if available) as well.  Rather than overloading
> physical-device, what about making a new node, with a path, that can be
> used by all of them?
> 
> I submitted such a patch here:
> 
> <1439233885-22218-4-git-send-email-george.dun...@eu.citrix.com>
> 
> As part of a series allowing HVM domains to use hotplug scripts.
> 
> Thoughts?

TBH, I thought we were planning to use local attach to deal with both
HVM guests with hotplug scripts and HVM guests using disks on driver
domains. The solution you propose only solves the first part (hotplug
scripts), but for disks coming from driver domains we would still need
to use local-attach, so I would be in favour of always using local attach.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

1 2 3 >

1 - 100 of 204 matches

Mail list logo