[Xen-devel] minimum Python version

2017-09-20 Thread Jan Beulich
Hello,

for quite a while (apparently as of Marek's series in February) I've
been seeing

xen/lowlevel/xc/xc.c: In function ‘pyxc_dom_extract_cpuid’:
xen/lowlevel/xc/xc.c:692:13: error: implicit declaration of function 
‘PyBytes_AS_STRING’ [-Werror=implicit-function-declaration]
xen/lowlevel/xc/xc.c:692:21: error: assignment makes pointer from integer 
without a cast [-Werror]
xen/lowlevel/xc/xc.c: In function ‘pyxc_create_cpuid_dict’:
xen/lowlevel/xc/xc.c:707:29: error: implicit declaration of function 
‘PyBytes_FromString’ [-Werror=implicit-function-declaration]
xen/lowlevel/xc/xc.c:707:29: error: passing argument 3 of 
‘PyDict_SetItemString’ makes pointer from integer without a cast [-Werror]
In file included from /usr/include/python2.4/Python.h:94:0,
 from xen/lowlevel/xc/xc.c:7:
/usr/include/python2.4/dictobject.h:128:17: note: expected ‘struct PyObject *’ 
but argument is of type ‘int’
xen/lowlevel/xc/xc.c: In function ‘pyxc_readconsolering’:
xen/lowlevel/xc/xc.c:931:5: error: implicit declaration of function 
‘PyBytes_FromStringAndSize’ [-Werror=implicit-function-declaration]
xen/lowlevel/xc/xc.c:931:9: error: assignment makes pointer from integer 
without a cast [-Werror]
xen/lowlevel/xc/xc.c: In function ‘PyXc_dealloc’:
xen/lowlevel/xc/xc.c:2650:5: error: implicit declaration of function ‘Py_TYPE’ 
[-Werror=implicit-function-declaration]
xen/lowlevel/xc/xc.c:2650:18: error: invalid type argument of ‘->’ (have ‘int’)
cc1: all warnings being treated as errors

on an older system having Python 2.4 on it. The minimum version
configure checks for appears to be 2.3, matching up with what
./README says. Since the Python bindings aren't something one
absolutely needs (afaict), wouldn't it be reasonable to check for
a suitable version and if that fails simply disable their building
(which currently is unconditional)? What I'd like to avoid is ending
up with being unable to build the tools on that system - I know
it's rather old, but there's a reason I'd like to keep it (including its
old distro level).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 04/15] x86: implement data structure and CPU init flow for MBA

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 05:22,  wrote:
> On 17-09-19 09:55:28, Roger Pau Monn wrote:
>> On Tue, Sep 05, 2017 at 05:32:26PM +0800, Yi Sun wrote:
>> > @@ -1389,6 +1480,7 @@ static void psr_cpu_init(void)
>> >  unsigned int socket, cpu = smp_processor_id();
>> >  struct feat_node *feat;
>> >  struct cpuid_leaf regs;
>> > +uint32_t reg_b;
>> 
>> Not sure of the benefit between using regs.b or reg_b (it's only 1
>> char shorter).
>> 
> You can see the 'regs' is overwritten in below codes so that the 'regs.b' is 
> not
> kept. To add a new local variable 'reg_b' here, we can avoid calling
> 'cpuid_count_leaf' for L2 CAT and MBA.

In which case - wouldn't "ebx" be a better name for the variable?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] minimum Python version

2017-09-20 Thread Juergen Gross
On 20/09/17 09:00, Jan Beulich wrote:
> Hello,
> 
> for quite a while (apparently as of Marek's series in February) I've
> been seeing
> 
> xen/lowlevel/xc/xc.c: In function ‘pyxc_dom_extract_cpuid’:
> xen/lowlevel/xc/xc.c:692:13: error: implicit declaration of function 
> ‘PyBytes_AS_STRING’ [-Werror=implicit-function-declaration]
> xen/lowlevel/xc/xc.c:692:21: error: assignment makes pointer from integer 
> without a cast [-Werror]
> xen/lowlevel/xc/xc.c: In function ‘pyxc_create_cpuid_dict’:
> xen/lowlevel/xc/xc.c:707:29: error: implicit declaration of function 
> ‘PyBytes_FromString’ [-Werror=implicit-function-declaration]
> xen/lowlevel/xc/xc.c:707:29: error: passing argument 3 of 
> ‘PyDict_SetItemString’ makes pointer from integer without a cast [-Werror]
> In file included from /usr/include/python2.4/Python.h:94:0,
>  from xen/lowlevel/xc/xc.c:7:
> /usr/include/python2.4/dictobject.h:128:17: note: expected ‘struct PyObject 
> *’ but argument is of type ‘int’
> xen/lowlevel/xc/xc.c: In function ‘pyxc_readconsolering’:
> xen/lowlevel/xc/xc.c:931:5: error: implicit declaration of function 
> ‘PyBytes_FromStringAndSize’ [-Werror=implicit-function-declaration]
> xen/lowlevel/xc/xc.c:931:9: error: assignment makes pointer from integer 
> without a cast [-Werror]
> xen/lowlevel/xc/xc.c: In function ‘PyXc_dealloc’:
> xen/lowlevel/xc/xc.c:2650:5: error: implicit declaration of function 
> ‘Py_TYPE’ [-Werror=implicit-function-declaration]
> xen/lowlevel/xc/xc.c:2650:18: error: invalid type argument of ‘->’ (have 
> ‘int’)
> cc1: all warnings being treated as errors
> 
> on an older system having Python 2.4 on it. The minimum version
> configure checks for appears to be 2.3, matching up with what
> ./README says. Since the Python bindings aren't something one
> absolutely needs (afaict), wouldn't it be reasonable to check for
> a suitable version and if that fails simply disable their building
> (which currently is unconditional)? What I'd like to avoid is ending
> up with being unable to build the tools on that system - I know
> it's rather old, but there's a reason I'd like to keep it (including its
> old distro level).

Hmm, remembers me to my attempt removing most of the python bindings in
2015.

I think we should try to avoid such problems in the future by (at least)
adding a comment to xen/lowlevel/xc/xc.c that any enhancement should be
done only in case of a real need. New bindings should only be added with
mentioning who will need this binding in the commit message. Existing
bindings should rather be deleted than updated in case an underlying
libxc function is being modified (unless there is a user of that
binding, of course).

For your case: what about adding a configure option to disable building
the python bindings (and any dependencies like e.g. pygrub) instead of
doing so in case of a version mismatch? This would avoid any surprises
in case someone didn't notice that the bindings have been disabled.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 14/15] tools: implement new generic set value interface and MBA set value command

2017-09-20 Thread Yi Sun
On 17-09-19 12:30:59, Roger Pau Monn� wrote:
> >  int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
> > @@ -458,7 +432,33 @@ int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid,
> >libxl_psr_type type, libxl_bitmap *target_map,
> >uint64_t val)
> >  {
> > -return ERROR_FAIL;
> > +GC_INIT(ctx);
> > +int rc;
> > +int socketid, nr_sockets;
> 
> You could fit them all in a single line.
> 
Yes, thanks.

> > +int main_psr_mba_set(int argc, char **argv)
> > +{
> > +uint32_t domid;
> > +libxl_psr_type type;
> > +uint64_t thrtl;
> > +int ret, opt = 0;
> > +libxl_bitmap target_map;
> > +char *value;
> > +libxl_string_list socket_list;
> > +unsigned long start, end;
> > +unsigned int i, j, len;
> > +
> > +static const struct option opts[] = {
> > +{"socket", 1, 0, 's'},
> > +COMMON_LONG_OPTS
> > +};
> > +
> > +libxl_socket_bitmap_alloc(ctx, &target_map, 0);
> > +libxl_bitmap_set_none(&target_map);
> > +
> > +SWITCH_FOREACH_OPT(opt, "s:", opts, "psr-mba-set", 0) {
> > +case 's':
> > +trim(isspace, optarg, &value);
> > +split_string_into_string_list(value, ",", &socket_list);
> > +len = libxl_string_list_length(&socket_list);
> > +for (i = 0; i < len; i++) {
> > +   parse_range(socket_list[i], &start, &end);
> 
> Indentation.
> 
Sorry.

> > +for (j = start; j <= end; j++)
> > +libxl_bitmap_set(&target_map, j);
> > +}
> > +
> > +libxl_string_list_dispose(&socket_list);
> > +free(value);
> > +break;
> > +}
> > +
> > +type = LIBXL_PSR_CBM_TYPE_MBA_THRTL;
> > +
> > +if (libxl_bitmap_is_empty(&target_map))
> > +libxl_bitmap_set_any(&target_map);
> > +
> > +if (argc != optind + 2) {
> > +help("psr-mba-set");
> > +return 2;
> > +}
> 
> Can you do this check at the beginning of the function? Also why
> return 2 instead of EXIT_FAILURE?
> 
Yes, will move it to the beginning. Will return EXIT_FAILURE.

> Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 15/15] docs: add MBA description in docs

2017-09-20 Thread Yi Sun
On 17-09-19 12:37:24, Roger Pau Monn� wrote:
> On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote:
> > +Intel Skylake and later server platforms offer capabilities to configure 
> > and
> > +make use of the Memory Bandwidth Allocation (MBA) mechanisms, which 
> > provides
> > +OS/VMMs the ability to slow misbehaving apps/VMs or create advanced 
> > closed-loop
> 
> I don't get the 'closed-loop' thing again, but that might just be me
> since I'm not a native speaker.
> 
Will modify this to be same as feature doc.

[...]

> > +In the linear mode the input precision is defined as 100-(THRTL_MAX). 
> > Values
> > +not an even multiple of the precision (e.g., 12%) will be rounded down 
> > (e.g.,
> > +to 10% delay applied).
>^ s/applied/by the hardware/
> 
Thanks!

> Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 04/15] x86: implement data structure and CPU init flow for MBA

2017-09-20 Thread Yi Sun
On 17-09-20 01:11:50, Jan Beulich wrote:
> >>> On 20.09.17 at 05:22,  wrote:
> > On 17-09-19 09:55:28, Roger Pau Monn wrote:
> >> On Tue, Sep 05, 2017 at 05:32:26PM +0800, Yi Sun wrote:
> >> > @@ -1389,6 +1480,7 @@ static void psr_cpu_init(void)
> >> >  unsigned int socket, cpu = smp_processor_id();
> >> >  struct feat_node *feat;
> >> >  struct cpuid_leaf regs;
> >> > +uint32_t reg_b;
> >> 
> >> Not sure of the benefit between using regs.b or reg_b (it's only 1
> >> char shorter).
> >> 
> > You can see the 'regs' is overwritten in below codes so that the 'regs.b' 
> > is not
> > kept. To add a new local variable 'reg_b' here, we can avoid calling
> > 'cpuid_count_leaf' for L2 CAT and MBA.
> 
> In which case - wouldn't "ebx" be a better name for the variable?
> 
Thanks for the suggestion!

> Jan
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC] Unicore Subproject Proposal

2017-09-20 Thread Felipe Huici
Hi Lars,

Great news, thanks! We sent v2 to the mailing list on the 18th, 2 days
ago. When you say “post v2 first”, should we be posting it somewhere else
too?

Thanks,

— Felipe


Dr. Felipe Huici
Chief Researcher, Networked Systems and Data
Analytics Group
NEC Laboratories Europe, Network Research Division
Kurfuerstenanlage 36, D-69115 Heidelberg
Tel. +49
(0)6221 4342-241
Fax: +49
(0)6221 4342-155

e-mail: 
felipe.hu...@neclab.eu

NEC Europe Limited Registered Office: NEC House, 1
Victoria Road, London W3 6BL Registered in England 2832014




On 9/20/17, 2:20 AM, "Lars Kurth"  wrote:

>Felipe, Simon,
>a quick note to let you know that the Advisory Board in today’s AB
>meeting decided to endorse your proposal.
>Let me know how you proceed: from my perspective, we can kick off a
>formal vote before you make modifications to the proposal, but I think it
>is better to post v2 first.
>Lars
>
>On 07/09/2017, 05:26, "Felipe Huici"  wrote:
>
>Dear all,
>
>Following up on discussions that Simon Kuenzer had with several of
>you at
>the last Xen summit, we’re now submitting a Xen subproject proposal
>based
>on our Unicore work. Could you please review it?
>
>Thanks,
>
>Felipe Huici & Simon Kuenzer - NEC Labs Heidelberg.
>
>
>PROPOSAL: Unicore
>=
>
>Roles
>-
>Project Leads: Simon Kuenzer  (main lead)
>   Felipe Huici  (co-lead)
>   Florian Schmidt  (co-lead)
>Project Mentor:  Lars Kurth 
>Project Sponsor: -To be found-
>
>Background
>--
>In recent years, several papers and projects dedicated to unikernels
>have
>shown the immense potential for performance gains that these have. By
>leveraging specialization and the use of minimalistic OSes,
>unikernels are
>able to yield impressive numbers, including fast instantiation times
>(tens
>of milliseconds or less), tiny memory footprints (a few MBs or even
>KBs),
>high network throughput (10-40 Gb/s), and high consolidation (e.g.,
>being
>able to run thousands of instances on a single commodity server), not
>to
>mention a reduced attack surface and the potential for easier
>certification. Unikernel projects worthy of mention include MirageOS,
>ClickOS, Erlang on Xen, OSv, HALVM, and Minicache, among others.
>
>The fundamental drawback of unikernels is that they require that
>applications be manually ported to the underlying minimalistic OS
>(e.g.
>having to port nginx, snort, mysql or memcached to MiniOS or OSv);
>this
>requires both expert work and often considerable amount of time. In
>essence, we need to pick between either high performance with
>unikernels,
>or no porting effort but decreased performance and decreased
>efficiency
>with standard OS/VM images. The goal of this proposal is to change
>this
>status quo by providing a highly configurable unikernel code base; we
>call
>this base Unicore.
>
>This project also aims to concentrate the various efforts currently
>going
>on in the Xen community regarding minimalistic OSes (essentially
>different
>variants of MiniOS). We think that splitting the community across
>these
>variants is counter-productive and hope that Unicore will provide a
>common
>place for all or most improvements and customizations of minimalistic
>OSes. The long term goal is to replace something like MiniOS with a
>tool
>that can automatically build such a minimalistic OS.
>
>Unicore - The "Unikernel Core"
>-
>The high level goal of Unicore is to be able to build unikernels
>targeted
>at specific applications without requiring the time-consuming, expert
>work
>that building such a unikernel requires today. An additional goal (or
>hope) of Unicore is that all developers interested in unikernel
>development would contribute by supplying libraries rather than
>working on
>independent projects with different code bases as it is done now. The
>main
>idea behind Unicore is depicted in Figure 1 and consists of two basic
>components:
> 
>
>[Attachment: unicore-oneslider.pdf]
>
>
>Figure 1. Unicore architecture.
>
> 
>Library pools would contain libraries that the user of Unicore can
>select
>from to create the unikernel. From the bottom up, library pools are
>organized into (1) the architecture library tool, containing libraries
>specific to a computer architecture (e.g., x86_64, ARM32 or MIPS);
>(2) the
>platform tool, where target platforms can be Xen, KVM, bare metal
>(i.e. no
>virtualization) and user-space Linux; and (3) the main library pool,
>containing a rich set of functionality to build the unikernel from.
>This
>last librar

Re: [Xen-devel] minimum Python version

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 09:20,  wrote:
> For your case: what about adding a configure option to disable building
> the python bindings (and any dependencies like e.g. pygrub) instead of
> doing so in case of a version mismatch? This would avoid any surprises
> in case someone didn't notice that the bindings have been disabled.

That's certainly an option, but I'm generally not very happy with
having to add such options. My script invoking configure becomes
more and more unreadable because of such, since this way the
requirement to do the version check gets moved from the build
system (i.e. configure, whose job this is imo) to that script. And
as you can probably understand I don't want to unilaterally
disable the building of that code, to be able to build test possible
changes I may end up making to e.g. libxc.

IOW if such a configure option was added, I'd still think it should
default to disabled if the Python in use is not new enough.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] x86/vvmx: add hvm_intsrc_vector support to nvmx_intr_intercept()

2017-09-20 Thread Tian, Kevin
> From: Sergey Dyasli [mailto:sergey.dya...@citrix.com]
> Sent: Wednesday, September 13, 2017 9:01 PM
> 
> Under the following circumstances:
> 
> 1. L1 doesn't enable PAUSE exiting or PAUSE-loop exiting controls
> 2. L2 executes PAUSE in a loop with RFLAGS.IE == 0
> 
> L1's PV IPI through event channel will never reach the target L1's vCPU
> which runs L2 because nvmx_intr_intercept() doesn't know about
> hvm_intsrc_vector. This leads to infinite L2 loop without nested
> vmexits and can cause L1 to hang.
> 
> The issue is easily reproduced with Qemu/KVM on CentOS-7-1611 as L1
> and an L2 guest with SMP.
> 
> Fix nvmx_intr_intercept() by injecting hvm_intsrc_vector irq into L1
> which will cause nested vmexit.
> 
> Signed-off-by: Sergey Dyasli 

Acked-by: Kevin Tian 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [distros-debian-squeeze test] 72130: tolerable trouble: broken/fail/pass

2017-09-20 Thread Platform Team regression test user
flight 72130 distros-debian-squeeze real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72130/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 build-arm64-pvops 2 hosts-allocate   broken like 72101
 build-arm64   2 hosts-allocate   broken like 72101
 build-arm64   3 capture-logs broken like 72101
 build-arm64-pvops 3 capture-logs broken like 72101
 test-amd64-amd64-amd64-squeeze-netboot-pygrub 10 debian-di-install fail 
blocked in 72101
 test-amd64-amd64-i386-squeeze-netboot-pygrub 10 debian-di-install fail blocked 
in 72101
 test-amd64-i386-i386-squeeze-netboot-pygrub 10 debian-di-install fail like 
72101
 test-amd64-i386-amd64-squeeze-netboot-pygrub 10 debian-di-install fail like 
72101

baseline version:
 flight   72101

jobs:
 build-amd64  pass
 build-arm64  broken  
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-arm64-pvopsbroken  
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-squeeze-netboot-pygrubfail
 test-amd64-i386-amd64-squeeze-netboot-pygrub fail
 test-amd64-amd64-i386-squeeze-netboot-pygrub fail
 test-amd64-i386-i386-squeeze-netboot-pygrub  fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/1] public/io/netif.h: add gref mapping control messages

2017-09-20 Thread Paul Durrant
> -Original Message-
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 19 September 2017 20:08
> To: Xen-devel 
> Cc: Wei Liu ; Paul Durrant ;
> Konrad Rzeszutek Wilk ; Joao Martins
> 
> Subject: [PATCH v4 1/1] public/io/netif.h: add gref mapping control messages
> 
> Adds 3 messages to allow guest to let backend keep grants mapped,
> such that 1) guests allowing fast recycling of pages can avoid doing
> grant ops for those cases, or otherwise 2) preferring copies over
> grants and 3) always using a fixed set of pages for network I/O.
> 
> The three control ring messages added are:
>  - Add grefs to be mapped by backend
>  - Remove grefs mappings (If they are not in use)
>  - Get maximum amount of grefs kept mapped.
> 
> Signed-off-by: Joao Martins 

I think the text is clear enough now. Now for the netfront/netback patches :-)

Reviewed-by: Paul Durrant 

> ---
> v4:
> * Declare xen_netif_gref parameters are input or output.
> * Clarify status field and that it doesn't require to be set to zero
> prior to its usage.
> * Clarify on ADD_GREF_MAPPING is 'all or nothing'
> * Improve last paragraph of DEL_GREF_MAPPING
> 
> v3:
> * Use DEL for unmapping grefs instead of PUT
> * Rname from xen_netif_gref_alloc to xen_netif_gref
> * Add 'status' field on xen_netif_gref
> * Clarify what 'inflight' means
> * Use "beginning of the page" instead of "beginning of the grant"
> * Mention that page needs to be r/w (as it will have to modify \.status)
> ---
>  xen/include/public/io/netif.h | 123
> ++
>  1 file changed, 123 insertions(+)
> 
> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
> index ca0061410d..2454448baa 100644
> --- a/xen/include/public/io/netif.h
> +++ b/xen/include/public/io/netif.h
> @@ -353,6 +353,9 @@ struct xen_netif_ctrl_request {
>  #define XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING_SIZE 5
>  #define XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING  6
>  #define XEN_NETIF_CTRL_TYPE_SET_HASH_ALGORITHM7
> +#define XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE 8
> +#define XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING  9
> +#define XEN_NETIF_CTRL_TYPE_DEL_GREF_MAPPING 10
> 
>  uint32_t data[3];
>  };
> @@ -391,6 +394,44 @@ struct xen_netif_ctrl_response {
>  };
> 
>  /*
> + * Static Grants (struct xen_netif_gref)
> + * =
> + *
> + * A frontend may provide a fixed set of grant references to be mapped on
> + * the backend. The message of type
> XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING
> + * prior its usage in the command ring allows for creation of these mappings.
> + * The backend will maintain a fixed amount of these mappings.
> + *
> + * XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE lets a frontend
> query how many
> + * of these mappings can be kept.
> + *
> + * Each entry in the XEN_NETIF_CTRL_TYPE_{ADD,DEL}_GREF_MAPPING
> input table has
> + * the following format:
> + *
> + *0 1 2 3 4 5 6 7  octet
> + * +-+-+-+-+-+-+-+-+
> + * | grant ref |  flags|  status   |
> + * +-+-+-+-+-+-+-+-+
> + *
> + * grant ref: grant reference (IN)
> + * flags: flags describing the control operation (IN)
> + * status: XEN_NETIF_CTRL_STATUS_* (OUT)
> + *
> + * 'status' is an output parameter which does not require to be set to zero
> + * prior to its usage in the corresponding control messages.
> + */
> +
> +struct xen_netif_gref {
> +   grant_ref_t ref;
> +   uint16_t flags;
> +
> +#define _XEN_NETIF_CTRLF_GREF_readonly0
> +#define XEN_NETIF_CTRLF_GREF_readonly
> (1U<<_XEN_NETIF_CTRLF_GREF_readonly)
> +
> +   uint16_t status;
> +};
> +
> +/*
>   * Control messages
>   * 
>   *
> @@ -609,6 +650,88 @@ struct xen_netif_ctrl_response {
>   *   invalidate any table data outside that range.
>   *   The grant reference may be read-only and must remain valid until
>   *   the response has been processed.
> + *
> + * XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE
> + * -
> + *
> + * This is sent by the frontend to fetch the number of grefs that can be kept
> + * mapped in the backend.
> + *
> + * Request:
> + *
> + *  type= XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE
> + *  data[0] = queue index (assumed 0 for single queue)
> + *  data[1] = 0
> + *  data[2] = 0
> + *
> + * Response:
> + *
> + *  status = XEN_NETIF_CTRL_STATUS_NOT_SUPPORTED - Operation not
> + * supported
> + *   XEN_NETIF_CTRL_STATUS_INVALID_PARAMETER - The queue index
> is
> + * out of range
> + *   XEN_NETIF_CTRL_STATUS_SUCCESS   - Operation successful
> + *  data   = maximum number of entries allowed in the gref mapping table
> + *   (if operation was successful) or zero if it is not supported.
> + *
> + * XEN_NETIF_CTRL_TYPE_

Re: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set grant table resource limits

2017-09-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Juergen Gross
> Sent: 20 September 2017 07:34
> To: xen-de...@lists.xenproject.org
> Cc: Juergen Gross ; sstabell...@kernel.org; Wei Liu
> ; George Dunlap ;
> Andrew Cooper ; Ian Jackson
> ; Tim (Xen.org) ;
> julien.gr...@arm.com; jbeul...@suse.com; dgde...@tycho.nsa.gov
> Subject: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set
> grant table resource limits
> 
> Add a domctl hypercall to set the domain's resource limits regarding
> grant tables.
> 
> Signed-off-by: Juergen Gross 
> Acked-by: Daniel De Graaf 

Reviewed-by: Paul Durrant 

> ---
> V8:
> - remove stale #if 0, adjust comments (Paul Durrant)
> 
> V6:
> - moved earlier in series to support set_gnttab_limits being
>   mandatory for domain creation
> 
> V5:
> - add set_gnttab_limits to create_domain_common in xen.if
>   (Daniel De Graaf)
> 
> V3:
> - rename *gnttbl* to *gnttab* (Paul Durrant)
> ---
>  tools/flask/policy/modules/dom0.te  |  2 +-
>  tools/flask/policy/modules/xen.if   |  2 +-
>  xen/common/domctl.c |  6 ++
>  xen/common/grant_table.c| 19 +++
>  xen/include/public/domctl.h |  7 +++
>  xen/include/xen/grant_table.h   |  2 ++
>  xen/xsm/flask/hooks.c   |  3 +++
>  xen/xsm/flask/policy/access_vectors |  2 ++
>  8 files changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/flask/policy/modules/dom0.te
> b/tools/flask/policy/modules/dom0.te
> index 338caaf41e..1643b400f0 100644
> --- a/tools/flask/policy/modules/dom0.te
> +++ b/tools/flask/policy/modules/dom0.te
> @@ -39,7 +39,7 @@ allow dom0_t dom0_t:domain {
>  };
>  allow dom0_t dom0_t:domain2 {
>   set_cpuid gettsc settsc setscheduler set_max_evtchn
> set_vnumainfo
> - get_vnumainfo psr_cmt_op psr_cat_op
> + get_vnumainfo psr_cmt_op psr_cat_op set_gnttab_limits
>  };
>  allow dom0_t dom0_t:resource { add remove };
> 
> diff --git a/tools/flask/policy/modules/xen.if
> b/tools/flask/policy/modules/xen.if
> index 912640002e..55437496f6 100644
> --- a/tools/flask/policy/modules/xen.if
> +++ b/tools/flask/policy/modules/xen.if
> @@ -52,7 +52,7 @@ define(`create_domain_common', `
>   settime setdomainhandle getvcpucontext
> set_misc_info };
>   allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
>   set_max_evtchn set_vnumainfo get_vnumainfo
> cacheflush
> - psr_cmt_op psr_cat_op soft_reset };
> + psr_cmt_op psr_cat_op soft_reset set_gnttab_limits
> };
>   allow $1 $2:security check_context;
>   allow $1 $2:shadow enable;
>   allow $1 $2:mmu { map_read map_write adjust memorymap
> physmap pinpage mmuext_op updatemp };
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 42658e5744..58381f8fe9 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1149,6 +1150,11 @@ long
> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>  copyback = 1;
>  break;
> 
> +case XEN_DOMCTL_set_gnttab_limits:
> +ret = grant_table_set_limits(d, op->u.set_gnttab_limits.grant_frames,
> + 
> op->u.set_gnttab_limits.maptrack_frames);
> +break;
> +
>  default:
>  ret = arch_do_domctl(op, d, u_domctl);
>  break;
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index ac845dbb35..f48eeff7ad 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3640,6 +3640,25 @@ void grant_table_init_vcpu(struct vcpu *v)
>  v->maptrack_tail = MAPTRACK_TAIL;
>  }
> 
> +int grant_table_set_limits(struct domain *d, unsigned int grant_frames,
> +   unsigned int maptrack_frames)
> +{
> +struct grant_table *gt = d->grant_table;
> +int ret = -EBUSY;
> +
> +if ( !gt )
> +return -ENOENT;
> +
> +grant_write_lock(gt);
> +
> +ret = 0;
> +/* Set limits, alloc needed arrays. */
> +
> +grant_write_unlock(gt);
> +
> +return ret;
> +}
> +
>  #ifdef CONFIG_HAS_MEM_SHARING
>  int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
>  gfn_t *gfn, uint16_t *status)
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 50ff58f5b9..167502c60b 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1163,6 +1163,11 @@ struct xen_domctl_psr_cat_op {
>  typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
> 
> +struct xen_domctl_set_gnttab_limits {
> +uint32_t grant_frames; /* IN */
> +uint32_t maptrack_frames;  /* IN */
> +};
> +
>  struct xen_domctl {
>  uint32_t cmd;
>  #define XEN_DOMCTL_createdomain   1
> 

Re: [Xen-devel] [PATCH] vt-d: use two 32-bit writes to update DMAR fault address registers

2017-09-20 Thread Roger Pau Monné
On Mon, Sep 11, 2017 at 02:00:48PM +0800, Haozhong Zhang wrote:
> The 64-bit DMAR fault address is composed of two 32 bits registers
> DMAR_FEADDR_REG and DMAR_FEUADDR_REG. According to VT-d spec:
> "Software is expected to access 32-bit registers as aligned doublewords",
> a hypervisor should use two 32-bit writes to DMAR_FEADDR_REG and
> DMAR_FEUADDR_REG separately in order to update a 64-bit fault address,
> rather than a 64-bit write to DMAR_FEADDR_REG.

I would add:

"Note that when x2APIC is disabled DMAR_FEUADDR_REG is reserved and it's not
necessary to update it."

> Though I haven't seen any errors caused by such one 64-bit write on
> real machines, it's still better to follow the specification.
> 
> Signed-off-by: Haozhong Zhang 

Given the reply from Kevin:

Reviewed-by: Roger Pau Monné 

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy

2017-09-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Jennifer Herbert
> Sent: 19 September 2017 19:06
> To: Ian Jackson ; Wei Liu ;
> xen-de...@lists.xenproject.org; jto...@uwaterloo.ca
> Cc: Jennifer Herbert 
> Subject: [Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy
> 
> This Patch allows a migration precopy policy to be specified.
> 
> The precopy phase of the xc_domain_save() live migration algorithm has
> historically been implemented to run until either a) (almost) no pages
> are dirty or b) some fixed, hard-coded maximum number of precopy
> iterations has been exceeded.  This policy and its implementation are
> less than ideal for a few reasons:
> - the logic of the policy is intertwined with the control flow of the
>   mechanism of the precopy stage
> - it can't take into account facts external to the immediate
>   migration context, such external state transfer state, interactive
>   user input, or the passage of wall-clock time.
> - it does not permit the user to change their mind, over time, about
>   what to do at the end of the precopy (they get an unconditional
>   transition into the stop-and-copy phase of the migration)
> 
> To permit callers to implement arbitrary higher-level policies governing
> when the live migration precopy phase should end, and what should be
> done next:
> - add a precopy_policy() callback to the xc_domain_save() user-supplied
>   callbacks
> - during the precopy phase of live migrations, consult this policy after
>   each batch of pages transmitted and take the dictated action, which
>   may be to a) abort the migration entirely, b) continue with the
>   precopy, or c) proceed to the stop-and-copy phase.
> - provide an implementation of the old policy, used when
>   precopy_policy callback  is not provided.
> 
> Signed-off-by: Jennifer Herbert 
> Signed-off-by: Joshua Otto 
> 
> ---
> 
> v2:
> 
> Have made a few formatting corrections, added typedef as suggested.
> 
> v1:
> 
> This is updated/modified subset of patch 7/20, part of
> Joshua Otto's "Add postcopy live migration support." patch,
> dated 27th March 2017.  As indicated on the original thread,
> I wish to make use of this this within the XenServer product.
> I hope this will aid Josh in pushing the remainder of his series.
> 
> ---
>  tools/libxc/include/xenguest.h | 31 --
>  tools/libxc/xc_sr_common.h |  6 +--
>  tools/libxc/xc_sr_save.c   | 97 +
> -
>  3 files changed, 97 insertions(+), 37 deletions(-)
> 
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 6626f0c..a2a654c 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -39,6 +39,16 @@
>   */
>  struct xenevtchn_handle;
> 
> +/* For save's precopy_policy(). */
> +struct precopy_stats
> +{
> +unsigned int iteration;
> +unsigned int total_written;
> +long dirty_count; /* -1 if unknown */
> +};
> +
> +typedef int (*precopy_policy_t)(struct precopy_stats, void *);
> +
>  /* callbacks provided by xc_domain_save */
>  struct save_callbacks {
>  /* Called after expiration of checkpoint interval,
> @@ -46,7 +56,20 @@ struct save_callbacks {
>   */
>  int (*suspend)(void* data);
> 
> -/* Called after the guest's dirty pages have been
> +/*
> + * Called after every batch of page data sent during the precopy
> + * phase of a live migration to ask the caller what to do next
> + * based on the current state of the precopy migration.
> + */
> +#define XGS_POLICY_ABORT  (-1) /* Abandon the migration entirely
> +* and tidy up. */
> +#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy
> phase. */
> +#define XGS_POLICY_STOP_AND_COPY1  /* Immediately suspend and
> transmit the
> +* remaining dirty pages. */
> +precopy_policy_t precopy_policy;
> +
> +/*
> + * Called after the guest's dirty pages have been
>   *  copied into an output buffer.
>   * Callback function resumes the guest & the device model,
>   *  returns to xc_domain_save.
> @@ -55,7 +78,8 @@ struct save_callbacks {
>   */
>  int (*postcopy)(void* data);
> 
> -/* Called after the memory checkpoint has been flushed
> +/*
> + * Called after the memory checkpoint has been flushed
>   * out into the network. Typical actions performed in this
>   * callback include:
>   *   (a) send the saved device model state (for HVM guests),
> @@ -65,7 +89,8 @@ struct save_callbacks {
>   *
>   * returns:
>   * 0: terminate checkpointing gracefully
> - * 1: take another checkpoint */
> + * 1: take another checkpoint
> + */
>  int (*checkpoint)(void* data);
> 
>  /*
> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> index a83f22a..3635704 100644
> --- a/tools/libxc/xc_sr_com

Re: [Xen-devel] [PATCH v3 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document

2017-09-20 Thread Roger Pau Monné
On Wed, Sep 20, 2017 at 11:06:57AM +0800, Yi Sun wrote:
> On 17-09-18 18:16:40, Roger Pau Monn� wrote:
> > On Tue, Sep 05, 2017 at 05:32:23PM +0800, Yi Sun wrote:
> > > +* xl interfaces:
> > > +
> > > +  1. `psr-mba-show [domain-id]`:
> > 
> > Is this limited to domain-id, or one can also use the domain name?
> > Most of the xl commands accept either a domain-id or a domain-name.
> > 
> Both domain-id and domain-name can show it. I thought this is by default and
> no need to explicitly declare. If I am wrong, I will change it as below:
> `psr-mba-show [domain-id/domain-name]`

[domain-id|domain-name]

Would be better IMHO.

> > > +
> > > +  3. DOMCTL:
> > > +  - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a domain.
> > > +  - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a domain.
> > > +
> > > +* xl interfaces:
> > > +
> > > +  1. psr-mba-show [domain-id]
> > > +  Show system/domain runtime MBA throttling value. For linear 
> > > mode,
> > > +  it shows the decimal value. For non-linear mode, it shows 
> > > hexadecimal
> > > +  value.
> > > +  => XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL
> > > +
> > > +  2. psr-mba-set [OPTIONS]  
> > > +  Set bandwidth throttling for a domain.
> > > +  => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL
> > > +
> > > +  3. psr-hwinfo
> > > +  Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA.
> > > +  => XEN_SYSCTL_PSR_MBA_get_info
> > 
> > 'psr-hwinfo' seems to be completely missing from the 'xl interfaces:'
> > section above.
> > 
> Because this is not a newly added interface, I do not describe it in 'xl
> interfaces'. Is that necessary?

Oh, OK, sorry for the noise. Then I guess it's not necessary to
describe it here. Maybe a reference to where 'psr-hwinfo' is described
would be nice (I assume there's a feature document somewhere that
describes 'psr-hwinfo').

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 06/15] x86: implement get value interface for MBA

2017-09-20 Thread Roger Pau Monné
On Wed, Sep 20, 2017 at 01:09:06PM +0800, Yi Sun wrote:
> On 17-09-19 10:15:42, Roger Pau Monn� wrote:
> > On Tue, Sep 05, 2017 at 05:32:28PM +0800, Yi Sun wrote:
> > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> > > index 696eff2..7902af7 100644
> > > --- a/xen/arch/x86/domctl.c
> > > +++ b/xen/arch/x86/domctl.c
> > > @@ -1496,6 +1496,13 @@ long arch_do_domctl(
> > >  copyback = true;
> > >  break;
> > >  
> > > +case XEN_DOMCTL_PSR_ALLOC_GET_MBA_THRTL:
> > > +ret = psr_get_val(d, domctl->u.psr_alloc.target,
> > > +  &val32, PSR_TYPE_MBA_THRTL);
> > > +domctl->u.psr_alloc.data = val32;
> > 
> > Hm, why does psr_get_val take a uint32_t * instead of a uint64_t *? So
> > that you can directly pass &domctl->u.psr_alloc.data.
> > 
> > Or the other way around, why is domctl->u.psr_alloc.data a uint64_t
> > instead of a uint32_t?
> > 
> There is a historical reason. The COS MSR is 64bit. So, the original codes
> in L3 CAT (submitted years ago) used uint64_t.
> 
> But during L2 CAT review, per Jan's comment, the uint64_t is not necessary
> in psr.c. So, we convert it to uint32_t in psr.c and make the codes you see
> here.

Since this is a DOMCTL, you can change the type of the structure to be
an uint32_t, so that you can pass it directly (unless I'm missing
something else that requires this to be uint64_t).

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] minimum Python version

2017-09-20 Thread Juergen Gross
On 20/09/17 09:55, Jan Beulich wrote:
 On 20.09.17 at 09:20,  wrote:
>> For your case: what about adding a configure option to disable building
>> the python bindings (and any dependencies like e.g. pygrub) instead of
>> doing so in case of a version mismatch? This would avoid any surprises
>> in case someone didn't notice that the bindings have been disabled.
> 
> That's certainly an option, but I'm generally not very happy with
> having to add such options. My script invoking configure becomes
> more and more unreadable because of such, since this way the
> requirement to do the version check gets moved from the build
> system (i.e. configure, whose job this is imo) to that script. And
> as you can probably understand I don't want to unilaterally
> disable the building of that code, to be able to build test possible
> changes I may end up making to e.g. libxc.
> 
> IOW if such a configure option was added, I'd still think it should
> default to disabled if the Python in use is not new enough.

I can understand your concerns.

OTOH it is quite nasty to find out that due to a missing or too old
package on the system some parts of Xen haven't been built without
having them disabled via configure (I had this experience more than
once, e.g. with a Mini-OS patch breaking vtpm stubdom which I didn't
notice as it wasn't being built on my side).

What about the following idea: maybe we could add an option to configure
to make it fail instead of disabling components in case some
requirements for that component are not met? The default could be to
just disable the components and with the option specified all components
not disabled explicitly must be buildable or configure will fail.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set grant table resource limits

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 10:26,  wrote:
>>  -Original Message-
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
>> Juergen Gross
>> Sent: 20 September 2017 07:34
>> To: xen-de...@lists.xenproject.org 
>> Cc: Juergen Gross ; sstabell...@kernel.org; Wei Liu
>> ; George Dunlap ;
>> Andrew Cooper ; Ian Jackson
>> ; Tim (Xen.org) ;
>> julien.gr...@arm.com; jbeul...@suse.com; dgde...@tycho.nsa.gov 
>> Subject: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set
>> grant table resource limits
>> 
>> Add a domctl hypercall to set the domain's resource limits regarding
>> grant tables.
>> 
>> Signed-off-by: Juergen Gross 
>> Acked-by: Daniel De Graaf 
> 
> Reviewed-by: Paul Durrant 

Acked-by: Jan Beulich 
despite ...

>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -3640,6 +3640,25 @@ void grant_table_init_vcpu(struct vcpu *v)
>>  v->maptrack_tail = MAPTRACK_TAIL;
>>  }
>> 
>> +int grant_table_set_limits(struct domain *d, unsigned int grant_frames,
>> +   unsigned int maptrack_frames)
>> +{
>> +struct grant_table *gt = d->grant_table;
>> +int ret = -EBUSY;

... the initializer being pointless here (it becomes relevant only in
patch 11).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 12/13] [RFC] iommu: VT-d: Squash map_pages/unmap_pages with map_page/unmap_page

2017-09-20 Thread Tian, Kevin
this patch alone looks OK to me. but I haven't found time to review
the whole series to judge whether below change is necessary.

> From: Oleksandr Tyshchenko [mailto:olekst...@gmail.com]
> Sent: Tuesday, September 12, 2017 10:44 PM
> 
> Hi.
> 
> Gentle reminder.
> 
> On Mon, Aug 21, 2017 at 7:44 PM, Oleksandr Tyshchenko
>  wrote:
> > Hi, all.
> >
> > Any comments?
> >
> > On Tue, Jul 25, 2017 at 8:26 PM, Oleksandr Tyshchenko
> >  wrote:
> >> From: Oleksandr Tyshchenko 
> >>
> >> Reduce the scope of the TODO by squashing single-page stuff with
> >> multi-page one. Next target is to use large pages whenever possible
> >> in the case that hardware supports them.
> >>
> >> Signed-off-by: Oleksandr Tyshchenko
> 
> >> CC: Jan Beulich 
> >> CC: Kevin Tian 
> >>
> >> ---
> >>Changes in v1:
> >>   -
> >>
> >>Changes in v2:
> >>   -
> >> ---
> >>  xen/drivers/passthrough/vtd/iommu.c | 138 +
> ---
> >>  1 file changed, 67 insertions(+), 71 deletions(-)
> >>
> >> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> >> index 45d1f36..d20b2f9 100644
> >> --- a/xen/drivers/passthrough/vtd/iommu.c
> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> @@ -1750,15 +1750,24 @@ static void
> iommu_domain_teardown(struct domain *d)
> >>  spin_unlock(&hd->arch.mapping_lock);
> >>  }
> >>
> >> -static int __must_check intel_iommu_map_page(struct domain *d,
> >> - unsigned long gfn,
> >> - unsigned long mfn,
> >> - unsigned int flags)
> >> +static int __must_check intel_iommu_unmap_pages(struct domain *d,
> >> +unsigned long gfn,
> >> +unsigned int order);
> >> +
> >> +/*
> >> + * TODO: Optimize by using large pages whenever possible in the case
> >> + * that hardware supports them.
> >> + */
> >> +static int __must_check intel_iommu_map_pages(struct domain *d,
> >> +  unsigned long gfn,
> >> +  unsigned long mfn,
> >> +  unsigned int order,
> >> +  unsigned int flags)
> >>  {
> >>  struct domain_iommu *hd = dom_iommu(d);
> >> -struct dma_pte *page = NULL, *pte = NULL, old, new = { 0 };
> >> -u64 pg_maddr;
> >>  int rc = 0;
> >> +unsigned long orig_gfn = gfn;
> >> +unsigned long i;
> >>
> >>  /* Do nothing if VT-d shares EPT page table */
> >>  if ( iommu_use_hap_pt(d) )
> >> @@ -1768,78 +1777,60 @@ static int __must_check
> intel_iommu_map_page(struct domain *d,
> >>  if ( iommu_passthrough && is_hardware_domain(d) )
> >>  return 0;
> >>
> >> -spin_lock(&hd->arch.mapping_lock);
> >> -
> >> -pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn <<
> PAGE_SHIFT_4K, 1);
> >> -if ( pg_maddr == 0 )
> >> +for ( i = 0; i < (1UL << order); i++, gfn++, mfn++ )
> >>  {
> >> -spin_unlock(&hd->arch.mapping_lock);
> >> -return -ENOMEM;
> >> -}
> >> -page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
> >> -pte = page + (gfn & LEVEL_MASK);
> >> -old = *pte;
> >> -dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
> >> -dma_set_pte_prot(new,
> >> - ((flags & IOMMUF_readable) ? DMA_PTE_READ  : 0) |
> >> - ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0));
> >> +struct dma_pte *page = NULL, *pte = NULL, old, new = { 0 };
> >> +u64 pg_maddr;
> >>
> >> -/* Set the SNP on leaf page table if Snoop Control available */
> >> -if ( iommu_snoop )
> >> -dma_set_pte_snp(new);
> >> +spin_lock(&hd->arch.mapping_lock);
> >>
> >> -if ( old.val == new.val )
> >> -{
> >> +pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn <<
> PAGE_SHIFT_4K, 1);
> >> +if ( pg_maddr == 0 )
> >> +{
> >> +spin_unlock(&hd->arch.mapping_lock);
> >> +rc = -ENOMEM;
> >> +goto err;
> >> +}
> >> +page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
> >> +pte = page + (gfn & LEVEL_MASK);
> >> +old = *pte;
> >> +dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
> >> +dma_set_pte_prot(new,
> >> + ((flags & IOMMUF_readable) ? DMA_PTE_READ  : 0) |
> >> + ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0));
> >> +
> >> +/* Set the SNP on leaf page table if Snoop Control available */
> >> +if ( iommu_snoop )
> >> +dma_set_pte_snp(new);
> >> +
> >> +if ( old.val == new.val )
> >> +{
> >> +spin_unlock(&hd->arch.mapping_lock);
> >> +unmap_vtd_domain_page(page);
> >> +continue;
> >> + 

Re: [Xen-devel] [PATCH v3 13/15] tools: implement new generic get value interface and MBA get value command

2017-09-20 Thread Roger Pau Monné
On Wed, Sep 20, 2017 at 02:46:24PM +0800, Yi Sun wrote:
> On 17-09-19 12:02:08, Roger Pau Monn� wrote:
> > On Tue, Sep 05, 2017 at 05:32:35PM +0800, Yi Sun wrote:
> > > +};
> > >  char *msg;
> > >  
> > >  switch (err) {
> > >  case ENODEV:
> > > -msg = "CAT is not supported in this system";
> > > +msg = "is not supported in this system";
> > >  break;
> > >  case ENOENT:
> > > -msg = "CAT is not enabled on the socket";
> > > +msg = "is not enabled on the socket";
> > >  break;
> > >  case EOVERFLOW:
> > >  msg = "no free COS available";
> > > @@ -106,7 +120,7 @@ static void libxl__psr_cat_log_err_msg(libxl__gc *gc, 
> > > int err)
> > >  return;
> > >  }
> > >  
> > > -LOGE(ERROR, "%s", msg);
> > > +LOGE(ERROR, "%s: %s", feat_name[type], msg);
> > 
> > I don't think you should use LOGE here, but rather LOG. LOGE should be
> > used when errno is set, which I don't think is the case here.
> > 
> But two places to call libxl__psr_cat_log_err_msg all set errno in 'xc_'
> functions.

But you already translate the error into a custom message ('msg' in the
above context), and the error variable in this case is 'err' not
'errno', so LOGE is not appropriate here IMHO.

> > > @@ -347,18 +361,7 @@ int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t 
> > > domid,
> > >libxl_psr_cbm_type type, uint32_t target,
> > >uint64_t *cbm_r)
> > >  {
> > > -GC_INIT(ctx);
> > > -int rc = 0;
> > > -xc_psr_type xc_type = libxl__psr_cbm_type_to_libxc_psr_type(type);
> > > -
> > > -if (xc_psr_cat_get_domain_data(ctx->xch, domid, xc_type,
> > > -   target, cbm_r)) {
> > > -libxl__psr_cat_log_err_msg(gc, errno);
> > > -rc = ERROR_FAIL;
> > > -}
> > > -
> > > -GC_FREE;
> > > -return rc;
> > > +return libxl_psr_get_val(ctx, domid, type, target, cbm_r);
> > >  }
> > 
> > You could even move this to libxl.h as a static function IMHO.
> > 
> Yes. But I prefer to keep it here with other interfaces together. Is that
> acceptable to you?

OK, as you wish.

> > > diff --git a/tools/xl/xl_psr.c b/tools/xl/xl_psr.c
> > > index 40269b4..46b7788 100644
> > > --- a/tools/xl/xl_psr.c
> > > +++ b/tools/xl/xl_psr.c
> > > -static int psr_cat_print_socket(uint32_t domid, libxl_psr_cat_info *info,
> > > -unsigned int lvl)
> > > +static int psr_print_socket(uint32_t domid,
> > > +libxl_psr_hw_info *info,
> > > +libxl_psr_feat_type type,
> > > +unsigned int lvl)
> > >  {
> > > -int rc;
> > > -uint32_t l3_cache_size;
> > > -
> > >  printf("%-16s: %u\n", "Socket ID", info->id);
> > >  
> > > -/* So far, CMT only supports L3 cache. */
> > > -if (lvl == 3) {
> > > -rc = libxl_psr_cmt_get_l3_cache_size(ctx, info->id, 
> > > &l3_cache_size);
> > > -if (rc) {
> > > -fprintf(stderr, "Failed to get l3 cache size for 
> > > socket:%d\n",
> > > -info->id);
> > > -return -1;
> > > +switch (type) {
> > > +case LIBXL_PSR_FEAT_TYPE_CAT:
> > > +{
> > > +int rc;
> > > +uint32_t l3_cache_size;
> > > +
> > > +/* So far, CMT only supports L3 cache. */
> > > +if (lvl == 3) {
> > 
> > Shouldn't you print some kind of error message if lvl != 3? Or is it
> > expected that this function will be called with lvl != 3 and it should
> > be ignored?
> > 
> We only get cache size for level 3 cache. So, if input is lvl=2, we print
> nothing.

My question would be, is it expected that this function is called with
lvl == 2 as part of the normal operation with valid input values?

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 08:34,  wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct 
> cpu_user_regs *regs)
>  return 0;
>  }
>  
> +unsigned long arch_get_upper_mfn_bound(void)
> +{
> +unsigned long max_mfn;
> +
> +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page;

Taking into account the code in the caller of this function as well
as the ARM counterpart I find the use of max_page here odd. I'd
prefer if get_upper_mfn_bound() went away altogether, and it's
sole caller (which strangely enough doesn't get introduced here)
called the arch function directly. Additionally, with the caller being
a sysctl, how is that supposed to help a PV DomU kernel in their
choice of grant table version?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Juergen Gross
On 20/09/17 10:57, Jan Beulich wrote:
 On 20.09.17 at 08:34,  wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct 
>> cpu_user_regs *regs)
>>  return 0;
>>  }
>>  
>> +unsigned long arch_get_upper_mfn_bound(void)
>> +{
>> +unsigned long max_mfn;
>> +
>> +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page;
> 
> Taking into account the code in the caller of this function as well
> as the ARM counterpart I find the use of max_page here odd. I'd
> prefer if get_upper_mfn_bound() went away altogether, and it's
> sole caller (which strangely enough doesn't get introduced here)
> called the arch function directly. Additionally, with the caller being
> a sysctl, how is that supposed to help a PV DomU kernel in their
> choice of grant table version?

Did you look at patch 15?


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 08:34,  wrote:
> Add the maximum possible mfn to struct xen_sysctl_physinfo in order to
> enable Xen tools to size the grant table frame limits for a domU.
> 
> Signed-off-by: Juergen Gross 
> ---
>  xen/common/sysctl.c | 1 +
>  xen/include/public/sysctl.h | 2 ++
>  2 files changed, 3 insertions(+)

As indicated in reply to patch 4, I think the on here should be folded
into that one, in order to not transiently leave around a dead function.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo

2017-09-20 Thread Juergen Gross
On 20/09/17 10:58, Jan Beulich wrote:
 On 20.09.17 at 08:34,  wrote:
>> Add the maximum possible mfn to struct xen_sysctl_physinfo in order to
>> enable Xen tools to size the grant table frame limits for a domU.
>>
>> Signed-off-by: Juergen Gross 
>> ---
>>  xen/common/sysctl.c | 1 +
>>  xen/include/public/sysctl.h | 2 ++
>>  2 files changed, 3 insertions(+)
> 
> As indicated in reply to patch 4, I think the on here should be folded
> into that one, in order to not transiently leave around a dead function.

Okay, I can do it.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document

2017-09-20 Thread Yi Sun
On 17-09-20 09:36:13, Roger Pau Monn� wrote:
> On Wed, Sep 20, 2017 at 11:06:57AM +0800, Yi Sun wrote:
> > On 17-09-18 18:16:40, Roger Pau Monn� wrote:
> > > On Tue, Sep 05, 2017 at 05:32:23PM +0800, Yi Sun wrote:
> > > > +* xl interfaces:
> > > > +
> > > > +  1. `psr-mba-show [domain-id]`:
> > > 
> > > Is this limited to domain-id, or one can also use the domain name?
> > > Most of the xl commands accept either a domain-id or a domain-name.
> > > 
> > Both domain-id and domain-name can show it. I thought this is by default and
> > no need to explicitly declare. If I am wrong, I will change it as below:
> > `psr-mba-show [domain-id/domain-name]`
> 
> [domain-id|domain-name]
> 
> Would be better IMHO.
> 
Thanks!

> > > > +
> > > > +  3. DOMCTL:
> > > > +  - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a 
> > > > domain.
> > > > +  - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a 
> > > > domain.
> > > > +
> > > > +* xl interfaces:
> > > > +
> > > > +  1. psr-mba-show [domain-id]
> > > > +  Show system/domain runtime MBA throttling value. For linear 
> > > > mode,
> > > > +  it shows the decimal value. For non-linear mode, it shows 
> > > > hexadecimal
> > > > +  value.
> > > > +  => 
> > > > XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL
> > > > +
> > > > +  2. psr-mba-set [OPTIONS]  
> > > > +  Set bandwidth throttling for a domain.
> > > > +  => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL
> > > > +
> > > > +  3. psr-hwinfo
> > > > +  Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA.
> > > > +  => XEN_SYSCTL_PSR_MBA_get_info
> > > 
> > > 'psr-hwinfo' seems to be completely missing from the 'xl interfaces:'
> > > section above.
> > > 
> > Because this is not a newly added interface, I do not describe it in 'xl
> > interfaces'. Is that necessary?
> 
> Oh, OK, sorry for the noise. Then I guess it's not necessary to
> describe it here. Maybe a reference to where 'psr-hwinfo' is described
> would be nice (I assume there's a feature document somewhere that
> describes 'psr-hwinfo').
> 
psr-hwinfo is firstly introduced in intel_psr_cat_cdp.pandoc. But MBA feature
adds a new sysctl interface 'XEN_SYSCTL_PSR_MBA_get_info' which is used by
psr-hwinfo. So, I describe it here again.

> Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 13/15] tools: implement new generic get value interface and MBA get value command

2017-09-20 Thread Yi Sun
On 17-09-20 09:57:59, Roger Pau Monn� wrote:
> On Wed, Sep 20, 2017 at 02:46:24PM +0800, Yi Sun wrote:
> > On 17-09-19 12:02:08, Roger Pau Monn� wrote:
> > > On Tue, Sep 05, 2017 at 05:32:35PM +0800, Yi Sun wrote:
> > > > +};
> > > >  char *msg;
> > > >  
> > > >  switch (err) {
> > > >  case ENODEV:
> > > > -msg = "CAT is not supported in this system";
> > > > +msg = "is not supported in this system";
> > > >  break;
> > > >  case ENOENT:
> > > > -msg = "CAT is not enabled on the socket";
> > > > +msg = "is not enabled on the socket";
> > > >  break;
> > > >  case EOVERFLOW:
> > > >  msg = "no free COS available";
> > > > @@ -106,7 +120,7 @@ static void libxl__psr_cat_log_err_msg(libxl__gc 
> > > > *gc, int err)
> > > >  return;
> > > >  }
> > > >  
> > > > -LOGE(ERROR, "%s", msg);
> > > > +LOGE(ERROR, "%s: %s", feat_name[type], msg);
> > > 
> > > I don't think you should use LOGE here, but rather LOG. LOGE should be
> > > used when errno is set, which I don't think is the case here.
> > > 
> > But two places to call libxl__psr_cat_log_err_msg all set errno in 'xc_'
> > functions.
> 
> But you already translate the error into a custom message ('msg' in the
> above context), and the error variable in this case is 'err' not
> 'errno', so LOGE is not appropriate here IMHO.
> 
Ok.

[...]

> > > > diff --git a/tools/xl/xl_psr.c b/tools/xl/xl_psr.c
> > > > index 40269b4..46b7788 100644
> > > > --- a/tools/xl/xl_psr.c
> > > > +++ b/tools/xl/xl_psr.c
> > > > -static int psr_cat_print_socket(uint32_t domid, libxl_psr_cat_info 
> > > > *info,
> > > > -unsigned int lvl)
> > > > +static int psr_print_socket(uint32_t domid,
> > > > +libxl_psr_hw_info *info,
> > > > +libxl_psr_feat_type type,
> > > > +unsigned int lvl)
> > > >  {
> > > > -int rc;
> > > > -uint32_t l3_cache_size;
> > > > -
> > > >  printf("%-16s: %u\n", "Socket ID", info->id);
> > > >  
> > > > -/* So far, CMT only supports L3 cache. */
> > > > -if (lvl == 3) {
> > > > -rc = libxl_psr_cmt_get_l3_cache_size(ctx, info->id, 
> > > > &l3_cache_size);
> > > > -if (rc) {
> > > > -fprintf(stderr, "Failed to get l3 cache size for 
> > > > socket:%d\n",
> > > > -info->id);
> > > > -return -1;
> > > > +switch (type) {
> > > > +case LIBXL_PSR_FEAT_TYPE_CAT:
> > > > +{
> > > > +int rc;
> > > > +uint32_t l3_cache_size;
> > > > +
> > > > +/* So far, CMT only supports L3 cache. */
> > > > +if (lvl == 3) {
> > > 
> > > Shouldn't you print some kind of error message if lvl != 3? Or is it
> > > expected that this function will be called with lvl != 3 and it should
> > > be ignored?
> > > 
> > We only get cache size for level 3 cache. So, if input is lvl=2, we print
> > nothing.
> 
> My question would be, is it expected that this function is called with
> lvl == 2 as part of the normal operation with valid input values?
> 
Yes, lvl == 2 is a normal operation.

> Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline test] 113613: regressions - FAIL

2017-09-20 Thread osstest service owner
flight 113613 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113613/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail REGR. vs. 
113302

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds  6 xen-install  fail REGR. vs. 113302

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 113302
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 113302
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 113302
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 113302
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass

version targeted for testing:
 qemuuc51700273ad9802a21c19f8d2b4bcb67c38e74ac
baseline version:
 qemuua6e8c1dacfd37d34542e33600dcc50b7683b735a

Last test of basis   113302  2017-09-11 10:18:16 Z8 days
Failing since113345  2017-09-12 00:21:07 Z8 days   16 attempts
Testing same since   113613  2017-09-20 02:35:03 Z0 days1 attempts


People who touched revisions under test:
  Alex Bennée 
  Alexander Graf 
  Alexey Kardashevskiy 
  Alistair Francis 
  Amador Pahim 
  Benjamin Herrenschmidt 
  Christian Borntraeger 
  Cornelia Huck 
  Cédric Le Goater 
  Daniel Henrique Barboza 
  Daniel P. Berrange 
  David Gibson 
  David Hildenbrand 
  Dr. David Alan Gilbert 
  Eduardo Habkost 
  Eduardo Otubo 
  Eduardo Otubo 
  Eric Auger 
  Eric Blake 
  Fam Zheng 
  Farhan Ali 
  Feng Kan 
  Gerd Hoffmann 
  Gonglei 
  Greg Kurz 
  Halil Pasic 
  Hannes Reinecke 
  Hannes Reinecke 
  Igor Mammedov 
  Jaroslaw Pelczar 
  John Snow 
  Joseph Myers 
  Kamil Rytarowski 
  Kevin Wolf 
  Ladi Prosek 
  Laurent Vivier 
  Lluís Vilanova 
  Lukáš Doktor 
  Mao Zhongyi 
  Mark Cave-Ayland 
  Matt Parker 
  Matthew Rosato 
  Mohammed Gamal 
  Paolo Bonzini 
  Peter Maydell 
  Peter Xu 
  Philippe Mathieu-Daudé 
  Philippe Mathieu-Daudé 
  Pranavkumar Sawargaonkar 
  Prasad J Pandit 
  Richard Henderson 
  Richard W.M. Jones 
  Roman Kagan 
  Stefan Hajnoczi 
  Thomas Huth 
  Tushar Jagad 
  Vadim Galitsyn 
  Vasilis Liaskovitis 
  Viktor Mihajlovski 
  Xiao Guangrong 
  Yi Min Zhao 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm

Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation

2017-09-20 Thread Roger Pau Monné
On Tue, Sep 19, 2017 at 04:11:28AM +0200, Dario Faggioli wrote:
> In commit ad4b3e1e9df34 ("xen: credit2: implement
> utilization cap") xfree() was being called (for
> deallocating the budget replenishment timer, during
> domain destruction) inside an IRQ disabled critical
> section.
> 
> That must not happen, as it uses the mem-pool's lock,
> which needs to be taken with IRQ enabled. And, in fact,
> we crash (in debug builds):
> 
> (XEN) 
> (XEN) Panic on CPU 0:
> (XEN) Xen BUG at spinlock.c:47
> (XEN) 
> 
> Let's, therefore, kill and deallocate the timer outside of
> the critical sections, when IRQs are enabled.
> 
> Signed-off-by: Dario Faggioli 

Reviewed-by: Roger Pau Monné 

It would be good to get this committed sooner rather than later, so
the push gate can be unblocked.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation

2017-09-20 Thread George Dunlap
On 09/19/2017 03:11 AM, Dario Faggioli wrote:
> In commit ad4b3e1e9df34 ("xen: credit2: implement
> utilization cap") xfree() was being called (for
> deallocating the budget replenishment timer, during
> domain destruction) inside an IRQ disabled critical
> section.
> 
> That must not happen, as it uses the mem-pool's lock,
> which needs to be taken with IRQ enabled. And, in fact,
> we crash (in debug builds):
> 
> (XEN) 
> (XEN) Panic on CPU 0:
> (XEN) Xen BUG at spinlock.c:47
> (XEN) 
> 
> Let's, therefore, kill and deallocate the timer outside of
> the critical sections, when IRQs are enabled.
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: osstest service owner 
> Cc: George Dunlap 
> Cc: Wei Liu 
> ---
> This was spotted by OSSTest's flight 113562:
> 
>  http://logs.test-lab.xenproject.org/osstest/logs/113562/
>  
> http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log
> ---
>  xen/common/sched_credit2.c |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 32234ac..7a550db 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct scheduler *ops, 
> void *data)
>  
>  write_lock_irqsave(&prv->lock, flags);
>  
> -kill_timer(sdom->repl_timer);
> -xfree(sdom->repl_timer);
> -
>  list_del_init(&sdom->sdom_elem);
>  
>  write_unlock_irqrestore(&prv->lock, flags);
>  
> +kill_timer(sdom->repl_timer);
> +xfree(sdom->repl_timer);

Any particular reason for moving the kill_timer() as well as the xfree
outside the lock?  What happens if the timer goes off after the
irqrestore but before the kill_timer?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 0/3] Various XSA followups

2017-09-20 Thread Alexandru Isaila
XSA-219 was discovered while trying to implement the bugfix in patch 3.

Andrew Cooper (3):
 [RFC] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
 x86/hvm: Break out __hvm_copy()'s translation logic
 x86/hvm: Implement hvmemul_write() using real mappings

Alexandru Isaila (2):
 x86/hvm: Break out __hvm_copy()'s translation logic
 x86/hvm: Implement hvmemul_write() using real mappings






___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 2/3] x86/hvm: Break out __hvm_copy()'s translation logic

2017-09-20 Thread Alexandru Isaila
From: Andrew Cooper 

It will be reused by later changes.

Signed-off-by: Andrew Cooper 
Signed-off-by: Alexandru Isaila 
Reviewed-by: Paul Durrant 
Acked-by: Jan Beulich 
---
 xen/arch/x86/hvm/hvm.c| 144 +++---
 xen/include/asm-x86/hvm/support.h |  12 
 2 files changed, 98 insertions(+), 58 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 488acbf..93394c1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3069,6 +3069,83 @@ void hvm_task_switch(
 hvm_unmap_entry(nptss_desc);
 }
 
+enum hvm_translation_result hvm_translate_get_page(
+struct vcpu *v, unsigned long addr, bool linear, uint32_t pfec,
+pagefault_info_t *pfinfo, struct page_info **page_p,
+gfn_t *gfn_p, p2m_type_t *p2mt_p)
+{
+struct page_info *page;
+p2m_type_t p2mt;
+gfn_t gfn;
+
+if ( linear )
+{
+gfn = _gfn(paging_gva_to_gfn(v, addr, &pfec));
+
+if ( gfn_eq(gfn, INVALID_GFN) )
+{
+if ( pfec & PFEC_page_paged )
+return HVMTRANS_gfn_paged_out;
+
+if ( pfec & PFEC_page_shared )
+return HVMTRANS_gfn_shared;
+
+if ( pfinfo )
+{
+pfinfo->linear = addr;
+pfinfo->ec = pfec & ~PFEC_implicit;
+}
+
+return HVMTRANS_bad_linear_to_gfn;
+}
+}
+else
+{
+gfn = gaddr_to_gfn(addr);
+ASSERT(!pfinfo);
+}
+
+/*
+ * No need to do the P2M lookup for internally handled MMIO, benefiting
+ * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses,
+ * - newer Windows (like Server 2012) for HPET accesses.
+ */
+if ( v == current
+ && !nestedhvm_vcpu_in_guestmode(v)
+ && hvm_mmio_internal(gfn_to_gaddr(gfn)) )
+return HVMTRANS_bad_gfn_to_mfn;
+
+page = get_page_from_gfn(v->domain, gfn_x(gfn), &p2mt, P2M_UNSHARE);
+
+if ( !page )
+return HVMTRANS_bad_gfn_to_mfn;
+
+if ( p2m_is_paging(p2mt) )
+{
+put_page(page);
+p2m_mem_paging_populate(v->domain, gfn_x(gfn));
+return HVMTRANS_gfn_paged_out;
+}
+if ( p2m_is_shared(p2mt) )
+{
+put_page(page);
+return HVMTRANS_gfn_shared;
+}
+if ( p2m_is_grant(p2mt) )
+{
+put_page(page);
+return HVMTRANS_unhandleable;
+}
+
+*page_p = page;
+if ( gfn_p )
+*gfn_p = gfn;
+if ( p2mt_p )
+*p2mt_p = p2mt;
+
+return HVMTRANS_okay;
+}
+
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
 #define HVMCOPY_phys   (0u<<2)
@@ -3077,7 +3154,7 @@ static enum hvm_translation_result __hvm_copy(
 void *buf, paddr_t addr, int size, struct vcpu *v, unsigned int flags,
 uint32_t pfec, pagefault_info_t *pfinfo)
 {
-unsigned long gfn;
+gfn_t gfn;
 struct page_info *page;
 p2m_type_t p2mt;
 char *p;
@@ -3103,65 +3180,15 @@ static enum hvm_translation_result __hvm_copy(
 
 while ( todo > 0 )
 {
+enum hvm_translation_result res;
 paddr_t gpa = addr & ~PAGE_MASK;
 
 count = min_t(int, PAGE_SIZE - gpa, todo);
 
-if ( flags & HVMCOPY_linear )
-{
-gfn = paging_gva_to_gfn(v, addr, &pfec);
-if ( gfn == gfn_x(INVALID_GFN) )
-{
-if ( pfec & PFEC_page_paged )
-return HVMTRANS_gfn_paged_out;
-if ( pfec & PFEC_page_shared )
-return HVMTRANS_gfn_shared;
-if ( pfinfo )
-{
-pfinfo->linear = addr;
-pfinfo->ec = pfec & ~PFEC_implicit;
-}
-return HVMTRANS_bad_linear_to_gfn;
-}
-gpa |= (paddr_t)gfn << PAGE_SHIFT;
-}
-else
-{
-gfn = addr >> PAGE_SHIFT;
-gpa = addr;
-}
-
-/*
- * No need to do the P2M lookup for internally handled MMIO, benefiting
- * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses,
- * - newer Windows (like Server 2012) for HPET accesses.
- */
-if ( v == current
- && !nestedhvm_vcpu_in_guestmode(v)
- && hvm_mmio_internal(gpa) )
-return HVMTRANS_bad_gfn_to_mfn;
-
-page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_UNSHARE);
-
-if ( !page )
-return HVMTRANS_bad_gfn_to_mfn;
-
-if ( p2m_is_paging(p2mt) )
-{
-put_page(page);
-p2m_mem_paging_populate(v->domain, gfn);
-return HVMTRANS_gfn_paged_out;
-}
-if ( p2m_is_shared(p2mt) )
-{
-put_page(page);
-return HVMTRANS_gfn_shared;
-}
-if ( p2m_is_grant(p2mt) )
-{
-put_page(page);
-return HVMTRANS_unhandleable;
-}
+res = hvm_translate_ge

[Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-20 Thread Alexandru Isaila
From: Andrew Cooper 

An access which crosses a page boundary is performed atomically by x86
hardware, albeit with a severe performance penalty.  An important corner case
is when a straddled access hits two pages which differ in whether a
translation exists, or in net access rights.

The use of hvm_copy*() in hvmemul_write() is problematic, because it performs
a translation then completes the partial write, before moving onto the next
translation.

If an individual emulated write straddles two pages, the first of which is
writable, and the second of which is not, the first half of the write will
complete before #PF is raised from the second half.

This results in guest state corruption as a side effect of emulation, which
has been observed to cause windows to crash while under introspection.

Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an
entire contents of a linear access, and vmap() the underlying frames to
provide a contiguous virtual mapping for the emulator to use.  This is the
same mechanism as used by the shadow emulation code.

This will catch any translation issues and abort the emulation before any
modifications occur.

Signed-off-by: Andrew Cooper 
Signed-off-by: Alexandru Isaila 

---
Changes since V3:
- Changed the if condition form ... > ... -1 to ... >= ...
- Removed long cast to the err var
- Moved the mfn++ closer to the end of the loop
- Integrated the ++frame in the while condition
---
 xen/arch/x86/hvm/emulate.c| 175 ++
 xen/include/asm-x86/hvm/emulate.h |   7 ++
 2 files changed, 165 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index cc874ce..9866d64 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -498,6 +498,157 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
 }
 
 /*
+ * Map the frame(s) covering an individual linear access, for writeable
+ * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
+ * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
+ *
+ * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is
+ * clean before use, and poisions unused slots with INVALID_MFN.
+ */
+static void *hvmemul_map_linear_addr(
+unsigned long linear, unsigned int bytes, uint32_t pfec,
+struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+struct vcpu *curr = current;
+void *err, *mapping;
+
+/* First and final gfns which need mapping. */
+unsigned long frame = linear >> PAGE_SHIFT, first = frame;
+unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
+
+/*
+ * mfn points to the next free slot.  All used slots have a page reference
+ * held on them.
+ */
+mfn_t *mfn = &hvmemul_ctxt->mfn[0];
+
+/*
+ * The caller has no legitimate reason for trying a zero-byte write, but
+ * final is calculate to fail safe in release builds.
+ *
+ * The maximum write size depends on the number of adjacent mfns[] which
+ * can be vmap()'d, accouting for possible misalignment within the region.
+ * The higher level emulation callers are responsible for ensuring that
+ * mfns[] is large enough for the requested write size.
+ */
+if ( bytes == 0 ||
+ final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) )
+{
+ASSERT_UNREACHABLE();
+goto unhandleable;
+}
+
+do {
+enum hvm_translation_result res;
+struct page_info *page;
+pagefault_info_t pfinfo;
+p2m_type_t p2mt;
+
+/* Error checking.  Confirm that the current slot is clean. */
+ASSERT(mfn_x(*mfn) == 0);
+
+res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
+ &pfinfo, &page, NULL, &p2mt);
+
+switch ( res )
+{
+case HVMTRANS_okay:
+break;
+
+case HVMTRANS_bad_linear_to_gfn:
+x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
+err = ERR_PTR(~X86EMUL_EXCEPTION);
+goto out;
+
+case HVMTRANS_bad_gfn_to_mfn:
+err = NULL;
+goto out;
+
+case HVMTRANS_gfn_paged_out:
+case HVMTRANS_gfn_shared:
+err = ERR_PTR(~X86EMUL_RETRY);
+goto out;
+
+default:
+goto unhandleable;
+}
+
+if ( p2m_is_discard_write(p2mt) )
+{
+err = ERR_PTR(~X86EMUL_OKAY);
+goto out;
+}
+
+*mfn++ = _mfn(page_to_mfn(page));
+
+} while ( ++frame < final );
+
+/* Entire access within a single frame? */
+if ( first == final )
+mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
+/* Multiple frames? Need to vmap(). */
+else if ( (mapping = vmap(hvmemul_ctxt->mfn,
+  final - first + 1)) == NULL )
+goto unhandleable;
+
+#ifndef NDEBUG /* Poision unused mfn

[Xen-devel] [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result

2017-09-20 Thread Alexandru Isaila
From: Andrew Cooper 

Signed-off-by: Andrew Cooper 
Acked-by: Tim Deegan 
Acked-by: Jan Beulich 
Reviewed-by: Kevin Tian 
Acked-by: George Dunlap 
---
 xen/arch/x86/hvm/dom0_build.c |  2 +-
 xen/arch/x86/hvm/emulate.c| 40 ++--
 xen/arch/x86/hvm/hvm.c| 56 +++
 xen/arch/x86/hvm/intercept.c  | 20 +++---
 xen/arch/x86/hvm/svm/nestedsvm.c  |  5 ++--
 xen/arch/x86/hvm/svm/svm.c|  2 +-
 xen/arch/x86/hvm/viridian.c   |  2 +-
 xen/arch/x86/hvm/vmsi.c   |  2 +-
 xen/arch/x86/hvm/vmx/realmode.c   |  2 +-
 xen/arch/x86/hvm/vmx/vvmx.c   | 14 +-
 xen/arch/x86/mm/shadow/common.c   | 12 -
 xen/common/libelf/libelf-loader.c |  4 +--
 xen/include/asm-x86/hvm/support.h | 40 ++--
 13 files changed, 101 insertions(+), 100 deletions(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 020c355..e8f746c 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -238,7 +238,7 @@ static int __init pvh_setup_vmx_realmode_helpers(struct 
domain *d)
 if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), &gaddr) )
 {
 if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) !=
- HVMCOPY_okay )
+ HVMTRANS_okay )
 printk("Unable to zero VM86 TSS area\n");
 d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] =
 VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr;
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 54811c1..cc874ce 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -100,7 +100,7 @@ static int ioreq_server_read(const struct hvm_io_handler 
*io_handler,
 uint32_t size,
 uint64_t *data)
 {
-if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay )
+if ( hvm_copy_from_guest_phys(data, addr, size) != HVMTRANS_okay )
 return X86EMUL_UNHANDLEABLE;
 
 return X86EMUL_OKAY;
@@ -893,18 +893,18 @@ static int __hvmemul_read(
 
 switch ( rc )
 {
-case HVMCOPY_okay:
+case HVMTRANS_okay:
 break;
-case HVMCOPY_bad_gva_to_gfn:
+case HVMTRANS_bad_linear_to_gfn:
 x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
 return X86EMUL_EXCEPTION;
-case HVMCOPY_bad_gfn_to_mfn:
+case HVMTRANS_bad_gfn_to_mfn:
 if ( access_type == hvm_access_insn_fetch )
 return X86EMUL_UNHANDLEABLE;
 
 return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, 
hvmemul_ctxt, 0);
-case HVMCOPY_gfn_paged_out:
-case HVMCOPY_gfn_shared:
+case HVMTRANS_gfn_paged_out:
+case HVMTRANS_gfn_shared:
 return X86EMUL_RETRY;
 default:
 return X86EMUL_UNHANDLEABLE;
@@ -1012,15 +1012,15 @@ static int hvmemul_write(
 
 switch ( rc )
 {
-case HVMCOPY_okay:
+case HVMTRANS_okay:
 break;
-case HVMCOPY_bad_gva_to_gfn:
+case HVMTRANS_bad_linear_to_gfn:
 x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
 return X86EMUL_EXCEPTION;
-case HVMCOPY_bad_gfn_to_mfn:
+case HVMTRANS_bad_gfn_to_mfn:
 return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, 
hvmemul_ctxt, 0);
-case HVMCOPY_gfn_paged_out:
-case HVMCOPY_gfn_shared:
+case HVMTRANS_gfn_paged_out:
+case HVMTRANS_gfn_shared:
 return X86EMUL_RETRY;
 default:
 return X86EMUL_UNHANDLEABLE;
@@ -1384,7 +1384,7 @@ static int hvmemul_rep_movs(
 return rc;
 }
 
-rc = HVMCOPY_okay;
+rc = HVMTRANS_okay;
 }
 else
 /*
@@ -1394,16 +1394,16 @@ static int hvmemul_rep_movs(
  */
 rc = hvm_copy_from_guest_phys(buf, sgpa, bytes);
 
-if ( rc == HVMCOPY_okay )
+if ( rc == HVMTRANS_okay )
 rc = hvm_copy_to_guest_phys(dgpa, buf, bytes, current);
 
 xfree(buf);
 
-if ( rc == HVMCOPY_gfn_paged_out )
+if ( rc == HVMTRANS_gfn_paged_out )
 return X86EMUL_RETRY;
-if ( rc == HVMCOPY_gfn_shared )
+if ( rc == HVMTRANS_gfn_shared )
 return X86EMUL_RETRY;
-if ( rc != HVMCOPY_okay )
+if ( rc != HVMTRANS_okay )
 {
 gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%"
  PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n",
@@ -1513,10 +1513,10 @@ static int hvmemul_rep_stos(
 
 switch ( rc )
 {
-case HVMCOPY_gfn_paged_out:
-case HVMCOPY_gfn_shared:
+case HVMTRANS_gfn_paged_out:
+case HVMTRANS_gfn_shared:
 return X86EMUL_RETRY;
-case HVMCOPY_okay:
+case HVMTRANS_okay:
 return X86EMUL_OKAY;
 }
 
@@ -2172,7 +2172,7 @@ void hvm_emulate_init_per_insn(
 &addr) &&
  hvm_fetch_from_guest_linear(hvmemul_ctxt

Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 08:34,  wrote:
> @@ -3381,75 +3425,21 @@ grant_table_create(
>  struct domain *d)
>  {
>  struct grant_table *t;
> -unsigned int i, j;
>  
>  if ( (t = xzalloc(struct grant_table)) == NULL )
> -goto no_mem_0;
> +return -ENOMEM;
>  
>  /* Simple stuff. */
>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>  spin_lock_init(&t->maptrack_lock);
> -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES;
> -
> -/* Active grant table. */
> -if ( (t->active = xzalloc_array(struct active_grant_entry *,
> -max_nr_active_grant_frames)) == NULL )
> -goto no_mem_1;
> -for ( i = 0;
> -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ )
> -{
> -if ( (t->active[i] = alloc_xenheap_page()) == NULL )
> -goto no_mem_2;
> -clear_page(t->active[i]);
> -for ( j = 0; j < ACGNT_PER_PAGE; j++ )
> -spin_lock_init(&t->active[i][j].lock);
> -}
> -
> -/* Tracking of mapped foreign frames table */
> -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack));
> -if ( t->maptrack == NULL )
> -goto no_mem_2;
> -
> -/* Shared grant table. */
> -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == NULL )
> -goto no_mem_3;
> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
> -{
> -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL )
> -goto no_mem_4;
> -clear_page(t->shared_raw[i]);
> -}
> -
> -/* Status pages for grant table - for version 2 */
> -t->status = xzalloc_array(grant_status_t *,
> -  grant_to_status_frames(max_grant_frames));
> -if ( t->status == NULL )
> -goto no_mem_4;
> -
> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
> -gnttab_create_shared_page(d, t, i);
> -
> -t->nr_status_frames = 0;
>  
>  /* Okay, install the structure. */
>  d->grant_table = t;
> -return 0;
>  
> - no_mem_4:
> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
> -free_xenheap_page(t->shared_raw[i]);
> -xfree(t->shared_raw);
> - no_mem_3:
> -vfree(t->maptrack);
> - no_mem_2:
> -for ( i = 0;
> -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ )
> -free_xenheap_page(t->active[i]);
> -xfree(t->active);
> - no_mem_1:
> -xfree(t);
> - no_mem_0:
> -return -ENOMEM;
> +if ( d->domain_id == 0 )
> +return grant_table_init(t);
> +
> +return 0;
>  }
>  
>  void
> @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, unsigned 
> int grant_frames,
>  
>  grant_write_lock(gt);
>  
> -ret = 0;
> -/* Set limits, alloc needed arrays. */
> +/* Set limits. */
> +if ( !gt->active )
> +ret = grant_table_init(gt);
>  
>  grant_write_unlock(gt);

These changes don't leave the domains in a state similar to that
before the change - I'm missing calls to gnttab_grow_table() to
establish the minimal sizes. Aiui so far there has been no
requirement for a domain to invoke GNTTABOP_setup_table if it
is happy with v1 and the minimum size.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 06/15] x86: implement get value interface for MBA

2017-09-20 Thread Yi Sun
On 17-09-20 09:43:40, Roger Pau Monn� wrote:
> On Wed, Sep 20, 2017 at 01:09:06PM +0800, Yi Sun wrote:
> > On 17-09-19 10:15:42, Roger Pau Monn� wrote:
> > > On Tue, Sep 05, 2017 at 05:32:28PM +0800, Yi Sun wrote:
> > > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> > > > index 696eff2..7902af7 100644
> > > > --- a/xen/arch/x86/domctl.c
> > > > +++ b/xen/arch/x86/domctl.c
> > > > @@ -1496,6 +1496,13 @@ long arch_do_domctl(
> > > >  copyback = true;
> > > >  break;
> > > >  
> > > > +case XEN_DOMCTL_PSR_ALLOC_GET_MBA_THRTL:
> > > > +ret = psr_get_val(d, domctl->u.psr_alloc.target,
> > > > +  &val32, PSR_TYPE_MBA_THRTL);
> > > > +domctl->u.psr_alloc.data = val32;
> > > 
> > > Hm, why does psr_get_val take a uint32_t * instead of a uint64_t *? So
> > > that you can directly pass &domctl->u.psr_alloc.data.
> > > 
> > > Or the other way around, why is domctl->u.psr_alloc.data a uint64_t
> > > instead of a uint32_t?
> > > 
> > There is a historical reason. The COS MSR is 64bit. So, the original codes
> > in L3 CAT (submitted years ago) used uint64_t.
> > 
> > But during L2 CAT review, per Jan's comment, the uint64_t is not necessary
> > in psr.c. So, we convert it to uint32_t in psr.c and make the codes you see
> > here.
> 
> Since this is a DOMCTL, you can change the type of the structure to be
> an uint32_t, so that you can pass it directly (unless I'm missing
> something else that requires this to be uint64_t).
> 
The tools layer implementation uses uint64_t according to SDM definition. So,
we have to do a convertion here or in tools/. Or, we need modify interfaces in
tools layer.

As CAT interfaces in tools layer have been released and it is a natural choice
to make it same as SDM, I think should keep uint64_t in tools at least. If so,
we must do the convertion somewhere. Then, I think it is not so necessary to
change current codes. Is that good to you?

> Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 12/15] xen/arm: move arch specific grant table bits into grant_table.c

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 08:34,  wrote:
> Instead of attaching the ARM specific grant table data to the domain
> structure add it to struct grant_table. Add the needed arch functions
> to the asm-*/grant_table.h includes.
> 
> Signed-off-by: Juergen Gross 
> Reviewed-by: Paul Durrant 

Non-ARM parts
Acked-by: Jan Beulich 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] minimum Python version

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 10:49,  wrote:
> What about the following idea: maybe we could add an option to configure
> to make it fail instead of disabling components in case some
> requirements for that component are not met? The default could be to
> just disable the components and with the option specified all components
> not disabled explicitly must be buildable or configure will fail.

If that's something that (a) can be reasonably expressed in
configure.ac and (b) the tools maintainers can live with, that's
certainly an option.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-20 Thread Paul Durrant
> -Original Message-
> From: Alexandru Isaila [mailto:aisa...@bitdefender.com]
> Sent: 20 September 2017 10:23
> To: xen-devel@lists.xen.org
> Cc: Tim (Xen.org) ; George Dunlap
> ; jbeul...@suse.com; Andrew Cooper
> ; Ian Jackson ;
> konrad.w...@oracle.com; sstabell...@kernel.org; Wei Liu
> ; Paul Durrant ;
> boris.ostrov...@oracle.com; suravee.suthikulpa...@amd.com;
> jun.nakaj...@intel.com; Kevin Tian ; Alexandru Isaila
> 
> Subject: [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real
> mappings
> 
> From: Andrew Cooper 
> 
> An access which crosses a page boundary is performed atomically by x86
> hardware, albeit with a severe performance penalty.  An important corner
> case
> is when a straddled access hits two pages which differ in whether a
> translation exists, or in net access rights.
> 
> The use of hvm_copy*() in hvmemul_write() is problematic, because it
> performs
> a translation then completes the partial write, before moving onto the next
> translation.
> 
> If an individual emulated write straddles two pages, the first of which is
> writable, and the second of which is not, the first half of the write will
> complete before #PF is raised from the second half.
> 
> This results in guest state corruption as a side effect of emulation, which
> has been observed to cause windows to crash while under introspection.
> 
> Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an
> entire contents of a linear access, and vmap() the underlying frames to
> provide a contiguous virtual mapping for the emulator to use.  This is the
> same mechanism as used by the shadow emulation code.
> 
> This will catch any translation issues and abort the emulation before any
> modifications occur.
> 
> Signed-off-by: Andrew Cooper 
> Signed-off-by: Alexandru Isaila 
> 
> ---
> Changes since V3:
>   - Changed the if condition form ... > ... -1 to ... >= ...
>   - Removed long cast to the err var
>   - Moved the mfn++ closer to the end of the loop
>   - Integrated the ++frame in the while condition
> ---
>  xen/arch/x86/hvm/emulate.c| 175
> ++
>  xen/include/asm-x86/hvm/emulate.h |   7 ++
>  2 files changed, 165 insertions(+), 17 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index cc874ce..9866d64 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -498,6 +498,157 @@ static int hvmemul_do_mmio_addr(paddr_t
> mmio_gpa,
>  }
> 
>  /*
> + * Map the frame(s) covering an individual linear access, for writeable
> + * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other
> errors
> + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
> + *
> + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is
> + * clean before use, and poisions unused slots with INVALID_MFN.
> + */
> +static void *hvmemul_map_linear_addr(
> +unsigned long linear, unsigned int bytes, uint32_t pfec,
> +struct hvm_emulate_ctxt *hvmemul_ctxt)
> +{
> +struct vcpu *curr = current;
> +void *err, *mapping;
> +
> +/* First and final gfns which need mapping. */
> +unsigned long frame = linear >> PAGE_SHIFT, first = frame;
> +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
> +
> +/*
> + * mfn points to the next free slot.  All used slots have a page 
> reference
> + * held on them.
> + */
> +mfn_t *mfn = &hvmemul_ctxt->mfn[0];
> +
> +/*
> + * The caller has no legitimate reason for trying a zero-byte write, but
> + * final is calculate to fail safe in release builds.
> + *
> + * The maximum write size depends on the number of adjacent mfns[]
> which
> + * can be vmap()'d, accouting for possible misalignment within the 
> region.
> + * The higher level emulation callers are responsible for ensuring that
> + * mfns[] is large enough for the requested write size.
> + */
> +if ( bytes == 0 ||
> + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) )
> +{
> +ASSERT_UNREACHABLE();
> +goto unhandleable;
> +}
> +
> +do {
> +enum hvm_translation_result res;
> +struct page_info *page;
> +pagefault_info_t pfinfo;
> +p2m_type_t p2mt;
> +
> +/* Error checking.  Confirm that the current slot is clean. */
> +ASSERT(mfn_x(*mfn) == 0);
> +
> +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
> + &pfinfo, &page, NULL, &p2mt);
> +
> +switch ( res )
> +{
> +case HVMTRANS_okay:
> +break;
> +
> +case HVMTRANS_bad_linear_to_gfn:
> +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, 
> &hvmemul_ctxt->ctxt);
> +err = ERR_PTR(~X86EMUL_EXCEPTION);
> +goto out;
> +
> +case HVMTRANS_bad_gfn_to_mfn:
> +err = NULL;
> +goto out;
> +
> +

Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 10:58,  wrote:
> On 20/09/17 10:57, Jan Beulich wrote:
> On 20.09.17 at 08:34,  wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct 
> cpu_user_regs *regs)
>>>  return 0;
>>>  }
>>>  
>>> +unsigned long arch_get_upper_mfn_bound(void)
>>> +{
>>> +unsigned long max_mfn;
>>> +
>>> +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page;
>> 
>> Taking into account the code in the caller of this function as well
>> as the ARM counterpart I find the use of max_page here odd. I'd
>> prefer if get_upper_mfn_bound() went away altogether, and it's
>> sole caller (which strangely enough doesn't get introduced here)
>> called the arch function directly. Additionally, with the caller being
>> a sysctl, how is that supposed to help a PV DomU kernel in their
>> choice of grant table version?
> 
> Did you look at patch 15?

Not yet, no (I had looked over the titles, but this one's didn't
make me make the connection). So yes, that addresses the PV
DomU concern. Still I'd like to get away without the thin common
wrapper around the arch specific actual implementation (and I
don't care much whether the resulting function has an arch_
prefix).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Juergen Gross
On 20/09/17 11:32, Jan Beulich wrote:
 On 20.09.17 at 10:58,  wrote:
>> On 20/09/17 10:57, Jan Beulich wrote:
>> On 20.09.17 at 08:34,  wrote:
 --- a/xen/arch/x86/mm.c
 +++ b/xen/arch/x86/mm.c
 @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct 
>> cpu_user_regs *regs)
  return 0;
  }
  
 +unsigned long arch_get_upper_mfn_bound(void)
 +{
 +unsigned long max_mfn;
 +
 +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page;
>>>
>>> Taking into account the code in the caller of this function as well
>>> as the ARM counterpart I find the use of max_page here odd. I'd
>>> prefer if get_upper_mfn_bound() went away altogether, and it's
>>> sole caller (which strangely enough doesn't get introduced here)
>>> called the arch function directly. Additionally, with the caller being
>>> a sysctl, how is that supposed to help a PV DomU kernel in their
>>> choice of grant table version?
>>
>> Did you look at patch 15?
> 
> Not yet, no (I had looked over the titles, but this one's didn't
> make me make the connection). So yes, that addresses the PV
> DomU concern. Still I'd like to get away without the thin common
> wrapper around the arch specific actual implementation (and I
> don't care much whether the resulting function has an arch_
> prefix).

Okay, I'll remove the common wrapper and drop the arch_ from the x86 and
arm variants.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 11:22,  wrote:
> From: Andrew Cooper 
> 
> Signed-off-by: Andrew Cooper 
> Acked-by: Tim Deegan 
> Acked-by: Jan Beulich 
> Reviewed-by: Kevin Tian 
> Acked-by: George Dunlap 

I'm sorry to say that, but this time you've lost Paul's R-b. Please
pay more attention to such aspects in the future.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures

2017-09-20 Thread Juergen Gross
On 20/09/17 11:22, Jan Beulich wrote:
 On 20.09.17 at 08:34,  wrote:
>> @@ -3381,75 +3425,21 @@ grant_table_create(
>>  struct domain *d)
>>  {
>>  struct grant_table *t;
>> -unsigned int i, j;
>>  
>>  if ( (t = xzalloc(struct grant_table)) == NULL )
>> -goto no_mem_0;
>> +return -ENOMEM;
>>  
>>  /* Simple stuff. */
>>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>>  spin_lock_init(&t->maptrack_lock);
>> -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES;
>> -
>> -/* Active grant table. */
>> -if ( (t->active = xzalloc_array(struct active_grant_entry *,
>> -max_nr_active_grant_frames)) == NULL )
>> -goto no_mem_1;
>> -for ( i = 0;
>> -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ )
>> -{
>> -if ( (t->active[i] = alloc_xenheap_page()) == NULL )
>> -goto no_mem_2;
>> -clear_page(t->active[i]);
>> -for ( j = 0; j < ACGNT_PER_PAGE; j++ )
>> -spin_lock_init(&t->active[i][j].lock);
>> -}
>> -
>> -/* Tracking of mapped foreign frames table */
>> -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack));
>> -if ( t->maptrack == NULL )
>> -goto no_mem_2;
>> -
>> -/* Shared grant table. */
>> -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == NULL )
>> -goto no_mem_3;
>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
>> -{
>> -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL )
>> -goto no_mem_4;
>> -clear_page(t->shared_raw[i]);
>> -}
>> -
>> -/* Status pages for grant table - for version 2 */
>> -t->status = xzalloc_array(grant_status_t *,
>> -  grant_to_status_frames(max_grant_frames));
>> -if ( t->status == NULL )
>> -goto no_mem_4;
>> -
>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
>> -gnttab_create_shared_page(d, t, i);
>> -
>> -t->nr_status_frames = 0;
>>  
>>  /* Okay, install the structure. */
>>  d->grant_table = t;
>> -return 0;
>>  
>> - no_mem_4:
>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
>> -free_xenheap_page(t->shared_raw[i]);
>> -xfree(t->shared_raw);
>> - no_mem_3:
>> -vfree(t->maptrack);
>> - no_mem_2:
>> -for ( i = 0;
>> -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ )
>> -free_xenheap_page(t->active[i]);
>> -xfree(t->active);
>> - no_mem_1:
>> -xfree(t);
>> - no_mem_0:
>> -return -ENOMEM;
>> +if ( d->domain_id == 0 )
>> +return grant_table_init(t);
>> +
>> +return 0;
>>  }
>>  
>>  void
>> @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, unsigned 
>> int grant_frames,
>>  
>>  grant_write_lock(gt);
>>  
>> -ret = 0;
>> -/* Set limits, alloc needed arrays. */
>> +/* Set limits. */
>> +if ( !gt->active )
>> +ret = grant_table_init(gt);
>>  
>>  grant_write_unlock(gt);
> 
> These changes don't leave the domains in a state similar to that
> before the change - I'm missing calls to gnttab_grow_table() to
> establish the minimal sizes. Aiui so far there has been no
> requirement for a domain to invoke GNTTABOP_setup_table if it
> is happy with v1 and the minimum size.

So you don't like gnttab_grow_table() being called when the guest tries
to map the grant frames in this case?

I can add the call of gnttab_grow_table() to grant_table_set_limits() in
case you like that better.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 11:44,  wrote:
> On 20/09/17 11:22, Jan Beulich wrote:
> On 20.09.17 at 08:34,  wrote:
>>> @@ -3381,75 +3425,21 @@ grant_table_create(
>>>  struct domain *d)
>>>  {
>>>  struct grant_table *t;
>>> -unsigned int i, j;
>>>  
>>>  if ( (t = xzalloc(struct grant_table)) == NULL )
>>> -goto no_mem_0;
>>> +return -ENOMEM;
>>>  
>>>  /* Simple stuff. */
>>>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>>>  spin_lock_init(&t->maptrack_lock);
>>> -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES;
>>> -
>>> -/* Active grant table. */
>>> -if ( (t->active = xzalloc_array(struct active_grant_entry *,
>>> -max_nr_active_grant_frames)) == NULL )
>>> -goto no_mem_1;
>>> -for ( i = 0;
>>> -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ 
>>> )
>>> -{
>>> -if ( (t->active[i] = alloc_xenheap_page()) == NULL )
>>> -goto no_mem_2;
>>> -clear_page(t->active[i]);
>>> -for ( j = 0; j < ACGNT_PER_PAGE; j++ )
>>> -spin_lock_init(&t->active[i][j].lock);
>>> -}
>>> -
>>> -/* Tracking of mapped foreign frames table */
>>> -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack));
>>> -if ( t->maptrack == NULL )
>>> -goto no_mem_2;
>>> -
>>> -/* Shared grant table. */
>>> -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == NULL 
>>> )
>>> -goto no_mem_3;
>>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
>>> -{
>>> -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL )
>>> -goto no_mem_4;
>>> -clear_page(t->shared_raw[i]);
>>> -}
>>> -
>>> -/* Status pages for grant table - for version 2 */
>>> -t->status = xzalloc_array(grant_status_t *,
>>> -  grant_to_status_frames(max_grant_frames));
>>> -if ( t->status == NULL )
>>> -goto no_mem_4;
>>> -
>>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
>>> -gnttab_create_shared_page(d, t, i);
>>> -
>>> -t->nr_status_frames = 0;
>>>  
>>>  /* Okay, install the structure. */
>>>  d->grant_table = t;
>>> -return 0;
>>>  
>>> - no_mem_4:
>>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
>>> -free_xenheap_page(t->shared_raw[i]);
>>> -xfree(t->shared_raw);
>>> - no_mem_3:
>>> -vfree(t->maptrack);
>>> - no_mem_2:
>>> -for ( i = 0;
>>> -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ 
>>> )
>>> -free_xenheap_page(t->active[i]);
>>> -xfree(t->active);
>>> - no_mem_1:
>>> -xfree(t);
>>> - no_mem_0:
>>> -return -ENOMEM;
>>> +if ( d->domain_id == 0 )
>>> +return grant_table_init(t);
>>> +
>>> +return 0;
>>>  }
>>>  
>>>  void
>>> @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, unsigned 
>>> int grant_frames,
>>>  
>>>  grant_write_lock(gt);
>>>  
>>> -ret = 0;
>>> -/* Set limits, alloc needed arrays. */
>>> +/* Set limits. */
>>> +if ( !gt->active )
>>> +ret = grant_table_init(gt);
>>>  
>>>  grant_write_unlock(gt);
>> 
>> These changes don't leave the domains in a state similar to that
>> before the change - I'm missing calls to gnttab_grow_table() to
>> establish the minimal sizes. Aiui so far there has been no
>> requirement for a domain to invoke GNTTABOP_setup_table if it
>> is happy with v1 and the minimum size.
> 
> So you don't like gnttab_grow_table() being called when the guest tries
> to map the grant frames in this case?

I don't like it being called _only_ in that case.

> I can add the call of gnttab_grow_table() to grant_table_set_limits() in
> case you like that better.

And to grant_table_create() for the Dom0 case.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures

2017-09-20 Thread Juergen Gross
On 20/09/17 12:02, Jan Beulich wrote:
 On 20.09.17 at 11:44,  wrote:
>> On 20/09/17 11:22, Jan Beulich wrote:
>> On 20.09.17 at 08:34,  wrote:
 @@ -3381,75 +3425,21 @@ grant_table_create(
  struct domain *d)
  {
  struct grant_table *t;
 -unsigned int i, j;
  
  if ( (t = xzalloc(struct grant_table)) == NULL )
 -goto no_mem_0;
 +return -ENOMEM;
  
  /* Simple stuff. */
  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
  spin_lock_init(&t->maptrack_lock);
 -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES;
 -
 -/* Active grant table. */
 -if ( (t->active = xzalloc_array(struct active_grant_entry *,
 -max_nr_active_grant_frames)) == NULL )
 -goto no_mem_1;
 -for ( i = 0;
 -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); 
 i++ )
 -{
 -if ( (t->active[i] = alloc_xenheap_page()) == NULL )
 -goto no_mem_2;
 -clear_page(t->active[i]);
 -for ( j = 0; j < ACGNT_PER_PAGE; j++ )
 -spin_lock_init(&t->active[i][j].lock);
 -}
 -
 -/* Tracking of mapped foreign frames table */
 -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack));
 -if ( t->maptrack == NULL )
 -goto no_mem_2;
 -
 -/* Shared grant table. */
 -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == 
 NULL )
 -goto no_mem_3;
 -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
 -{
 -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL )
 -goto no_mem_4;
 -clear_page(t->shared_raw[i]);
 -}
 -
 -/* Status pages for grant table - for version 2 */
 -t->status = xzalloc_array(grant_status_t *,
 -  grant_to_status_frames(max_grant_frames));
 -if ( t->status == NULL )
 -goto no_mem_4;
 -
 -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
 -gnttab_create_shared_page(d, t, i);
 -
 -t->nr_status_frames = 0;
  
  /* Okay, install the structure. */
  d->grant_table = t;
 -return 0;
  
 - no_mem_4:
 -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ )
 -free_xenheap_page(t->shared_raw[i]);
 -xfree(t->shared_raw);
 - no_mem_3:
 -vfree(t->maptrack);
 - no_mem_2:
 -for ( i = 0;
 -  i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); 
 i++ )
 -free_xenheap_page(t->active[i]);
 -xfree(t->active);
 - no_mem_1:
 -xfree(t);
 - no_mem_0:
 -return -ENOMEM;
 +if ( d->domain_id == 0 )
 +return grant_table_init(t);
 +
 +return 0;
  }
  
  void
 @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, 
 unsigned int grant_frames,
  
  grant_write_lock(gt);
  
 -ret = 0;
 -/* Set limits, alloc needed arrays. */
 +/* Set limits. */
 +if ( !gt->active )
 +ret = grant_table_init(gt);
  
  grant_write_unlock(gt);
>>>
>>> These changes don't leave the domains in a state similar to that
>>> before the change - I'm missing calls to gnttab_grow_table() to
>>> establish the minimal sizes. Aiui so far there has been no
>>> requirement for a domain to invoke GNTTABOP_setup_table if it
>>> is happy with v1 and the minimum size.
>>
>> So you don't like gnttab_grow_table() being called when the guest tries
>> to map the grant frames in this case?
> 
> I don't like it being called _only_ in that case.
> 
>> I can add the call of gnttab_grow_table() to grant_table_set_limits() in
>> case you like that better.
> 
> And to grant_table_create() for the Dom0 case.

So in the end: to grant_table_init().


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation

2017-09-20 Thread George Dunlap
On 09/20/2017 10:19 AM, George Dunlap wrote:
> On 09/19/2017 03:11 AM, Dario Faggioli wrote:
>> In commit ad4b3e1e9df34 ("xen: credit2: implement
>> utilization cap") xfree() was being called (for
>> deallocating the budget replenishment timer, during
>> domain destruction) inside an IRQ disabled critical
>> section.
>>
>> That must not happen, as it uses the mem-pool's lock,
>> which needs to be taken with IRQ enabled. And, in fact,
>> we crash (in debug builds):
>>
>> (XEN) 
>> (XEN) Panic on CPU 0:
>> (XEN) Xen BUG at spinlock.c:47
>> (XEN) 
>>
>> Let's, therefore, kill and deallocate the timer outside of
>> the critical sections, when IRQs are enabled.
>>
>> Signed-off-by: Dario Faggioli 
>> ---
>> Cc: osstest service owner 
>> Cc: George Dunlap 
>> Cc: Wei Liu 
>> ---
>> This was spotted by OSSTest's flight 113562:
>>
>>  http://logs.test-lab.xenproject.org/osstest/logs/113562/
>>  
>> http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log
>> ---
>>  xen/common/sched_credit2.c |6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
>> index 32234ac..7a550db 100644
>> --- a/xen/common/sched_credit2.c
>> +++ b/xen/common/sched_credit2.c
>> @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct scheduler *ops, 
>> void *data)
>>  
>>  write_lock_irqsave(&prv->lock, flags);
>>  
>> -kill_timer(sdom->repl_timer);
>> -xfree(sdom->repl_timer);
>> -
>>  list_del_init(&sdom->sdom_elem);
>>  
>>  write_unlock_irqrestore(&prv->lock, flags);
>>  
>> +kill_timer(sdom->repl_timer);
>> +xfree(sdom->repl_timer);
> 
> Any particular reason for moving the kill_timer() as well as the xfree
> outside the lock?  What happens if the timer goes off after the
> irqrestore but before the kill_timer?

Looks like if the timer fires, nothing terribly bad will happen; it will
just do a useless budget replenishment.

It looks like kill_timer() disables irqs, so it could be moved inside
the critical section.  I'm inclined to say we should do so.  I don't
anticipate the budget replenishment ever to need to walk the domain
list, but should that change, this would be a bear of a bug to find.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 08:34,  wrote:
> --- a/xen/common/compat/grant_table.c
> +++ b/xen/common/compat/grant_table.c
> @@ -157,21 +157,14 @@ int compat_grant_table_op(unsigned int cmd,
>  unsigned int max_frame_list_size_in_page =
>  (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.setup)) /
>  sizeof(*nat.setup->frame_list.p);
> -if ( max_frame_list_size_in_page < max_grant_frames )

The latest here, but perhaps even earlier I think max_grant_frames
should become static, so one can be reasonably certain that all other
references are gone.

> @@ -290,8 +293,8 @@ num_act_frames_from_sha_frames(const unsigned int num)
>  return DIV_ROUND_UP(num * sha_per_page, ACGNT_PER_PAGE);
>  }
>  
> -#define max_nr_active_grant_frames \
> -num_act_frames_from_sha_frames(max_grant_frames)
> +#define max_nr_active_grant_frames(gt) \
> +num_act_frames_from_sha_frames(gt->max_grant_frames)

Parentheses around gt please.

> @@ -1718,8 +1724,9 @@ gnttab_grow_table(struct domain *d, unsigned int 
> req_nr_frames)
>  ASSERT(gt->active);
>  
>  if ( req_nr_frames < INITIAL_NR_GRANT_FRAMES )
> -req_nr_frames = INITIAL_NR_GRANT_FRAMES;
> -ASSERT(req_nr_frames <= max_grant_frames);
> +req_nr_frames = min_t(unsigned int, INITIAL_NR_GRANT_FRAMES,
> +gt->max_grant_frames);

Perhaps give the INITIAL_NR_GRANT_FRAMES value a U suffix
instead of using min_t() here?

> @@ -1777,13 +1784,15 @@ active_alloc_failed:
>  
>  static long
>  gnttab_setup_table(
> -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count)
> +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count,
> +unsigned int limit_max)
>  {
>  struct vcpu *curr = current;
>  struct gnttab_setup_table op;
>  struct domain *d = NULL;
>  struct grant_table *gt;
>  unsigned int i;
> +long ret = 0;

Wouldn't int suffice here?

> @@ -1819,6 +1819,21 @@ gnttab_setup_table(
>  gt = d->grant_table;
>  grant_write_lock(gt);
>  
> +if ( unlikely(op.nr_frames > gt->max_grant_frames) )
> +{
> +gdprintk(XENLOG_INFO, "Domain is limited to %d grant-table 
> frames.\n",

%u and you also want to log the subject domain ID (which may not
be current's; same for the other log message below as well as the
similar one in gnttab_get_status_frames()).

> +gt->max_grant_frames);
> +op.status = GNTST_general_error;
> +goto unlock;
> +}
> +if ( unlikely(limit_max < gt->max_grant_frames) )

With the check moved here it can be relaxed afaict: Code below
only writes op.nr_frames entries (same then again for
gnttab_get_status_frames()).

> @@ -1852,10 +1867,10 @@ gnttab_setup_table(
>  if ( d )
>  rcu_unlock_domain(d);
>  
> -if ( unlikely(__copy_field_to_guest(uop, &op, status)) )
> +if ( !ret && unlikely(__copy_field_to_guest(uop, &op, status)) )

I wonder whether it wouldn't be better to switch that check
producing -EINVAL to also report the failure in op.status, now
that it lives here (same then for gnttab_get_status_frames()
once again).

>  static long
>  gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) 
> uop,
> - int count)
> + int count, unsigned int limit_max)

Take the opportunity and switch count to unsigned int?

> @@ -3320,7 +3347,7 @@ do_grant_table_op(
>  
>  case GNTTABOP_setup_table:
>  rc = gnttab_setup_table(
> -guest_handle_cast(uop, gnttab_setup_table_t), count);
> +guest_handle_cast(uop, gnttab_setup_table_t), count, ~0);

UINT_MAX?

> @@ -3442,6 +3469,8 @@ grant_table_create(
>  /* Simple stuff. */
>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>  spin_lock_init(&t->maptrack_lock);
> +t->max_grant_frames = max_grant_frames;
> +t->max_maptrack_frames = max_maptrack_frames;

This together with ...

> @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, unsigned 
> int grant_frames,
>  
>  /* Set limits. */
>  if ( !gt->active )
> +{
> +gt->max_grant_frames = grant_frames;
> +gt->max_maptrack_frames = maptrack_frames;
>  ret = grant_table_init(gt);
> +}

.. this raises the question of whether it is legal to decrease the
limits. There may be code depending on it only ever growing.
Additionally to take the input values without applying some
upper cap - while we have XSA-77, we still shouldn't introduce
new issues making disaggregation more unsafe. Perhaps the
global limits could serve as a cap here?

> --- a/xen/include/asm-arm/grant_table.h
> +++ b/xen/include/asm-arm/grant_table.h
> @@ -26,8 +26,8 @@ static inline int replace_grant_supported(void)
>  return 1;
>  }
>  
> -#define gnttab_init_arch(gt) \
> -( ((gt)->arch.gfn = xzalloc_array(gfn_t, max_grant_frames)) =

Re: [Xen-devel] [PATCH 1/2] public/domctl: drop unnecessary typedefs and handles

2017-09-20 Thread George Dunlap
On Tue, Sep 12, 2017 at 4:08 PM, Jan Beulich  wrote:
> By virtue of the struct xen_domctl container structure, most of them
> are really just cluttering the name space.
>
> While doing so,
> - convert an enum typed (pt_irq_type_t) structure field to a fixed
>   width type,
> - make x86's paging_domctl() and descendants take a properly typed
>   handle,
> - add const in a few places.
>
> Signed-off-by: Jan Beulich 

Acked-by: George Dunlap 

>
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -903,7 +903,7 @@ int xc_vcpu_get_extstate(xc_interface *x
>   uint32_t vcpu,
>   xc_vcpu_extstate_t *extstate);
>
> -typedef xen_domctl_getvcpuinfo_t xc_vcpuinfo_t;
> +typedef struct xen_domctl_getvcpuinfo xc_vcpuinfo_t;
>  int xc_vcpu_getinfo(xc_interface *xch,
>  uint32_t domid,
>  uint32_t vcpu,
> @@ -916,7 +916,7 @@ long long xc_domain_get_cpu_usage(xc_int
>  int xc_domain_sethandle(xc_interface *xch, uint32_t domid,
>  xen_domain_handle_t handle);
>
> -typedef xen_domctl_shadow_op_stats_t xc_shadow_op_stats_t;
> +typedef struct xen_domctl_shadow_op_stats xc_shadow_op_stats_t;
>  int xc_shadow_control(xc_interface *xch,
>uint32_t domid,
>unsigned int sop,
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -1714,8 +1714,7 @@ int xc_domain_update_msi_irq(
>  uint64_t gtable)
>  {
>  int rc;
> -xen_domctl_bind_pt_irq_t *bind;
> -
> +struct xen_domctl_bind_pt_irq *bind;
>  DECLARE_DOMCTL;
>
>  domctl.cmd = XEN_DOMCTL_bind_pt_irq;
> @@ -1740,8 +1739,7 @@ int xc_domain_unbind_msi_irq(
>  uint32_t gflags)
>  {
>  int rc;
> -xen_domctl_bind_pt_irq_t *bind;
> -
> +struct xen_domctl_bind_pt_irq *bind;
>  DECLARE_DOMCTL;
>
>  domctl.cmd = XEN_DOMCTL_unbind_pt_irq;
> @@ -1770,7 +1768,7 @@ static int xc_domain_bind_pt_irq_int(
>  uint16_t spi)
>  {
>  int rc;
> -xen_domctl_bind_pt_irq_t * bind;
> +struct xen_domctl_bind_pt_irq *bind;
>  DECLARE_DOMCTL;
>
>  domctl.cmd = XEN_DOMCTL_bind_pt_irq;
> @@ -1828,7 +1826,7 @@ static int xc_domain_unbind_pt_irq_int(
>  uint8_t spi)
>  {
>  int rc;
> -xen_domctl_bind_pt_irq_t * bind;
> +struct xen_domctl_bind_pt_irq *bind;
>  DECLARE_DOMCTL;
>
>  domctl.cmd = XEN_DOMCTL_unbind_pt_irq;
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -41,7 +41,7 @@ long arch_do_domctl(struct xen_domctl *d
>  case XEN_DOMCTL_bind_pt_irq:
>  {
>  int rc;
> -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq;
> +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq;
>  uint32_t irq = bind->u.spi.spi;
>  uint32_t virq = bind->machine_irq;
>
> @@ -87,7 +87,7 @@ long arch_do_domctl(struct xen_domctl *d
>  case XEN_DOMCTL_unbind_pt_irq:
>  {
>  int rc;
> -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq;
> +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq;
>  uint32_t irq = bind->u.spi.spi;
>  uint32_t virq = bind->machine_irq;
>
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -48,7 +48,7 @@ static int gdbsx_guest_mem_io(domid_t do
>  }
>
>  static int update_domain_cpuid_info(struct domain *d,
> -const xen_domctl_cpuid_t *ctl)
> +const struct xen_domctl_cpuid *ctl)
>  {
>  struct cpuid_policy *p = d->arch.cpuid;
>  const struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx 
> };
> @@ -363,8 +363,7 @@ long arch_do_domctl(
>  {
>
>  case XEN_DOMCTL_shadow_op:
> -ret = paging_domctl(d, &domctl->u.shadow_op,
> -guest_handle_cast(u_domctl, void), 0);
> +ret = paging_domctl(d, &domctl->u.shadow_op, u_domctl, 0);
>  if ( ret == -ERESTART )
>  return hypercall_create_continuation(__HYPERVISOR_arch_1,
>   "h", u_domctl);
> @@ -707,7 +706,7 @@ long arch_do_domctl(
>
>  case XEN_DOMCTL_bind_pt_irq:
>  {
> -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq;
> +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq;
>  int irq;
>
>  ret = -EINVAL;
> @@ -738,7 +737,7 @@ long arch_do_domctl(
>
>  case XEN_DOMCTL_unbind_pt_irq:
>  {
> -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq;
> +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq;
>  int irq = domain_pirq_to_irq(d, bind->machine_irq);
>
>  ret = -EPERM;
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -162,7 +162,7 @@ static int vioapic_hwdom_map_gsi(unsigne
>   unsigned int pol)
>  {
>  struct domain *currd = current->domain;
> -xen

Re: [Xen-devel] [PATCH 2/2] public/sysctl: drop unnecessary typedefs and handles

2017-09-20 Thread George Dunlap
On Tue, Sep 12, 2017 at 4:10 PM, Jan Beulich  wrote:
> By virtue of the struct xen_sysctl container structure, most of them
> are really just cluttering the name space.
>
> Signed-off-by: Jan Beulich 

Acked-by: George Dunlap 

>
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1212,11 +1212,11 @@ int xc_readconsolering(xc_interface *xch
>  int xc_send_debug_keys(xc_interface *xch, char *keys);
>  int xc_set_parameters(xc_interface *xch, char *params);
>
> -typedef xen_sysctl_physinfo_t xc_physinfo_t;
> -typedef xen_sysctl_cputopo_t xc_cputopo_t;
> -typedef xen_sysctl_numainfo_t xc_numainfo_t;
> -typedef xen_sysctl_meminfo_t xc_meminfo_t;
> -typedef xen_sysctl_pcitopoinfo_t xc_pcitopoinfo_t;
> +typedef struct xen_sysctl_physinfo xc_physinfo_t;
> +typedef struct xen_sysctl_cputopo xc_cputopo_t;
> +typedef struct xen_sysctl_numainfo xc_numainfo_t;
> +typedef struct xen_sysctl_meminfo xc_meminfo_t;
> +typedef struct xen_sysctl_pcitopoinfo xc_pcitopoinfo_t;
>
>  typedef uint32_t xc_cpu_to_node_t;
>  typedef uint32_t xc_cpu_to_socket_t;
> @@ -1240,7 +1240,7 @@ int xc_machphys_mfn_list(xc_interface *x
>   unsigned long max_extents,
>   xen_pfn_t *extent_start);
>
> -typedef xen_sysctl_cpuinfo_t xc_cpuinfo_t;
> +typedef struct xen_sysctl_cpuinfo xc_cpuinfo_t;
>  int xc_getcpuinfo(xc_interface *xch, int max_cpus,
>xc_cpuinfo_t *info, int *nr_cpus);
>
> @@ -1853,8 +1853,8 @@ int xc_cpu_offline(xc_interface *xch, in
>   * cpufreq para name of this structure named
>   * same as sysfs file name of native linux
>   */
> -typedef xen_userspace_t xc_userspace_t;
> -typedef xen_ondemand_t xc_ondemand_t;
> +typedef struct xen_userspace xc_userspace_t;
> +typedef struct xen_ondemand xc_ondemand_t;
>
>  struct xc_get_cpufreq_para {
>  /* IN/OUT variable */
> --- a/tools/libxc/xc_misc.c
> +++ b/tools/libxc/xc_misc.c
> @@ -547,7 +547,7 @@ int xc_livepatch_upload(xc_interface *xc
>  DECLARE_SYSCTL;
>  DECLARE_HYPERCALL_BUFFER(char, local);
>  DECLARE_HYPERCALL_BOUNCE(name, 0 /* later */, 
> XC_HYPERCALL_BUFFER_BOUNCE_IN);
> -xen_livepatch_name_t def_name = { .pad = { 0, 0, 0 } };
> +struct xen_livepatch_name def_name = { };
>
>  if ( !name || !payload )
>  {
> @@ -594,12 +594,12 @@ int xc_livepatch_upload(xc_interface *xc
>
>  int xc_livepatch_get(xc_interface *xch,
>   char *name,
> - xen_livepatch_status_t *status)
> + struct xen_livepatch_status *status)
>  {
>  int rc;
>  DECLARE_SYSCTL;
>  DECLARE_HYPERCALL_BOUNCE(name, 0 /*adjust later */, 
> XC_HYPERCALL_BUFFER_BOUNCE_IN);
> -xen_livepatch_name_t def_name = { .pad = { 0, 0, 0 } };
> +struct xen_livepatch_name def_name = { };
>
>  if ( !name )
>  {
> @@ -677,7 +677,7 @@ int xc_livepatch_get(xc_interface *xch,
>   * retrieved (if any).
>   */
>  int xc_livepatch_list(xc_interface *xch, unsigned int max, unsigned int 
> start,
> -  xen_livepatch_status_t *info,
> +  struct xen_livepatch_status *info,
>char *name, uint32_t *len,
>unsigned int *done,
>unsigned int *left)
> @@ -837,7 +837,7 @@ static int _xc_livepatch_action(xc_inter
>  DECLARE_SYSCTL;
>  /* The size is figured out when we strlen(name) */
>  DECLARE_HYPERCALL_BOUNCE(name, 0, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> -xen_livepatch_name_t def_name = { .pad = { 0, 0, 0 } };
> +struct xen_livepatch_name def_name = { };
>
>  def_name.size = strlen(name) + 1;
>
> --- a/xen/arch/arm/sysctl.c
> +++ b/xen/arch/arm/sysctl.c
> @@ -12,7 +12,7 @@
>  #include 
>  #include 
>
> -void arch_do_physinfo(xen_sysctl_physinfo_t *pi) { }
> +void arch_do_physinfo(struct xen_sysctl_physinfo *pi) { }
>
>  long arch_do_sysctl(struct xen_sysctl *sysctl,
>  XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -72,7 +72,7 @@ long cpu_down_helper(void *data)
>  return ret;
>  }
>
> -void arch_do_physinfo(xen_sysctl_physinfo_t *pi)
> +void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
>  {
>  memcpy(pi->hw_cap, boot_cpu_data.x86_capability,
> min(sizeof(pi->hw_cap), sizeof(boot_cpu_data.x86_capability)));
> --- a/xen/common/gcov/gcov.c
> +++ b/xen/common/gcov/gcov.c
> @@ -209,7 +209,7 @@ static int gcov_dump_all(XEN_GUEST_HANDL
>  return ret;
>  }
>
> -int sysctl_gcov_op(xen_sysctl_gcov_op_t *op)
> +int sysctl_gcov_op(struct xen_sysctl_gcov_op *op)
>  {
>  int ret;
>
> --- a/xen/common/livepatch.c
> +++ b/xen/common/livepatch.c
> @@ -104,7 +104,7 @@ static struct livepatch_work livepatch_w
>   */
>  static DEFINE_PER_CPU(bool_t, work_to_do);
>
> -static int get_name(const xen_livepatch_name_t *name, char *n)
> +static int get_name(const struct xen_livepatch_name *name, char *n)
>  {
> 

[Xen-devel] [xen-unstable-coverity test] 113625: all pass - PUSHED

2017-09-20 Thread osstest service owner
flight 113625 xen-unstable-coverity real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113625/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xen  64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0
baseline version:
 xen  abd91b2a2bcd05618a71f7e5fe571dd10a5727bc

Last test of basis   113540  2017-09-17 09:39:30 Z3 days
Testing same since   113625  2017-09-20 09:22:12 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Boris Ostrovsky 
  George Dunlap 
  Ian Jackson 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Kevin Tian 
  Konrad Rzeszutek Wilk 
  Stefano Stabellini 
  Wei Liu 

jobs:
 coverity-amd64   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-coverity
+ revision=64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
 export PERLLIB=.:.
 PERLLIB=.:.
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push 
xen-unstable-coverity 64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0
+ branch=xen-unstable-coverity
+ revision=64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
 export PERLLIB=.:.:.
 PERLLIB=.:.:.
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
+++ export PERLLIB=.:.:.:.
+++ PERLLIB=.:.:.:.
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-coverity
+ qemuubranch=qemu-upstream-unstable-coverity
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-coverity
+ prevxenbranch=xen-4.9-testing
+ '[' x64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/xen/git/linux-pvops.git
++ : git://xenbits.xen.org/linux-pvops.git
++ : tested/linux-4.9
++ : tested/linux-arm-xen
++ '[' xgit://xenbits.xen.org/linux-pvops.git = x ']'
++ '[' x = x ']'
++ : git://xenb

Re: [Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy

2017-09-20 Thread Roger Pau Monné
On Tue, Sep 19, 2017 at 07:06:26PM +0100, Jennifer Herbert wrote:
> This Patch allows a migration precopy policy to be specified.
> 
> The precopy phase of the xc_domain_save() live migration algorithm has
> historically been implemented to run until either a) (almost) no pages
> are dirty or b) some fixed, hard-coded maximum number of precopy
> iterations has been exceeded.  This policy and its implementation are
> less than ideal for a few reasons:
> - the logic of the policy is intertwined with the control flow of the
>   mechanism of the precopy stage
> - it can't take into account facts external to the immediate
>   migration context, such external state transfer state, interactive
>   user input, or the passage of wall-clock time.
> - it does not permit the user to change their mind, over time, about
>   what to do at the end of the precopy (they get an unconditional
>   transition into the stop-and-copy phase of the migration)
> 
> To permit callers to implement arbitrary higher-level policies governing
> when the live migration precopy phase should end, and what should be
> done next:
> - add a precopy_policy() callback to the xc_domain_save() user-supplied
>   callbacks
> - during the precopy phase of live migrations, consult this policy after
>   each batch of pages transmitted and take the dictated action, which
>   may be to a) abort the migration entirely, b) continue with the
>   precopy, or c) proceed to the stop-and-copy phase.
> - provide an implementation of the old policy, used when
>   precopy_policy callback  is not provided.
> 
> Signed-off-by: Jennifer Herbert 
> Signed-off-by: Joshua Otto 
> 
> ---
> 
> v2:
> 
> Have made a few formatting corrections, added typedef as suggested.
> 
> v1:
> 
> This is updated/modified subset of patch 7/20, part of
> Joshua Otto's "Add postcopy live migration support." patch,
> dated 27th March 2017.  As indicated on the original thread,
> I wish to make use of this this within the XenServer product.
> I hope this will aid Josh in pushing the remainder of his series.
> 
> ---
>  tools/libxc/include/xenguest.h | 31 --
>  tools/libxc/xc_sr_common.h |  6 +--
>  tools/libxc/xc_sr_save.c   | 97 
> +-
>  3 files changed, 97 insertions(+), 37 deletions(-)
> 
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 6626f0c..a2a654c 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -39,6 +39,16 @@
>   */
>  struct xenevtchn_handle;
>  
> +/* For save's precopy_policy(). */
> +struct precopy_stats
> +{
> +unsigned int iteration;
> +unsigned int total_written;
> +long dirty_count; /* -1 if unknown */
> +};
> +
> +typedef int (*precopy_policy_t)(struct precopy_stats, void *);

Shouldn't precopy_stats be a pointer (const pointer probably seeing
it's usage)?

> +
>  /* callbacks provided by xc_domain_save */
>  struct save_callbacks {
>  /* Called after expiration of checkpoint interval,
> @@ -46,7 +56,20 @@ struct save_callbacks {
>   */
>  int (*suspend)(void* data);
>  
> -/* Called after the guest's dirty pages have been
> +/*
> + * Called after every batch of page data sent during the precopy
> + * phase of a live migration to ask the caller what to do next
> + * based on the current state of the precopy migration.

I would add:

"Should return one of the values listed below:"

> + */
> +#define XGS_POLICY_ABORT  (-1) /* Abandon the migration entirely
> +* and tidy up. */
> +#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy phase. */
> +#define XGS_POLICY_STOP_AND_COPY1  /* Immediately suspend and transmit 
> the
> +* remaining dirty pages. */
> +precopy_policy_t precopy_policy;
> +
> +/*
> + * Called after the guest's dirty pages have been
>   *  copied into an output buffer.
>   * Callback function resumes the guest & the device model,
>   *  returns to xc_domain_save.
> @@ -55,7 +78,8 @@ struct save_callbacks {
>   */
>  int (*postcopy)(void* data);
>  
> -/* Called after the memory checkpoint has been flushed
> +/*
> + * Called after the memory checkpoint has been flushed
>   * out into the network. Typical actions performed in this
>   * callback include:
>   *   (a) send the saved device model state (for HVM guests),
> @@ -65,7 +89,8 @@ struct save_callbacks {
>   *
>   * returns:
>   * 0: terminate checkpointing gracefully
> - * 1: take another checkpoint */
> + * 1: take another checkpoint 
^ trailing space
> + */
>  int (*checkpoint)(void* data);
>  
>  /*
> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> index a83f22a..3635704 100644
> --- a/tools/libxc/xc_sr_common.h
> +++ b/tools/libxc/xc_sr_common.h
> @@ -198,12 +198,10 @@ s

Re: [Xen-devel] [PATCH 0/2] public/*ctl: drop unnecessary typedefs and handles

2017-09-20 Thread Julien Grall

Hi,

On 12/09/17 15:25, Jan Beulich wrote:

1: public/domctl: drop unnecessary typedefs and handles
2: public/sysctl: drop unnecessary typedefs and handles

Signed-off-by: Jan Beulich 


For the ARM changes:

Acked-by: Julien Grall 

Cheers,





--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain

2017-09-20 Thread Juergen Gross
On 20/09/17 12:34, Jan Beulich wrote:
 On 20.09.17 at 08:34,  wrote:
>> --- a/xen/common/compat/grant_table.c
>> +++ b/xen/common/compat/grant_table.c
>> @@ -157,21 +157,14 @@ int compat_grant_table_op(unsigned int cmd,
>>  unsigned int max_frame_list_size_in_page =
>>  (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.setup)) /
>>  sizeof(*nat.setup->frame_list.p);
>> -if ( max_frame_list_size_in_page < max_grant_frames )
> 
> The latest here, but perhaps even earlier I think max_grant_frames
> should become static, so one can be reasonably certain that all other
> references are gone.

Patch 14 removes the last references, so I can't do it earlier.

>> @@ -290,8 +293,8 @@ num_act_frames_from_sha_frames(const unsigned int num)
>>  return DIV_ROUND_UP(num * sha_per_page, ACGNT_PER_PAGE);
>>  }
>>  
>> -#define max_nr_active_grant_frames \
>> -num_act_frames_from_sha_frames(max_grant_frames)
>> +#define max_nr_active_grant_frames(gt) \
>> +num_act_frames_from_sha_frames(gt->max_grant_frames)
> 
> Parentheses around gt please.

Okay.

>> @@ -1718,8 +1724,9 @@ gnttab_grow_table(struct domain *d, unsigned int 
>> req_nr_frames)
>>  ASSERT(gt->active);
>>  
>>  if ( req_nr_frames < INITIAL_NR_GRANT_FRAMES )
>> -req_nr_frames = INITIAL_NR_GRANT_FRAMES;
>> -ASSERT(req_nr_frames <= max_grant_frames);
>> +req_nr_frames = min_t(unsigned int, INITIAL_NR_GRANT_FRAMES,
>> +gt->max_grant_frames);
> 
> Perhaps give the INITIAL_NR_GRANT_FRAMES value a U suffix
> instead of using min_t() here?

Okay.

>> @@ -1777,13 +1784,15 @@ active_alloc_failed:
>>  
>>  static long
>>  gnttab_setup_table(
>> -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count)
>> +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count,
>> +unsigned int limit_max)
>>  {
>>  struct vcpu *curr = current;
>>  struct gnttab_setup_table op;
>>  struct domain *d = NULL;
>>  struct grant_table *gt;
>>  unsigned int i;
>> +long ret = 0;
> 
> Wouldn't int suffice here?

I just followed the return type of the function. I think this way is
cleaner, but in case you like int better I can change it.

>> @@ -1819,6 +1819,21 @@ gnttab_setup_table(
>>  gt = d->grant_table;
>>  grant_write_lock(gt);
>>  
>> +if ( unlikely(op.nr_frames > gt->max_grant_frames) )
>> +{
>> +gdprintk(XENLOG_INFO, "Domain is limited to %d grant-table 
>> frames.\n",
> 
> %u and you also want to log the subject domain ID (which may not
> be current's; same for the other log message below as well as the
> similar one in gnttab_get_status_frames()).

Okay.

>> +gt->max_grant_frames);
>> +op.status = GNTST_general_error;
>> +goto unlock;
>> +}
>> +if ( unlikely(limit_max < gt->max_grant_frames) )
> 
> With the check moved here it can be relaxed afaict: Code below
> only writes op.nr_frames entries (same then again for
> gnttab_get_status_frames()).

Okay.

>> @@ -1852,10 +1867,10 @@ gnttab_setup_table(
>>  if ( d )
>>  rcu_unlock_domain(d);
>>  
>> -if ( unlikely(__copy_field_to_guest(uop, &op, status)) )
>> +if ( !ret && unlikely(__copy_field_to_guest(uop, &op, status)) )
> 
> I wonder whether it wouldn't be better to switch that check
> producing -EINVAL to also report the failure in op.status, now
> that it lives here (same then for gnttab_get_status_frames()
> once again).

Okay.

>>  static long
>>  gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) 
>> uop,
>> - int count)
>> + int count, unsigned int limit_max)
> 
> Take the opportunity and switch count to unsigned int?

Okay.

>> @@ -3320,7 +3347,7 @@ do_grant_table_op(
>>  
>>  case GNTTABOP_setup_table:
>>  rc = gnttab_setup_table(
>> -guest_handle_cast(uop, gnttab_setup_table_t), count);
>> +guest_handle_cast(uop, gnttab_setup_table_t), count, ~0);
> 
> UINT_MAX?

Yes.

>> @@ -3442,6 +3469,8 @@ grant_table_create(
>>  /* Simple stuff. */
>>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>>  spin_lock_init(&t->maptrack_lock);
>> +t->max_grant_frames = max_grant_frames;
>> +t->max_maptrack_frames = max_maptrack_frames;
> 
> This together with ...
> 
>> @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, unsigned 
>> int grant_frames,
>>  
>>  /* Set limits. */
>>  if ( !gt->active )
>> +{
>> +gt->max_grant_frames = grant_frames;
>> +gt->max_maptrack_frames = maptrack_frames;
>>  ret = grant_table_init(gt);
>> +}
> 
> .. this raises the question of whether it is legal to decrease the
> limits. There may be code depending on it only ever growing.

Before grant_table_init() has been called there is no problem
decreasing the limits, as nothing depending on them 

Re: [Xen-devel] minimum Python version

2017-09-20 Thread Ian Jackson
Jan Beulich writes ("minimum Python version"):
> for quite a while (apparently as of Marek's series in February) I've
> been seeing
> 
> xen/lowlevel/xc/xc.c: In function ‘pyxc_dom_extract_cpuid’:
> xen/lowlevel/xc/xc.c:692:13: error: implicit declaration of function 
> ‘PyBytes_AS_STRING’ [-Werror=implicit-function-declaration]

"PyBytes_AS_STRING" is a compatibility alias to help porting to Python
3.  They are provided in Python 2.6.0 and later.

https://docs.python.org/release/2.6/c-api/string.html

Python 2.6.0 was released on 1st October 2008.  In a week and a half's
time that will be 10 years ago.

Under the circumstances we should clearly keep the Python-3-compatible
naming.

> on an older system having Python 2.4 on it. The minimum version
> configure checks for appears to be 2.3, matching up with what
> ./README says. Since the Python bindings aren't something one
> absolutely needs (afaict), wouldn't it be reasonable to check for
> a suitable version and if that fails simply disable their building
> (which currently is unconditional)? What I'd like to avoid is ending
> up with being unable to build the tools on that system - I know
> it's rather old, but there's a reason I'd like to keep it (including its
> old distro level).

I would welcome a patch to change the minimum python version to 2.6.
I would also welcome a patch to make a too-old python version disable
the python build rather than causing the whole build to break.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] minimum Python version

2017-09-20 Thread Ian Jackson
Jan Beulich writes ("Re: [Xen-devel] minimum Python version"):
> On 20.09.17 at 10:49,  wrote:
> > What about the following idea: maybe we could add an option to configure
> > to make it fail instead of disabling components in case some
> > requirements for that component are not met? The default could be to
> > just disable the components and with the option specified all components
> > not disabled explicitly must be buildable or configure will fail.
> 
> If that's something that (a) can be reasonably expressed in
> configure.ac and (b) the tools maintainers can live with, that's
> certainly an option.

I'd could live with that.

Personally I think the whole purpose of configure is to allow
automatic building of what's possible, leaving out anything else.

So I think the default for this new option should be to suppress
parts of the build whose dependencies are not satisfied, rather than
to bomb out.

(I wrote "nonessential parts" but I thinkthere are no essential
parts.  The hypervisor itself is already "nonessential" by this
defininion - i386 builds skip it.  And the tools are nonessential
since you might be building on amd64 just for the hypervisor...)

And, yes, configure can express that.  It is an m4-generated shell
script.

My reservation is that this proposed new option would make every
dependency for every feature more complicated.  So to avoid that it
would be necessary to invent new macros to generate the formulaic
enable/disable code.  I doubt anyone has the effort for that.

Jan, I suggest you submit at the very least a patch to change the
minimum to 2.6.  Personally I would accept one to have it disable
python instead of bombing out, if you would like to write it.  That is
the way we deal with many other nonessential components already.

I think that if we want something more sophisticated, involving
optionally bombing out on dependency trouble, it's up to people who
want that to do the work.

(As an aside, I don't think this is a distro/upstream division.  With
my Debian maintainer had on I actually prefer the automatic fallback.
Our build-dependency system avoids accidental creation of broken
packages in normal situations, and having graceful degradation makes
it easier to do unusual things eg in derivatives, when cross-building,
or whatever.)

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/3] RFC: migration: defer precopy policy to libxl

2017-09-20 Thread Roger Pau Monné
On Tue, Sep 19, 2017 at 07:06:27PM +0100, Jennifer Herbert wrote:
> Provide an implementation of the old policy as a callback in
> libxl and plumb it through the IPC machinery to libxc.
> 
> This serves as an example for defining a libxl policy,
> and provides no advantage over the default policy in libxc.
> 
> Signed-off-by: Joshua Otto 

LGTM:

Reviewed-by: Roger Pau Monné 

Have we ever thought of providing libxl consumers the ability to set
these hooks? Would that be interesting?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] xen: grant-table: Simplify get_paged_frame

2017-09-20 Thread Julien Grall

Hi Roger,

On 19/09/17 12:40, Roger Pau Monné wrote:

On Tue, Sep 19, 2017 at 10:44:53AM +0100, Julien Grall wrote:

Hi,

On 19/09/17 08:13, Roger Pau Monné wrote:

On Mon, Sep 18, 2017 at 06:32:22PM +0100, Julien Grall wrote:

Hi Roger,

On 18/09/17 17:58, Roger Pau Monné wrote:

On Mon, Sep 18, 2017 at 05:27:52PM +0100, Julien Grall wrote:

The implementation of get_paged_frame is currently different whether the
architecture support sharing memory or paging memory. Both
version are extremely similar so it is possible to consolidate in a
single implementation.

The main difference is the x86 version will allow grant on foreign page
when using HVM/PVH whilst Arm does not. At the moment, on x86 foreign pages
are only allowed for PVH Dom0. It seems that foreign pages should never
be granted so deny them

The check for shared/paged memory are now gated with the respective ifdef.
Potentially, dummy p2m_is_shared/p2m_is_paging could be implemented for
Arm.

Signed-off-by: Julien Grall 

---

Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 

   Changes in v2:
   - Deny grant on foreign page (aligned with the ARM code)
   - Use #ifdef rather than #if defined
   - Update commit message
   - Fix typo in the title

get_page_from_gfn will be able to get reference on foreign page and as
per my understanding will allow to grant page on foreign memory.

This was not allowed with a simple get_page(...) on the ARM
implementation (no sharing nor paging supprot) but is allowed on the x86
implementation due to get_page_from_gfn.

On x86, foreign pages are currently only allowed for PVH dom0, so I
think it is not a big deal for now.

On Arm, foreign pages can be present on any domain. So this patch would
permit grant on foreing pages.

This patch will deny granting foreign pages. Jan Beulich is happy with
it. Any other opinions?


Won't this break QEMU running in stub domains?

I haven't tested it, but I'm afraid QEMU running in a stub domain
might try to grant a foreign frame. Ie: the emulated network code in
QEMU might try to grant a foreign frame in order to forward operations
from emulated devices to PV frontends.


I don't think it will break any existing setup because foreign mapping are
only allowed for the hardware domain (see p2m_add_foreign).


IIRC this only applies to auto-translated (HVM) domains, not to PV
domains. QEMU stubdomains are PV guests.


Still, p2m_map_foreign can only be used for auto-translated domains.


Oh, so you are limiting this only for auto-translated guests? Should
have looked more closely at the commit message.


That's right. I am happy to update the commit message if it is not clear 
enough.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/2] public/*ctl: drop unnecessary typedefs and handles

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 13:07,  wrote:
> On 12/09/17 15:25, Jan Beulich wrote:
>> 1: public/domctl: drop unnecessary typedefs and handles
>> 2: public/sysctl: drop unnecessary typedefs and handles
>> 
>> Signed-off-by: Jan Beulich 
> 
> For the ARM changes:
> 
> Acked-by: Julien Grall 

Thanks, but I have this on record already.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation

2017-09-20 Thread Dario Faggioli
On Wed, 2017-09-20 at 11:04 +0100, George Dunlap wrote:
> On 09/20/2017 10:19 AM, George Dunlap wrote:
> > > diff --git a/xen/common/sched_credit2.c
> > > b/xen/common/sched_credit2.c
> > > index 32234ac..7a550db 100644
> > > --- a/xen/common/sched_credit2.c
> > > +++ b/xen/common/sched_credit2.c
> > > @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct
> > > scheduler *ops, void *data)
> > >  
> > >  write_lock_irqsave(&prv->lock, flags);
> > >  
> > > -kill_timer(sdom->repl_timer);
> > > -xfree(sdom->repl_timer);
> > > -
> > >  list_del_init(&sdom->sdom_elem);
> > >  
> > >  write_unlock_irqrestore(&prv->lock, flags);
> > >  
> > > +kill_timer(sdom->repl_timer);
> > > +xfree(sdom->repl_timer);
> > 
> > Any particular reason for moving the kill_timer() as well as the
> > xfree
> > outside the lock?  What happens if the timer goes off after the
> > irqrestore but before the kill_timer?
> 
> Looks like if the timer fires, nothing terribly bad will happen; it
> will
> just do a useless budget replenishment.
> 
It's just that it has not reason to be there, as nothing that it does
is serialized by prv->lock, so it only makes the critical section (with
IRQ disabled) longer, for no reason, which is bad, as this being a
write_lock(), it'll stop readers too.

I think it was (my) mistake to put it there in the first place.

> It looks like kill_timer() disables irqs, so it could be moved inside
> the critical section.  I'm inclined to say we should do so.  I don't
> anticipate the budget replenishment ever to need to walk the domain
> list, but should that change, this would be a bear of a bug to find.
> 
Indeed it's rather unlikely for the replenishment handler to have to
use sdom_elem to go through the list of domains.

IAC, if you're concerned about that, I'd much rather put both
kill_timer() and xfree() before the critical section, rather than
after, like in the attached patch.

Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)--- Begin Message ---
[PATCH v2] xen: credit2: fix spinlock irq-safety violation

In commit ad4b3e1e9df34 ("xen: credit2: implement
utilization cap") xfree() was being called (for
deallocating the budget replenishment timer, during
domain destruction) inside an IRQ disabled critical
section.

That must not happen, as it uses the mem-pool's lock,
which needs to be taken with IRQ enabled. And, in fact,
we crash (in debug builds):

(XEN) 
(XEN) Panic on CPU 0:
(XEN) Xen BUG at spinlock.c:47
(XEN) 

Let's, therefore, kill and deallocate the timer outside of
the critical sections, when IRQs are enabled.

Signed-off-by: Dario Faggioli 
---
Cc: osstest service owner 
Cc: George Dunlap 
Cc: Wei Liu 
---
This was spotted by OSSTest's flight 113562:

 http://logs.test-lab.xenproject.org/osstest/logs/113562/
 
http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log
---
Changes from v1:
- kill_timer() and xfree() moved above critical section, instead than below.

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 32234ac..32b0363 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2921,11 +2921,11 @@ csched2_free_domdata(const struct scheduler *ops, void 
*data)
 struct csched2_dom *sdom = data;
 struct csched2_private *prv = csched2_priv(ops);
 
-write_lock_irqsave(&prv->lock, flags);
-
 kill_timer(sdom->repl_timer);
 xfree(sdom->repl_timer);
 
+write_lock_irqsave(&prv->lock, flags);
+
 list_del_init(&sdom->sdom_elem);
 
 write_unlock_irqrestore(&prv->lock, flags);
--- End Message ---


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 13:10,  wrote:
> On 20/09/17 12:34, Jan Beulich wrote:
> On 20.09.17 at 08:34,  wrote:
>>> --- a/xen/common/compat/grant_table.c
>>> +++ b/xen/common/compat/grant_table.c
>>> @@ -157,21 +157,14 @@ int compat_grant_table_op(unsigned int cmd,
>>>  unsigned int max_frame_list_size_in_page =
>>>  (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.setup)) /
>>>  sizeof(*nat.setup->frame_list.p);
>>> -if ( max_frame_list_size_in_page < max_grant_frames )
>> 
>> The latest here, but perhaps even earlier I think max_grant_frames
>> should become static, so one can be reasonably certain that all other
>> references are gone.
> 
> Patch 14 removes the last references, so I can't do it earlier.

That's unfortunate, but okay.

>>> @@ -1777,13 +1784,15 @@ active_alloc_failed:
>>>  
>>>  static long
>>>  gnttab_setup_table(
>>> -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count)
>>> +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count,
>>> +unsigned int limit_max)
>>>  {
>>>  struct vcpu *curr = current;
>>>  struct gnttab_setup_table op;
>>>  struct domain *d = NULL;
>>>  struct grant_table *gt;
>>>  unsigned int i;
>>> +long ret = 0;
>> 
>> Wouldn't int suffice here?
> 
> I just followed the return type of the function. I think this way is
> cleaner, but in case you like int better I can change it.

I sort of expected this reply, but that's not in line with what you
did in gnttab_get_status_frames() then. I think we will want to
eventually change all function return types to int where the wider
type isn't needed.

>>> @@ -3442,6 +3469,8 @@ grant_table_create(
>>>  /* Simple stuff. */
>>>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>>>  spin_lock_init(&t->maptrack_lock);
>>> +t->max_grant_frames = max_grant_frames;
>>> +t->max_maptrack_frames = max_maptrack_frames;
>> 
>> This together with ...
>> 
>>> @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, 
>>> unsigned int grant_frames,
>>>  
>>>  /* Set limits. */
>>>  if ( !gt->active )
>>> +{
>>> +gt->max_grant_frames = grant_frames;
>>> +gt->max_maptrack_frames = maptrack_frames;
>>>  ret = grant_table_init(gt);
>>> +}
>> 
>> .. this raises the question of whether it is legal to decrease the
>> limits. There may be code depending on it only ever growing.
> 
> Before grant_table_init() has been called there is no problem
> decreasing the limits, as nothing depending on them has been setup
> yet.

Oh, right, I didn't pay attention to this being a one-time action.

>> Additionally to take the input values without applying some
>> upper cap - while we have XSA-77, we still shouldn't introduce
>> new issues making disaggregation more unsafe. Perhaps the
>> global limits could serve as a cap here?
> 
> I thought about a cap and TBH I'm not sure which would be sane to
> apply. The global limits seem wrong, especially looking at patch 14:
> those limits will be for dom0 only then. And dom0 won't need many
> grant frames in the normal case...

I've been thinking about this Dom0 aspect too over lunch. What
about allowing the hardware domain to set its limit (only upwards
of course) in setup_table(), without any upper bound enforced?
This would free up the globals to be used as system wide limits
again.

> So I could make up a cap in form of either a configurable constant
> (CONFIG_* or boot parameter?) or as a fraction of domain memory. Any
> preferences here?

A config constant as well as a fraction of domain memory might lock
out special purpose guests. Which would leave command line options
- as per above perhaps the ones we already have.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [libvirt test] 113616: tolerable all pass - PUSHED

2017-09-20 Thread osstest service owner
flight 113616 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113616/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 113592
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 113592
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 113592
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  dbd98380b9180afab0a9149a2ba4a58bf06d0e02
baseline version:
 libvirt  3faf3ca60af342ac74a17d9d6decc70a947d

Last test of basis   113592  2017-09-19 04:20:58 Z1 days
Testing same since   113616  2017-09-20 04:20:57 Z0 days1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Ashish Mittal 
  Erik Skultety 
  John Ferlan 
  Ján Tomko 
  Laine Stump 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Peter Krempa 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-amd64-i386-libvirt-qcow2pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=dbd98380b9180afab0a9149a2ba4a58bf06d0e02
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
 export PERLLIB=.:.
 PERLLIB=.:.
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push libvirt 

Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 13:44,  wrote:
> IAC, if you're concerned about that, I'd much rather put both
> kill_timer() and xfree() before the critical section, rather than
> after, like in the attached patch.

Hmm, killing the timer upfront is certainly fine, but is freeing the
data before removing the element from the list safe not only
currently, but also going forward?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 14/15] xen: make grant table limits boot parameters dom0 only

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 08:34,  wrote:
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2098,7 +2098,7 @@ static void __init find_gnttab_region(struct domain *d,
>  kinfo->gnttab_size = (_etext - _stext) & PAGE_MASK;
>  
>  /* Make sure the grant table will fit in the region */
> -if ( (kinfo->gnttab_size >> PAGE_SHIFT) < max_grant_frames )
> +if ( grant_table_verify_size(d, kinfo->gnttab_size >> PAGE_SHIFT) )
>  panic("Cannot find a space for the grant table region\n");

I can see how this may cause complications with my alternative
proposal for the meaning of the command line options, but that's
solvable for sure.

> @@ -3462,6 +3449,10 @@ grant_table_create(
>  struct domain *d)
>  {
>  struct grant_table *t;
> +static unsigned int max_grant_frames;
> +static unsigned int max_maptrack_frames;
> +integer_param("gnttab_max_frames", max_grant_frames);
> +integer_param("gnttab_max_maptrack_frames", max_maptrack_frames);
>  
>  if ( (t = xzalloc(struct grant_table)) == NULL )
>  return -ENOMEM;
> @@ -3469,14 +3460,17 @@ grant_table_create(
>  /* Simple stuff. */
>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>  spin_lock_init(&t->maptrack_lock);
> -t->max_grant_frames = max_grant_frames;
> -t->max_maptrack_frames = max_maptrack_frames;
>  
>  /* Okay, install the structure. */
>  d->grant_table = t;
>  
>  if ( d->domain_id == 0 )
> +{
> +t->max_grant_frames = max_grant_frames ? : 
> DEFAULT_MAX_NR_GRANT_FRAMES;
> +t->max_maptrack_frames =
> +   max_maptrack_frames ? : 
> DEFAULT_MAX_MAPTRACK_FRAMES;
>  return grant_table_init(t);
> +}

The use here makes it that in effect the two variables could then
become __initdata. Maybe their definition should then even move
here (if the other proposal would be discarded).

> +bool __init grant_table_verify_size(struct domain *d, unsigned int frames)

const

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 15/15] xen: add new Xen cpuid node for max address width info

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 08:34,  wrote:
> On very large hosts a guest needs to know whether it will have to

... a PV guest ...

> handle frame numbers larger than 32 bits in order to select the
> appropriate grant interface version.
> 
> Add a new Xen specific CPUID node to contain the maximum guest address
> width

"guest address width" is ambiguous here, the more when looking at
what you actually return. We should no longer allow ourselves to
mix up the different address spaces. The limit you want to report
here is that in MFN space, which ought to be of no relevance to
HVM guests. Therefore I'm against uniformly exposing this (as much
as almost no other host property should have any relevance for
HVM guests), and would instead like to see a PV-only leaf just like
we already have a HVM only one.

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -929,6 +929,10 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, 
> uint32_t leaf,
>  res->b = v->vcpu_id;
>  break;
>  
> +case 5: /* Host specific parameters */
> +res->a = generic_flsl(get_upper_mfn_bound() - 1) + PAGE_SHIFT;
> +break;

Already when looking at the patch introducing the function I was
wondering whether the function wouldn't better return the highest
frame number instead of the first invalid one. From an abstract
perspective this would allow an arch (e.g. ARM32) to report that
all addresses are valid.

> --- a/xen/include/public/arch-x86/cpuid.h
> +++ b/xen/include/public/arch-x86/cpuid.h
> @@ -85,6 +85,15 @@
>  #define XEN_HVM_CPUID_IOMMU_MAPPINGS   (1u << 2)
>  #define XEN_HVM_CPUID_VCPU_ID_PRESENT  (1u << 3) /* vcpu id is present in 
> EBX 
> */
>  
> -#define XEN_CPUID_MAX_NUM_LEAVES 4
> +/*
> + * Leaf 6 (0x4x05)
> + * Host specific parameters
> + * EAX: bits 0-7: max guest address width
> + */
> +
> +/* Max. address width in bits taking memory hotplug into account. */
> +#define XEN_CPUID_GUEST_ADDRESS_WIDTH_MASK (255u << 0)

Please use hex numbers for multi-digit masks, and please make
explicit (at least in the comment, perhaps also in the constant's
name) which "address" is being talked about here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 113615: regressions - FAIL

2017-09-20 Thread osstest service owner
flight 113615 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113615/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-vhd   6 xen-install  fail REGR. vs. 113605

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-rumprun-i386 17 rumprun-demo-xenstorels/xenstorels.repeat fail 
REGR. vs. 113605
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop   fail REGR. vs. 113605
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stopfail REGR. vs. 113605

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop  fail blocked in 113605
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 113605
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 113605
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 113605
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 113605
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass

version targeted for testing:
 linux820bf5c419e4b85298e5c3001bd1b5be46d60765
baseline version:
 linux12fcf66e74b16b96e57fc1ce32bdf27b3a426fd0

Last test of basis   113605  2017-09-19 15:52:58 Z0 days
Testing same since   113615  2017-09-20 03:19:38 Z0 days1 attempts


People who touched revisions under test:
  "Eric W. Biederman" 
  Arnd Bergmann 
  Christoph Hellwig 
  Eric W. Biederman 
  Hannes Reinecke 
  Hannes Reinecke 
  Haozhong Zhang 
  Jan H. Schönherr 
  Linus Torvalds 
  Lukas Czerner 
  Martin K. Petersen 
  Radim Krčmář 
  Yu Zhang 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt

Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 11:22,  wrote:
> +static void *hvmemul_map_linear_addr(
> +unsigned long linear, unsigned int bytes, uint32_t pfec,
> +struct hvm_emulate_ctxt *hvmemul_ctxt)
> +{
> +struct vcpu *curr = current;
> +void *err, *mapping;
> +
> +/* First and final gfns which need mapping. */
> +unsigned long frame = linear >> PAGE_SHIFT, first = frame;
> +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
> +
> +/*
> + * mfn points to the next free slot.  All used slots have a page 
> reference
> + * held on them.
> + */
> +mfn_t *mfn = &hvmemul_ctxt->mfn[0];
> +
> +/*
> + * The caller has no legitimate reason for trying a zero-byte write, but
> + * final is calculate to fail safe in release builds.
> + *
> + * The maximum write size depends on the number of adjacent mfns[] which
> + * can be vmap()'d, accouting for possible misalignment within the 
> region.
> + * The higher level emulation callers are responsible for ensuring that
> + * mfns[] is large enough for the requested write size.
> + */
> +if ( bytes == 0 ||
> + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) )
> +{
> +ASSERT_UNREACHABLE();
> +goto unhandleable;
> +}
> +
> +do {
> +enum hvm_translation_result res;
> +struct page_info *page;
> +pagefault_info_t pfinfo;
> +p2m_type_t p2mt;
> +
> +/* Error checking.  Confirm that the current slot is clean. */
> +ASSERT(mfn_x(*mfn) == 0);
> +
> +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
> + &pfinfo, &page, NULL, &p2mt);
> +
> +switch ( res )
> +{
> +case HVMTRANS_okay:
> +break;
> +
> +case HVMTRANS_bad_linear_to_gfn:
> +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, 
> &hvmemul_ctxt->ctxt);
> +err = ERR_PTR(~X86EMUL_EXCEPTION);
> +goto out;
> +
> +case HVMTRANS_bad_gfn_to_mfn:
> +err = NULL;
> +goto out;
> +
> +case HVMTRANS_gfn_paged_out:
> +case HVMTRANS_gfn_shared:
> +err = ERR_PTR(~X86EMUL_RETRY);
> +goto out;
> +
> +default:
> +goto unhandleable;
> +}
> +
> +if ( p2m_is_discard_write(p2mt) )
> +{
> +err = ERR_PTR(~X86EMUL_OKAY);
> +goto out;
> +}
> +
> +*mfn++ = _mfn(page_to_mfn(page));
> +
> +} while ( ++frame < final );

Interesting - I had specifically pointed out in a reply to v3 that the
increment of mfn _cannot_ be moved down here: You're now
leaking a page ref on the p2m_is_discard_write() error path afaict.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Ping: [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result

2017-09-20 Thread Jan Beulich
>>> On 20.09.17 at 11:22,  wrote:
> From: Andrew Cooper 
> 
> Signed-off-by: Andrew Cooper 
> Acked-by: Tim Deegan 
> Acked-by: Jan Beulich 
> Reviewed-by: Kevin Tian 
> Acked-by: George Dunlap 
> ---
>  xen/arch/x86/hvm/dom0_build.c |  2 +-
>  xen/arch/x86/hvm/emulate.c| 40 ++--
>  xen/arch/x86/hvm/hvm.c| 56 
> +++
>  xen/arch/x86/hvm/intercept.c  | 20 +++---
>  xen/arch/x86/hvm/svm/nestedsvm.c  |  5 ++--
>  xen/arch/x86/hvm/svm/svm.c|  2 +-

Boris, Suravee? The missing ack is the only thing keeping this and
patch 2 from going in.

Thanks, Jan

>  xen/arch/x86/hvm/viridian.c   |  2 +-
>  xen/arch/x86/hvm/vmsi.c   |  2 +-
>  xen/arch/x86/hvm/vmx/realmode.c   |  2 +-
>  xen/arch/x86/hvm/vmx/vvmx.c   | 14 +-
>  xen/arch/x86/mm/shadow/common.c   | 12 -
>  xen/common/libelf/libelf-loader.c |  4 +--
>  xen/include/asm-x86/hvm/support.h | 40 ++--
>  13 files changed, 101 insertions(+), 100 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
> index 020c355..e8f746c 100644
> --- a/xen/arch/x86/hvm/dom0_build.c
> +++ b/xen/arch/x86/hvm/dom0_build.c
> @@ -238,7 +238,7 @@ static int __init pvh_setup_vmx_realmode_helpers(struct 
> domain *d)
>  if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), &gaddr) )
>  {
>  if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) !=
> - HVMCOPY_okay )
> + HVMTRANS_okay )
>  printk("Unable to zero VM86 TSS area\n");
>  d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] =
>  VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr;
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 54811c1..cc874ce 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -100,7 +100,7 @@ static int ioreq_server_read(const struct hvm_io_handler 
> *io_handler,
>  uint32_t size,
>  uint64_t *data)
>  {
> -if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay )
> +if ( hvm_copy_from_guest_phys(data, addr, size) != HVMTRANS_okay )
>  return X86EMUL_UNHANDLEABLE;
>  
>  return X86EMUL_OKAY;
> @@ -893,18 +893,18 @@ static int __hvmemul_read(
>  
>  switch ( rc )
>  {
> -case HVMCOPY_okay:
> +case HVMTRANS_okay:
>  break;
> -case HVMCOPY_bad_gva_to_gfn:
> +case HVMTRANS_bad_linear_to_gfn:
>  x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>  return X86EMUL_EXCEPTION;
> -case HVMCOPY_bad_gfn_to_mfn:
> +case HVMTRANS_bad_gfn_to_mfn:
>  if ( access_type == hvm_access_insn_fetch )
>  return X86EMUL_UNHANDLEABLE;
>  
>  return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, 
> hvmemul_ctxt, 0);
> -case HVMCOPY_gfn_paged_out:
> -case HVMCOPY_gfn_shared:
> +case HVMTRANS_gfn_paged_out:
> +case HVMTRANS_gfn_shared:
>  return X86EMUL_RETRY;
>  default:
>  return X86EMUL_UNHANDLEABLE;
> @@ -1012,15 +1012,15 @@ static int hvmemul_write(
>  
>  switch ( rc )
>  {
> -case HVMCOPY_okay:
> +case HVMTRANS_okay:
>  break;
> -case HVMCOPY_bad_gva_to_gfn:
> +case HVMTRANS_bad_linear_to_gfn:
>  x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>  return X86EMUL_EXCEPTION;
> -case HVMCOPY_bad_gfn_to_mfn:
> +case HVMTRANS_bad_gfn_to_mfn:
>  return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, 
> hvmemul_ctxt, 0);
> -case HVMCOPY_gfn_paged_out:
> -case HVMCOPY_gfn_shared:
> +case HVMTRANS_gfn_paged_out:
> +case HVMTRANS_gfn_shared:
>  return X86EMUL_RETRY;
>  default:
>  return X86EMUL_UNHANDLEABLE;
> @@ -1384,7 +1384,7 @@ static int hvmemul_rep_movs(
>  return rc;
>  }
>  
> -rc = HVMCOPY_okay;
> +rc = HVMTRANS_okay;
>  }
>  else
>  /*
> @@ -1394,16 +1394,16 @@ static int hvmemul_rep_movs(
>   */
>  rc = hvm_copy_from_guest_phys(buf, sgpa, bytes);
>  
> -if ( rc == HVMCOPY_okay )
> +if ( rc == HVMTRANS_okay )
>  rc = hvm_copy_to_guest_phys(dgpa, buf, bytes, current);
>  
>  xfree(buf);
>  
> -if ( rc == HVMCOPY_gfn_paged_out )
> +if ( rc == HVMTRANS_gfn_paged_out )
>  return X86EMUL_RETRY;
> -if ( rc == HVMCOPY_gfn_shared )
> +if ( rc == HVMTRANS_gfn_shared )
>  return X86EMUL_RETRY;
> -if ( rc != HVMCOPY_okay )
> +if ( rc != HVMTRANS_okay )
>  {
>  gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%"
>   PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n",
> @@ -1513,10 +1513,10 @@ static int hvmemul_rep_stos(
>  
>  switch ( rc )
>  {
> -case HVMCOPY_gfn_paged_ou

Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-20 Thread Alexandru Stefan ISAILA
On Mi, 2017-09-20 at 06:24 -0600, Jan Beulich wrote:
> >
> > >
> > > >
> > > > On 20.09.17 at 11:22,  wrote:
> > +static void *hvmemul_map_linear_addr(
> > +unsigned long linear, unsigned int bytes, uint32_t pfec,
> > +struct hvm_emulate_ctxt *hvmemul_ctxt)
> > +{
> > +struct vcpu *curr = current;
> > +void *err, *mapping;
> > +
> > +/* First and final gfns which need mapping. */
> > +unsigned long frame = linear >> PAGE_SHIFT, first = frame;
> > +unsigned long final = (linear + bytes - !!bytes) >>
> > PAGE_SHIFT;
> > +
> > +/*
> > + * mfn points to the next free slot.  All used slots have a
> > page reference
> > + * held on them.
> > + */
> > +mfn_t *mfn = &hvmemul_ctxt->mfn[0];
> > +
> > +/*
> > + * The caller has no legitimate reason for trying a zero-byte
> > write, but
> > + * final is calculate to fail safe in release builds.
> > + *
> > + * The maximum write size depends on the number of adjacent
> > mfns[] which
> > + * can be vmap()'d, accouting for possible misalignment within
> > the region.
> > + * The higher level emulation callers are responsible for
> > ensuring that
> > + * mfns[] is large enough for the requested write size.
> > + */
> > +if ( bytes == 0 ||
> > + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) )
> > +{
> > +ASSERT_UNREACHABLE();
> > +goto unhandleable;
> > +}
> > +
> > +do {
> > +enum hvm_translation_result res;
> > +struct page_info *page;
> > +pagefault_info_t pfinfo;
> > +p2m_type_t p2mt;
> > +
> > +/* Error checking.  Confirm that the current slot is
> > clean. */
> > +ASSERT(mfn_x(*mfn) == 0);
> > +
> > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT,
> > true, pfec,
> > + &pfinfo, &page, NULL, &p2mt);
> > +
> > +switch ( res )
> > +{
> > +case HVMTRANS_okay:
> > +break;
> > +
> > +case HVMTRANS_bad_linear_to_gfn:
> > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear,
> > &hvmemul_ctxt->ctxt);
> > +err = ERR_PTR(~X86EMUL_EXCEPTION);
> > +goto out;
> > +
> > +case HVMTRANS_bad_gfn_to_mfn:
> > +err = NULL;
> > +goto out;
> > +
> > +case HVMTRANS_gfn_paged_out:
> > +case HVMTRANS_gfn_shared:
> > +err = ERR_PTR(~X86EMUL_RETRY);
> > +goto out;
> > +
> > +default:
> > +goto unhandleable;
> > +}
> > +
> > +if ( p2m_is_discard_write(p2mt) )
> > +{
> > +err = ERR_PTR(~X86EMUL_OKAY);
> > +goto out;
> > +}
> > +
> > +*mfn++ = _mfn(page_to_mfn(page));
> > +
> > +} while ( ++frame < final );
> Interesting - I had specifically pointed out in a reply to v3 that
> the
> increment of mfn _cannot_ be moved down here: You're now
> leaking a page ref on the p2m_is_discard_write() error path afaict.
>
Sorry about that, I realized the error after reading it again. I'll
wait for all the comments until doing the final version.

Alex


This email was scanned by Bitdefender
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 01/15] xen: move XENMAPSPACE_grant_table code into grant_table.c

2017-09-20 Thread Julien Grall

Hi Juergen,

On 20/09/17 07:34, Juergen Gross wrote:

The x86 and arm versions of XENMAPSPACE_grant_table handling are nearly
identical. Move the code into a function in grant_table.c and add an
architecture dependant hook to handle the differences.

Switch to mfn_t in order to be more type safe.

Signed-off-by: Juergen Gross 
Reviewed-by: Paul Durrant 
Acked-by: Jan Beulich 


Acked-by: Julien Grall 

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain

2017-09-20 Thread Juergen Gross
On 20/09/17 13:48, Jan Beulich wrote:
 On 20.09.17 at 13:10,  wrote:
>> On 20/09/17 12:34, Jan Beulich wrote:
>> On 20.09.17 at 08:34,  wrote:
 --- a/xen/common/compat/grant_table.c
 +++ b/xen/common/compat/grant_table.c
 @@ -1777,13 +1784,15 @@ active_alloc_failed:
  
  static long
  gnttab_setup_table(
 -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count)
 +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count,
 +unsigned int limit_max)
  {
  struct vcpu *curr = current;
  struct gnttab_setup_table op;
  struct domain *d = NULL;
  struct grant_table *gt;
  unsigned int i;
 +long ret = 0;
>>>
>>> Wouldn't int suffice here?
>>
>> I just followed the return type of the function. I think this way is
>> cleaner, but in case you like int better I can change it.
> 
> I sort of expected this reply, but that's not in line with what you
> did in gnttab_get_status_frames() then. I think we will want to
> eventually change all function return types to int where the wider
> type isn't needed.

Okay. Should I include a patch doing that in this series or would you
prefer this cleanup being delayed to 4.11?

 @@ -3442,6 +3469,8 @@ grant_table_create(
  /* Simple stuff. */
  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
  spin_lock_init(&t->maptrack_lock);
 +t->max_grant_frames = max_grant_frames;
 +t->max_maptrack_frames = max_maptrack_frames;
>>>
>>> This together with ...
>>>
 @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, 
 unsigned int grant_frames,
  
  /* Set limits. */
  if ( !gt->active )
 +{
 +gt->max_grant_frames = grant_frames;
 +gt->max_maptrack_frames = maptrack_frames;
  ret = grant_table_init(gt);
 +}
>>>
>>> .. this raises the question of whether it is legal to decrease the
>>> limits. There may be code depending on it only ever growing.
>>
>> Before grant_table_init() has been called there is no problem
>> decreasing the limits, as nothing depending on them has been setup
>> yet.
> 
> Oh, right, I didn't pay attention to this being a one-time action.
> 
>>> Additionally to take the input values without applying some
>>> upper cap - while we have XSA-77, we still shouldn't introduce
>>> new issues making disaggregation more unsafe. Perhaps the
>>> global limits could serve as a cap here?

Thinking more about it: what can happen in worst case when no cap
is being enforced?

A domain with the privilege to create another domain with arbitrary
amounts of memory (we have no cap here) might go crazy and give the
created domain an arbitrary amount of grant frames or maptrack
frames. So what is the difference whether the memory is spent directly
for the domain or via grant frames? In both cases there will be no
memory left for other purposes. I can't see how this would be worse
than allocating the same amount of memory directly for the new domain.

>> I thought about a cap and TBH I'm not sure which would be sane to
>> apply. The global limits seem wrong, especially looking at patch 14:
>> those limits will be for dom0 only then. And dom0 won't need many
>> grant frames in the normal case...
> 
> I've been thinking about this Dom0 aspect too over lunch. What
> about allowing the hardware domain to set its limit (only upwards
> of course) in setup_table(), without any upper bound enforced?
> This would free up the globals to be used as system wide limits
> again.

This would be possible, of course.

The question is whether the need to re-allocate the frame pointer arrays
is it worth.

>> So I could make up a cap in form of either a configurable constant
>> (CONFIG_* or boot parameter?) or as a fraction of domain memory. Any
>> preferences here?
> 
> A config constant as well as a fraction of domain memory might lock
> out special purpose guests. Which would leave command line options
> - as per above perhaps the ones we already have.

In case we really need the cap parameters I'd prefer distinct ones from
the dom0 initial values.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 14/15] xen: make grant table limits boot parameters dom0 only

2017-09-20 Thread Juergen Gross
On 20/09/17 14:07, Jan Beulich wrote:
 On 20.09.17 at 08:34,  wrote:
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -2098,7 +2098,7 @@ static void __init find_gnttab_region(struct domain *d,
>>  kinfo->gnttab_size = (_etext - _stext) & PAGE_MASK;
>>  
>>  /* Make sure the grant table will fit in the region */
>> -if ( (kinfo->gnttab_size >> PAGE_SHIFT) < max_grant_frames )
>> +if ( grant_table_verify_size(d, kinfo->gnttab_size >> PAGE_SHIFT) )
>>  panic("Cannot find a space for the grant table region\n");
> 
> I can see how this may cause complications with my alternative
> proposal for the meaning of the command line options, but that's
> solvable for sure.
> 
>> @@ -3462,6 +3449,10 @@ grant_table_create(
>>  struct domain *d)
>>  {
>>  struct grant_table *t;
>> +static unsigned int max_grant_frames;
>> +static unsigned int max_maptrack_frames;
>> +integer_param("gnttab_max_frames", max_grant_frames);
>> +integer_param("gnttab_max_maptrack_frames", max_maptrack_frames);
>>  
>>  if ( (t = xzalloc(struct grant_table)) == NULL )
>>  return -ENOMEM;
>> @@ -3469,14 +3460,17 @@ grant_table_create(
>>  /* Simple stuff. */
>>  percpu_rwlock_resource_init(&t->lock, grant_rwlock);
>>  spin_lock_init(&t->maptrack_lock);
>> -t->max_grant_frames = max_grant_frames;
>> -t->max_maptrack_frames = max_maptrack_frames;
>>  
>>  /* Okay, install the structure. */
>>  d->grant_table = t;
>>  
>>  if ( d->domain_id == 0 )
>> +{
>> +t->max_grant_frames = max_grant_frames ? : 
>> DEFAULT_MAX_NR_GRANT_FRAMES;
>> +t->max_maptrack_frames =
>> +   max_maptrack_frames ? : 
>> DEFAULT_MAX_MAPTRACK_FRAMES;
>>  return grant_table_init(t);
>> +}
> 
> The use here makes it that in effect the two variables could then
> become __initdata. Maybe their definition should then even move
> here (if the other proposal would be discarded).

Is it possible in the hypervisor to access __initdata variables from non
__init functions? In the Linux kernel this is forbidden.

>> +bool __init grant_table_verify_size(struct domain *d, unsigned int frames)
> 
> const

Yes.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Julien Grall

Hi Juergen,

Sorry for the late comment.

On 20/09/17 07:34, Juergen Gross wrote:

Add a function for obtaining the highest possible physical memory
address of the system. This value is influenced by:

- hypervisor configuration (CONFIG_BIGMEM)
- processor capability (max. addressable physical memory)
- memory map at boot time
- memory hotplug capability

The value is especially needed for dom0 to decide sizing of grant frame
limits of guests and for pv domains for selecting the grant interface


Why limiting to PV domain? Arm domain may also need to switch to another 
interface between v1 only support 32-bit GFN.



version to use.

Signed-off-by: Juergen Gross 


[...]


diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index cd6dfb54b9..6aa8cba5e0 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -376,6 +376,11 @@ static inline void put_page_and_type(struct page_info 
*page)
  
  void clear_and_clean_page(struct page_info *page);
  
+static inline unsigned long arch_get_upper_mfn_bound(void)

+{
+return 0;
+}


I am not sure to understand the Arm implementation given the description 
of the commit message.


The guest layout is completely separate from the host layout. It might 
be possible to have all the memory below 40 bits on the host, but this 
does not preclude the guest to have all memory below 40 bits (the 
hardware might support, for instance, up to 48 bits).


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo

2017-09-20 Thread Julien Grall

Hi Juergen,

On 20/09/17 07:34, Juergen Gross wrote:

Add the maximum possible mfn to struct xen_sysctl_physinfo in order to
enable Xen tools to size the grant table frame limits for a domU

Signed-off-by: Juergen Gross 
---
  xen/common/sysctl.c | 1 +
  xen/include/public/sysctl.h | 2 ++
  2 files changed, 3 insertions(+)

diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index a6882d1c9d..22f5d991f6 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -266,6 +266,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
u_sysctl)
  get_outstanding_claims(&pi->free_pages, &pi->outstanding_pages);
  pi->scrub_pages = 0;
  pi->cpu_khz = cpu_khz;
+pi->max_mfn = get_upper_mfn_bound() - 1;
  arch_do_physinfo(pi);
  
  if ( copy_to_guest(u_sysctl, op, 1) )

diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 7830b987da..86b9ced86b 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -108,6 +108,8 @@ struct xen_sysctl_physinfo {
  
  /* XEN_SYSCTL_PHYSCAP_??? */

  uint32_t capabilities;
+
+uint64_t max_mfn; /* Largest possible MFN on this host */


Don't you need to bump XEN_SYSCTL_INTERFACE_VERSION because of this change?


  };
  typedef struct xen_sysctl_physinfo xen_sysctl_physinfo_t;
  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_physinfo_t);



Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v11 3/5] x86emul: Add return code information to error messages

2017-09-20 Thread Petre Ovidiu PIRCALABU
On Ma, 2017-09-19 at 09:22 -0600, Jan Beulich wrote:
> > 
> > > 
> > > > 
> > > > On 12.09.17 at 16:32,  wrote:
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -2055,7 +2055,7 @@ int hvm_emulate_one_mmio(unsigned long mfn,
> > unsigned long gla)
> >  {
> >  case X86EMUL_UNHANDLEABLE:
> >  case X86EMUL_UNIMPLEMENTED:
> > -hvm_dump_emulation_state(XENLOG_G_WARNING, "MMCFG",
> > &ctxt);
> > +hvm_dump_emulation_state(XENLOG_G_WARNING, "MMCFG", &ctxt,
> > rc);
> >  break;
> At the example of this one I think it is pretty clear that the order
> of patches would be the other way around. But I won't insist.
> 
> > 
> > @@ -2242,16 +2242,17 @@ static const char
> > *guest_x86_mode_to_str(int mode)
> >  }
> >  
> >  void hvm_dump_emulation_state(const char *loglvl, const char
> > *prefix,
> > -  struct hvm_emulate_ctxt
> > *hvmemul_ctxt)
> > +  struct hvm_emulate_ctxt
> > *hvmemul_ctxt, int rc)
> >  {
> >  struct vcpu *curr = current;
> >  const char *mode_str =
> > guest_x86_mode_to_str(hvm_guest_x86_mode(curr));
> >  const struct segment_register *cs =
> >  hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
> >  
> > -printk("%s%s emulation failed: %pv %s @ %04x:%08lx -> %*ph\n",
> > -   loglvl, prefix, curr, mode_str, cs->sel, hvmemul_ctxt-
> > >insn_buf_eip,
> > -   hvmemul_ctxt->insn_buf_bytes, hvmemul_ctxt->insn_buf);
> > +printk("%s%s emulation failed (rc=%d): %pv %s @ %04x:%08lx ->
> > %*ph\n",
> Please try to keep log messages short (but without losing relevant
> information). In the case here the "rc=" is unnecessary. With it
> dropped
I added the "rc=" to mark the distinction between "unimplemented" and
"unhandleable", as requested by Andrew Cooper for v10 

"Please modify hvm_dump_emulation_state to pass rc in, and distinguish
UNHANDLEABLE vs UNIMPLEMENTED in the printed message."


> Reviewed-by: Jan Beulich 
> 
> Jan
> 
> 
> 
> This email was scanned by Bitdefender
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 15/15] xen: add new Xen cpuid node for max address width info

2017-09-20 Thread Juergen Gross
On 20/09/17 14:18, Jan Beulich wrote:
 On 20.09.17 at 08:34,  wrote:
>> On very large hosts a guest needs to know whether it will have to
> 
> ... a PV guest ...

What about a HVM guest with (potentially) more than 16TB?

>> handle frame numbers larger than 32 bits in order to select the
>> appropriate grant interface version.
>>
>> Add a new Xen specific CPUID node to contain the maximum guest address
>> width
> 
> "guest address width" is ambiguous here, the more when looking at
> what you actually return. We should no longer allow ourselves to
> mix up the different address spaces.

I've chosen "guest address width" similar to the "guest frame number"
we already have: it is MFN based for PV and PFN based for HVM (and ARM).
I'm open for a better name.

> The limit you want to report
> here is that in MFN space, which ought to be of no relevance to
> HVM guests. Therefore I'm against uniformly exposing this (as much
> as almost no other host property should have any relevance for
> HVM guests), and would instead like to see a PV-only leaf just like
> we already have a HVM only one.

As said above: a HVM guest needs to know whether it will have to deal
with frame numbers >32 bits, too.

For HVM guests this would just be a hint that the host might be large
enough for this to happen, as even today a HVM guest could in theory
reorganize its memory map to have parts of the memory above the 16TB
border even with only rather small amounts of memory. But this would
be the problem of the guest then.

For the future I could envision a per-domain parameter setting the
upper limit where a guest could put its memory. This parameter would
then influence the new CPUID value fro HVM domains, of course.

> 
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -929,6 +929,10 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, 
>> uint32_t leaf,
>>  res->b = v->vcpu_id;
>>  break;
>>  
>> +case 5: /* Host specific parameters */
>> +res->a = generic_flsl(get_upper_mfn_bound() - 1) + PAGE_SHIFT;
>> +break;
> 
> Already when looking at the patch introducing the function I was
> wondering whether the function wouldn't better return the highest
> frame number instead of the first invalid one. From an abstract
> perspective this would allow an arch (e.g. ARM32) to report that
> all addresses are valid.

That's fine with me.

>> --- a/xen/include/public/arch-x86/cpuid.h
>> +++ b/xen/include/public/arch-x86/cpuid.h
>> @@ -85,6 +85,15 @@
>>  #define XEN_HVM_CPUID_IOMMU_MAPPINGS   (1u << 2)
>>  #define XEN_HVM_CPUID_VCPU_ID_PRESENT  (1u << 3) /* vcpu id is present in 
>> EBX 
>> */
>>  
>> -#define XEN_CPUID_MAX_NUM_LEAVES 4
>> +/*
>> + * Leaf 6 (0x4x05)
>> + * Host specific parameters
>> + * EAX: bits 0-7: max guest address width
>> + */
>> +
>> +/* Max. address width in bits taking memory hotplug into account. */
>> +#define XEN_CPUID_GUEST_ADDRESS_WIDTH_MASK (255u << 0)
> 
> Please use hex numbers for multi-digit masks, and please make
> explicit (at least in the comment, perhaps also in the constant's
> name) which "address" is being talked about here.

Okay.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 15/15] xen: add new Xen cpuid node for max address width info

2017-09-20 Thread Julien Grall

Hi Juergen,

On 20/09/17 07:34, Juergen Gross wrote:

On very large hosts a guest needs to know whether it will have to
handle frame numbers larger than 32 bits in order to select the
appropriate grant interface version.


Can you explain why you decided to have an arch specific interface 
rather than generic one?


On Arm, we would likely need to make same decision for large hosts.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result

2017-09-20 Thread Boris Ostrovsky
On 09/20/2017 08:29 AM, Jan Beulich wrote:
 On 20.09.17 at 11:22,  wrote:
>> From: Andrew Cooper 
>>
>> Signed-off-by: Andrew Cooper 
>> Acked-by: Tim Deegan 
>> Acked-by: Jan Beulich 
>> Reviewed-by: Kevin Tian 
>> Acked-by: George Dunlap 
>> ---
>>  xen/arch/x86/hvm/dom0_build.c |  2 +-
>>  xen/arch/x86/hvm/emulate.c| 40 ++--
>>  xen/arch/x86/hvm/hvm.c| 56 
>> +++
>>  xen/arch/x86/hvm/intercept.c  | 20 +++---
>>  xen/arch/x86/hvm/svm/nestedsvm.c  |  5 ++--
>>  xen/arch/x86/hvm/svm/svm.c|  2 +-
> Boris, Suravee? The missing ack is the only thing keeping this and
> patch 2 from going in.


Sorry, I thought I responded and obviously I haven't.

Reviewed-by: Boris Ostrovsky 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo

2017-09-20 Thread Juergen Gross
On 20/09/17 14:53, Julien Grall wrote:
> Hi Juergen,
> 
> On 20/09/17 07:34, Juergen Gross wrote:
>> Add the maximum possible mfn to struct xen_sysctl_physinfo in order to
>> enable Xen tools to size the grant table frame limits for a domU
>>
>> Signed-off-by: Juergen Gross 
>> ---
>>   xen/common/sysctl.c | 1 +
>>   xen/include/public/sysctl.h | 2 ++
>>   2 files changed, 3 insertions(+)
>>
>> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
>> index a6882d1c9d..22f5d991f6 100644
>> --- a/xen/common/sysctl.c
>> +++ b/xen/common/sysctl.c
>> @@ -266,6 +266,7 @@ long
>> do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>>   get_outstanding_claims(&pi->free_pages,
>> &pi->outstanding_pages);
>>   pi->scrub_pages = 0;
>>   pi->cpu_khz = cpu_khz;
>> +    pi->max_mfn = get_upper_mfn_bound() - 1;
>>   arch_do_physinfo(pi);
>>     if ( copy_to_guest(u_sysctl, op, 1) )
>> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
>> index 7830b987da..86b9ced86b 100644
>> --- a/xen/include/public/sysctl.h
>> +++ b/xen/include/public/sysctl.h
>> @@ -108,6 +108,8 @@ struct xen_sysctl_physinfo {
>>     /* XEN_SYSCTL_PHYSCAP_??? */
>>   uint32_t capabilities;
>> +
>> +    uint64_t max_mfn; /* Largest possible MFN on this host */
> 
> Don't you need to bump XEN_SYSCTL_INTERFACE_VERSION because of this change?

Thanks for noticing.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Juergen Gross
On 20/09/17 14:51, Julien Grall wrote:
> Hi Juergen,
> 
> Sorry for the late comment.
> 
> On 20/09/17 07:34, Juergen Gross wrote:
>> Add a function for obtaining the highest possible physical memory
>> address of the system. This value is influenced by:
>>
>> - hypervisor configuration (CONFIG_BIGMEM)
>> - processor capability (max. addressable physical memory)
>> - memory map at boot time
>> - memory hotplug capability
>>
>> The value is especially needed for dom0 to decide sizing of grant frame
>> limits of guests and for pv domains for selecting the grant interface
> 
> Why limiting to PV domain? Arm domain may also need to switch to another
> interface between v1 only support 32-bit GFN.

Right. And I just used that reasoning for an answer to Jan. :-)

> 
>> version to use.
>>
>> Signed-off-by: Juergen Gross 
> 
> [...]
> 
>> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
>> index cd6dfb54b9..6aa8cba5e0 100644
>> --- a/xen/include/asm-arm/mm.h
>> +++ b/xen/include/asm-arm/mm.h
>> @@ -376,6 +376,11 @@ static inline void put_page_and_type(struct
>> page_info *page)
>>     void clear_and_clean_page(struct page_info *page);
>>   +static inline unsigned long arch_get_upper_mfn_bound(void)
>> +{
>> +    return 0;
>> +}
> 
> I am not sure to understand the Arm implementation given the description
> of the commit message.
> 
> The guest layout is completely separate from the host layout. It might
> be possible to have all the memory below 40 bits on the host, but this
> does not preclude the guest to have all memory below 40 bits (the
> hardware might support, for instance, up to 48 bits).

Who is setting up the memory map for the guest then?


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation

2017-09-20 Thread George Dunlap
On 09/20/2017 12:59 PM, Jan Beulich wrote:
 On 20.09.17 at 13:44,  wrote:
>> IAC, if you're concerned about that, I'd much rather put both
>> kill_timer() and xfree() before the critical section, rather than
>> after, like in the attached patch.
> 
> Hmm, killing the timer upfront is certainly fine, but is freeing the
> data before removing the element from the list safe not only
> currently, but also going forward?

I agree with Jan -- if you don't want to put the kill_timer() in the
critical section, put it beforehand; but don't free the structure until
after the sdom struct has been removed from the list.

Sorry to be picky, but I'm positive I'm not going to remember this in
six months' time, and I'd just rather be safe.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 06/17] xen/livepatch/x86/arm32: Force .livepatch.depends section to be uint32_t aligned.

2017-09-20 Thread Wei Liu
On Tue, Sep 19, 2017 at 12:05:16PM +0100, Julien Grall wrote:
> Hi Konrad,
> 
> On 19/09/17 01:32, Konrad Rzeszutek Wilk wrote:
> > > > +.PHONY: livepatch_depends.h
> > > > +livepatch_depends.h: note.bin
> > > > +   $(shell (../../../tools/firmware/hvmloader/mkhex 
> > > > $(NOTE_DEPENDS) $^ > $@))
> > > 
> > > It looks quite odd to use a file in firmware/hvmloader for livepatch. 
> > > Would
> > > it be possible to move mkhex to a generic place?
> > 
> > Like so?
> 
> It is what I had in mind. I CCed Ian and Wei to get feedback from them.
> 

Just move it to a directory called tools/scripts?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Julien Grall

Hi Juergen,

On 20/09/17 14:08, Juergen Gross wrote:

On 20/09/17 14:51, Julien Grall wrote:

Hi Juergen,

Sorry for the late comment.

On 20/09/17 07:34, Juergen Gross wrote:

Add a function for obtaining the highest possible physical memory
address of the system. This value is influenced by:

- hypervisor configuration (CONFIG_BIGMEM)
- processor capability (max. addressable physical memory)
- memory map at boot time
- memory hotplug capability

The value is especially needed for dom0 to decide sizing of grant frame
limits of guests and for pv domains for selecting the grant interface


Why limiting to PV domain? Arm domain may also need to switch to another
interface between v1 only support 32-bit GFN.


Right. And I just used that reasoning for an answer to Jan. :-)




version to use.

Signed-off-by: Juergen Gross 


[...]


diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index cd6dfb54b9..6aa8cba5e0 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -376,6 +376,11 @@ static inline void put_page_and_type(struct
page_info *page)
 void clear_and_clean_page(struct page_info *page);
   +static inline unsigned long arch_get_upper_mfn_bound(void)
+{
+return 0;
+}


I am not sure to understand the Arm implementation given the description
of the commit message.

The guest layout is completely separate from the host layout. It might
be possible to have all the memory below 40 bits on the host, but this
does not preclude the guest to have all memory below 40 bits (the
hardware might support, for instance, up to 48 bits).


Who is setting up the memory map for the guest then?


The memory map is at the moment static and described in 
public/arch-arm.h. The guest is not allowed to assume it and should 
discover it through ACPI/DT.


There are 2 banks of memory for the guest (it depends on the amount of 
memory requested by the user):

- 3GB @ 1GB
- 1016GB @ 8GB

But the guest would be free to use the populate memory hypercall to 
allocate memory anywhere in the address space.


For Arm32, the maximum IPA (Intermediate Physical Address aka guest 
physical address on Xen) we currently support is always 40 bits.


For Arm64, this range from 32 bits to 48 bits. New hardware can support 
up to 52 bits.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 12/15] xen/arm: move arch specific grant table bits into grant_table.c

2017-09-20 Thread Julien Grall

Hi Juergen,

On 20/09/17 07:34, Juergen Gross wrote:

diff --git a/xen/include/asm-arm/grant_table.h 
b/xen/include/asm-arm/grant_table.h
index 0a248a765a..0870b5b782 100644
--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -6,6 +6,10 @@
  
  #define INITIAL_NR_GRANT_FRAMES 4
  
+struct grant_table_arch {

+gfn_t *gfn;
+};
+
  void gnttab_clear_flag(unsigned long nr, uint16_t *addr);
  int create_grant_host_mapping(unsigned long gpaddr,
  unsigned long mfn, unsigned int flags, unsigned int
@@ -22,11 +26,19 @@ static inline int replace_grant_supported(void)
  return 1;
  }
  
-static inline void gnttab_set_frame_gfn(struct domain *d, unsigned long idx,

-gfn_t gfn)
-{
-d->arch.grant_table_gfn[idx] = gfn;
-} > +#define gnttab_init_arch(gt) 

\

+( ((gt)->arch.gfn = xzalloc_array(gfn_t, max_grant_frames)) == 0 \


I am not sure to understand the 0 here. Don't you check the return of 
xzalloc_array? If so it should be NULL if it failed. And therefore the 
return error looks inverted below.



+  ? 0 : -ENOMEM )


I admit I would much prefer to see static inline rather than define. 
More typesafe and usually easier to read. You cannot do that because the 
type of gt is only defined in grant-table?


Nonetheless, I think you could clarify this code by doing:

({
   (gt)->arch.gfn = xzalloc_array();
   ( g->arch.gfn ? 0 : -ENOMEM );
)}

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address

2017-09-20 Thread Juergen Gross
On 20/09/17 16:24, Julien Grall wrote:
> Hi Juergen,
> 
> On 20/09/17 14:08, Juergen Gross wrote:
>> On 20/09/17 14:51, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> Sorry for the late comment.
>>>
>>> On 20/09/17 07:34, Juergen Gross wrote:
 Add a function for obtaining the highest possible physical memory
 address of the system. This value is influenced by:

 - hypervisor configuration (CONFIG_BIGMEM)
 - processor capability (max. addressable physical memory)
 - memory map at boot time
 - memory hotplug capability

 The value is especially needed for dom0 to decide sizing of grant frame
 limits of guests and for pv domains for selecting the grant interface
>>>
>>> Why limiting to PV domain? Arm domain may also need to switch to another
>>> interface between v1 only support 32-bit GFN.
>>
>> Right. And I just used that reasoning for an answer to Jan. :-)
>>
>>>
 version to use.

 Signed-off-by: Juergen Gross 
>>>
>>> [...]
>>>
 diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
 index cd6dfb54b9..6aa8cba5e0 100644
 --- a/xen/include/asm-arm/mm.h
 +++ b/xen/include/asm-arm/mm.h
 @@ -376,6 +376,11 @@ static inline void put_page_and_type(struct
 page_info *page)
  void clear_and_clean_page(struct page_info *page);
    +static inline unsigned long arch_get_upper_mfn_bound(void)
 +{
 +    return 0;
 +}
>>>
>>> I am not sure to understand the Arm implementation given the description
>>> of the commit message.
>>>
>>> The guest layout is completely separate from the host layout. It might
>>> be possible to have all the memory below 40 bits on the host, but this
>>> does not preclude the guest to have all memory below 40 bits (the
>>> hardware might support, for instance, up to 48 bits).
>>
>> Who is setting up the memory map for the guest then?
> 
> The memory map is at the moment static and described in
> public/arch-arm.h. The guest is not allowed to assume it and should
> discover it through ACPI/DT.

Is there any memory hotplug possible (host level, guest level)?

> There are 2 banks of memory for the guest (it depends on the amount of
> memory requested by the user):
> - 3GB @ 1GB
> - 1016GB @ 8GB
> 
> But the guest would be free to use the populate memory hypercall to
> allocate memory anywhere in the address space.

Okay, so this is similar to x86 HVM then.

> For Arm32, the maximum IPA (Intermediate Physical Address aka guest
> physical address on Xen) we currently support is always 40 bits.
> 
> For Arm64, this range from 32 bits to 48 bits. New hardware can support
> up to 52 bits.

I guess this information is included in some tables like ACPI or DT?


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain

2017-09-20 Thread Julien Grall

Hi Jan,

On 20/09/17 11:34, Jan Beulich wrote:

On 20.09.17 at 08:34,  wrote:

--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -26,8 +26,8 @@ static inline int replace_grant_supported(void)
  return 1;
  }
  
-#define gnttab_init_arch(gt) \

-( ((gt)->arch.gfn = xzalloc_array(gfn_t, max_grant_frames)) == 0 \
+#define gnttab_init_arch(gt)   \
+( ((gt)->arch.gfn = xzalloc_array(gfn_t, (gt)->max_grant_frames)) == 0 \


Mind switching to use NULL at this occasion?


This would belong to the patch #12 where it as been introduced.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/2] public/domctl: drop unnecessary typedefs and handles

2017-09-20 Thread Tamas K Lengyel
On Tue, Sep 12, 2017 at 9:08 AM, Jan Beulich  wrote:
> By virtue of the struct xen_domctl container structure, most of them
> are really just cluttering the name space.
>
> While doing so,
> - convert an enum typed (pt_irq_type_t) structure field to a fixed
>   width type,
> - make x86's paging_domctl() and descendants take a properly typed
>   handle,
> - add const in a few places.
>
> Signed-off-by: Jan Beulich 

Acked-by: Tamas K Lengyel 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 03/21] libxl/xl: use the new domain_build_info fields position

2017-09-20 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v2 03/21] libxl/xl: use the new 
domain_build_info fields position"):
> This is required because those options will be used by the new PVH
> guest type, and thus need to be shared between PV and HVM.
...
> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 02ddd2e90d..61f9a38573 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -1037,6 +1037,65 @@ void parse_config_data(const char *config_source,
>  xlu_cfg_get_defbool(config, "driver_domain", &c_info->driver_domain, 0);
>  xlu_cfg_get_defbool(config, "acpi", &b_info->acpi, 0);
>  
> +xlu_cfg_replace_string (config, "bootloader", &b_info->bootloader, 0);
> +switch (xlu_cfg_get_list_as_string_list(config, "bootloader_args",
> +&b_info->bootloader_args, 1)) {

The code motion would benefit from being moved into a pre-patch.  Is
that easily possible ?  Obviously in the new location, in the
pre-patch it would still need to be in an if () to check the guest
type.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 04/21] xl: introduce a domain type option

2017-09-20 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v2 04/21] xl: introduce a domain type option"):
> Introduce a new type option to xl configuration files in order to
> specify the domain type. This supersedes the current builder option.
...> 
> Th
>  libxl_defbool_set(&c_info->run_hotplug_scripts, run_hotplug_scripts);
>  c_info->type = LIBXL_DOMAIN_TYPE_PV;
> -if (!xlu_cfg_get_string (config, "builder", &buf, 0) &&
> -!strncmp(buf, "hvm", strlen(buf)))
> -c_info->type = LIBXL_DOMAIN_TYPE_HVM;
> +if (!xlu_cfg_get_string (config, "builder", &buf, 0)) {
> +fprintf(stderr,
> +"The builder option is being deprecated, please use type 
> instead.\n");

Line length.  Probably best to shuffle the message to the left, rather
than linewrapping, for ease of grepping.

The error message would benefit from some `quotes' around the
parameter names.

It would be nice if you did this in a way that meant that a config
file which specified both `type="hvm"' and `builder="hvm"' did not
generate a warning.

And it ought to be an error to specify `type="pv"' with
`builder="hvm"' surely.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 05/21] xl: introduce a firmware option

2017-09-20 Thread Ian Jackson
Roger Pau Monne writes ("[PATCH v2 05/21] xl: introduce a firmware option"):
> The new firmware option aims to provide a coherent way to set the
> firmware for the different kind of guests Xen supports.
...
> +=head3 Non direct Kernel Boot
> +
> +Non direct kernel boot allows booting guests with a firmware. This can be \
used
> +by all types of guests, although the selection of options is different
> +depending on the guest type.
> +
> +This option provides the flexibly of letting the guest decide which kernel\
 they
> +want to boot, while preventing having to poke at the guest file system for\
m the

Line length.  Observe the (simulated) wrap damage.

You should to consider using semantic linefeeds (semantic newlines).
That is, break the lines in the .pod file at sentence or phrase
boundaries.

I see the code has line length problems too (it wraps when I look at
it in my MUA and even more so when I try to quote it when replying).

After we've chatted about the idl compiler changes, can you fix the
long lines everywhere and resend ?

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/2] xen: add a global indicator for grant copy being available

2017-09-20 Thread Anthony PERARD
On Tue, Sep 19, 2017 at 01:50:54PM +0200, Juergen Gross wrote:
> The Xen qdisk backend needs to test whether grant copy operations is
> available in the kernel. Unfortunately this collides with using
> xengnttab_set_max_grants() on some kernels as this operation has to
> be the first one after opening the gnttab device.
> 
> In order to solve this problem test for the availability of grant copy
> in xen_be_init() opening the gnttab device just for that purpose and
> closing it again afterwards. Advertise the availability via a global
> flag and use that flag in the qdisk backend.
> 
> Signed-off-by: Juergen Gross 
> ---
>  hw/block/xen_disk.c  | 18 ++
>  hw/xen/xen_backend.c | 11 +++
>  include/hw/xen/xen_backend.h |  1 +
>  3 files changed, 18 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index d42ed7070d..6632746250 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -121,9 +121,6 @@ struct XenBlkDev {
>  unsigned intpersistent_gnt_count;
>  unsigned intmax_grants;
>  
> -/* Grant copy */
> -gbooleanfeature_grant_copy;
> -
>  /* qemu block driver */
>  DriveInfo   *dinfo;
>  BlockBackend*blk;
> @@ -616,7 +613,7 @@ static void qemu_aio_complete(void *opaque, int ret)
>  return;
>  }
>  
> -if (ioreq->blkdev->feature_grant_copy) {
> +if (xen_feature_grant_copy) {
>  switch (ioreq->req.operation) {
>  case BLKIF_OP_READ:
>  /* in case of failure ioreq->aio_errors is increased */
> @@ -638,7 +635,7 @@ static void qemu_aio_complete(void *opaque, int ret)
>  }
>  
>  ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
> -if (!ioreq->blkdev->feature_grant_copy) {
> +if (!xen_feature_grant_copy) {
>  ioreq_unmap(ioreq);
>  }
>  ioreq_finish(ioreq);
> @@ -698,7 +695,7 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>  {
>  struct XenBlkDev *blkdev = ioreq->blkdev;
>  
> -if (ioreq->blkdev->feature_grant_copy) {
> +if (xen_feature_grant_copy) {
>  ioreq_init_copy_buffers(ioreq);
>  if (ioreq->req.nr_segments && (ioreq->req.operation == 
> BLKIF_OP_WRITE ||
>  ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE) &&
> @@ -750,7 +747,7 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>  }
>  default:
>  /* unknown operation (shouldn't happen -- parse catches this) */
> -if (!ioreq->blkdev->feature_grant_copy) {
> +if (!xen_feature_grant_copy) {
>  ioreq_unmap(ioreq);
>  }
>  goto err;
> @@ -1010,18 +1007,15 @@ static int blk_init(struct XenDevice *xendev)
>  
>  blkdev->file_blk  = BLOCK_SIZE;
>  
> -blkdev->feature_grant_copy =
> -(xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 
> 0);
> -
>  xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
> -  blkdev->feature_grant_copy ? "enabled" : "disabled");
> +  xen_feature_grant_copy ? "enabled" : "disabled");
>  
>  /* fill info
>   * blk_connect supplies sector-size and sectors
>   */
>  xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1);
>  xenstore_write_be_int(&blkdev->xendev, "feature-persistent",
> -  !blkdev->feature_grant_copy);
> +  !xen_feature_grant_copy);
>  xenstore_write_be_int(&blkdev->xendev, "info", info);
>  
>  xenstore_write_be_int(&blkdev->xendev, "max-ring-page-order",
> diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
> index c46cbb0759..00210627a9 100644
> --- a/hw/xen/xen_backend.c
> +++ b/hw/xen/xen_backend.c
> @@ -44,6 +44,7 @@ BusState *xen_sysbus;
>  /* public */
>  struct xs_handle *xenstore = NULL;
>  const char *xen_protocol;
> +gboolean xen_feature_grant_copy;

I think it would be better if this was changed to bool instead of a
gboolean.

Beside that,
Acked-by: Anthony PERARD 

>  
>  /* private */
>  static int debug;
> @@ -519,6 +520,8 @@ void xenstore_update_fe(char *watch, struct XenDevice 
> *xendev)
>  
>  int xen_be_init(void)
>  {
> +xengnttab_handle *gnttabdev;
> +
>  xenstore = xs_daemon_open();
>  if (!xenstore) {
>  xen_pv_printf(NULL, 0, "can't connect to xenstored\n");
> @@ -532,6 +535,14 @@ int xen_be_init(void)
>  goto err;
>  }
>  
> +gnttabdev = xengnttab_open(NULL, 0);
> +if (gnttabdev != NULL) {
> +if (xengnttab_grant_copy(gnttabdev, 0, NULL) == 0) {
> +xen_feature_grant_copy = true;
> +}
> +xengnttab_close(gnttabdev);
> +}
> +
>  xen_sysdev = qdev_create(NULL, TYPE_XENSYSDEV);
>  qdev_init_nofail(xen_sysdev);
>  xen_sysbus = qbus_create(TYPE_XENSYSBUS, DEVICE(xen_sysdev), 
> "xen-sysbus");
> diff --git a/include/hw/xen/xen_backend.h b/include/hw/xen/xen_backend.h
> index 8

Re: [Xen-devel] [PATCH 2/2] xen: dont try setting max grants multiple times

2017-09-20 Thread Anthony PERARD
On Tue, Sep 19, 2017 at 01:50:55PM +0200, Juergen Gross wrote:
> Trying to call xengnttab_set_max_grants() with the same file handle
> might fail on some kernels, as this operation is allowed only once.
> 
> This is a problem for the qdisk backend as blk_connect() can be
> called multiple times for a domain, e.g. in case grub-xen is being
> used to boot it.
> 
> So instead of letting the generic backend code open the gnttab device
> do it in blk_connect() and close it again in blk_disconnect.
> 
> Signed-off-by: Juergen Gross 
> ---
>  hw/block/xen_disk.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 6632746250..7cff8863cb 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -1220,6 +1220,12 @@ static int blk_connect(struct XenDevice *xendev)
>  /* Add on the number needed for the ring pages */
>  max_grants += blkdev->nr_ring_ref;
>  
> +blkdev->xendev.gnttabdev = xengnttab_open(NULL, 0);
> +if (blkdev->xendev.gnttabdev == NULL) {
> +xen_pv_printf(xendev, 0, "xengnttab_open failed: %s\n",
> +  strerror(errno));
> +return -1;
> +}
>  if (xengnttab_set_max_grants(blkdev->xendev.gnttabdev, max_grants)) {
>  xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n",
>strerror(errno));
> @@ -1327,6 +1333,11 @@ static void blk_disconnect(struct XenDevice *xendev)
>  }
>  blkdev->feature_persistent = false;
>  }
> +
> +if (blkdev->xendev.gnttabdev) {
> +xengnttab_close(blkdev->xendev.gnttabdev);
> +blkdev->xendev.gnttabdev = NULL;
> +}

I think blk_disconnect needs to be called from blk_free in case where
the gnttabdev is not closed (like it is done when blk or the sring are
not cleared, in blk_free).

>  }

>  static int blk_free(struct XenDevice *xendev)
> @@ -1363,7 +1374,6 @@ static void blk_event(struct XenDevice *xendev)
>  
>  struct XenDevOps xen_blkdev_ops = {
>  .size   = sizeof(struct XenBlkDev),
> -.flags  = DEVOPS_FLAG_NEED_GNTDEV,
>  .alloc  = blk_alloc,
>  .init   = blk_init,
>  .initialise= blk_connect,
> -- 
> 2.12.3
> 

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: provide typedefs for device framework functions

2017-09-20 Thread Wei Liu
On Fri, Sep 15, 2017 at 12:01:56PM +0100, Wei Liu wrote:
> Use the new typedefs to avoid copy-n-paste everywhere.
> 
> No functional change.
> 
> Signed-off-by: Wei Liu 

I'm going to wait until Friday to apply this patch.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/PV: fix/generalize guest nul selector handling

2017-09-20 Thread Jan Beulich
FS/GS base (and limit) aren't being cleared by the loading of a nul
selector into the segment register on AMD CPUs. Therefore, if an
outgoing vCPU has a non-null base in one of these registers and the
subsequent incoming vCPU has a non-zero but nul selector in the
respective register(s), the selector value(s) would be loaded without
clearing the segment base(s) in the hidden register portion.

Since the ABI states "zero" in its description of the fields, it is
worth noting that the chosen approach to fix this alters the written
down ABI. I consider this preferrable over enforcing the previously
written down behavior, as nul selectors are far more likely to be what
was meant from the beginning.

The adjustments also eliminate an inconsistency between FS and GS
handling: Old code had an extra pointless (gs_base_user was always zero
when DIRTY_GS was set) conditional for GS. The old bitkeeper changeset
has no explanation for this asymmetry.

Inspired by Linux commit e137a4d8f4dd2e277e355495b6b2cb241a8693c3.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1260,30 +1260,28 @@ static void load_segments(struct vcpu *n
 if ( unlikely((dirty_segment_mask & DIRTY_ES) | uregs->es) )
 all_segs_okay &= loadsegment(es, uregs->es);
 
-/*
- * Either selector != 0 ==> reload.
- * Also reload to reset FS_BASE if it was non-zero.
- */
-if ( unlikely((dirty_segment_mask & (DIRTY_FS | DIRTY_FS_BASE)) |
-  uregs->fs) )
+/* Either selector != 0 ==> reload. */
+if ( unlikely((dirty_segment_mask & DIRTY_FS) | uregs->fs) )
+{
 all_segs_okay &= loadsegment(fs, uregs->fs);
+/* non-nul selector updates fs_base */
+if ( uregs->fs & ~3 )
+dirty_segment_mask &= ~DIRTY_FS_BASE;
+}
 
-/*
- * Either selector != 0 ==> reload.
- * Also reload to reset GS_BASE if it was non-zero.
- */
-if ( unlikely((dirty_segment_mask & (DIRTY_GS | DIRTY_GS_BASE_USER)) |
-  uregs->gs) )
-{
-/* Reset GS_BASE with user %gs? */
-if ( (dirty_segment_mask & DIRTY_GS) || !n->arch.pv_vcpu.gs_base_user )
-all_segs_okay &= loadsegment(gs, uregs->gs);
+/* Either selector != 0 ==> reload. */
+if ( unlikely((dirty_segment_mask & DIRTY_GS) | uregs->gs) )
+{
+all_segs_okay &= loadsegment(gs, uregs->gs);
+/* non-nul selector updates gs_base_user */
+if ( uregs->gs & ~3 )
+dirty_segment_mask &= ~DIRTY_GS_BASE_USER;
 }
 
 if ( !is_pv_32bit_vcpu(n) )
 {
 /* This can only be non-zero if selector is NULL. */
-if ( n->arch.pv_vcpu.fs_base )
+if ( n->arch.pv_vcpu.fs_base | (dirty_segment_mask & DIRTY_FS_BASE) )
 wrfsbase(n->arch.pv_vcpu.fs_base);
 
 /* Most kernels have non-zero GS base, so don't bother testing. */
@@ -1291,7 +1289,8 @@ static void load_segments(struct vcpu *n
 wrmsrl(MSR_SHADOW_GS_BASE, n->arch.pv_vcpu.gs_base_kernel);
 
 /* This can only be non-zero if selector is NULL. */
-if ( n->arch.pv_vcpu.gs_base_user )
+if ( n->arch.pv_vcpu.gs_base_user |
+ (dirty_segment_mask & DIRTY_GS_BASE_USER) )
 wrgsbase(n->arch.pv_vcpu.gs_base_user);
 
 /* If in kernel mode then switch the GS bases around. */
@@ -1426,22 +1425,22 @@ static void save_segments(struct vcpu *v
 if ( regs->fs || is_pv_32bit_vcpu(v) )
 {
 dirty_segment_mask |= DIRTY_FS;
-v->arch.pv_vcpu.fs_base = 0; /* != 0 selector kills fs_base */
+/* non-nul selector kills fs_base */
+if ( regs->fs & ~3 )
+v->arch.pv_vcpu.fs_base = 0;
 }
-else if ( v->arch.pv_vcpu.fs_base )
-{
+if ( v->arch.pv_vcpu.fs_base )
 dirty_segment_mask |= DIRTY_FS_BASE;
-}
 
 if ( regs->gs || is_pv_32bit_vcpu(v) )
 {
 dirty_segment_mask |= DIRTY_GS;
-v->arch.pv_vcpu.gs_base_user = 0; /* != 0 selector kills gs_base_user 
*/
+/* non-nul selector kills gs_base_user */
+if ( regs->gs & ~3 )
+v->arch.pv_vcpu.gs_base_user = 0;
 }
-else if ( v->arch.pv_vcpu.gs_base_user )
-{
+if ( v->arch.pv_vcpu.gs_base_user )
 dirty_segment_mask |= DIRTY_GS_BASE_USER;
-}
 
 this_cpu(dirty_segment_mask) = dirty_segment_mask;
 }
--- a/xen/include/public/arch-x86/xen-x86_64.h
+++ b/xen/include/public/arch-x86/xen-x86_64.h
@@ -203,8 +203,8 @@ struct cpu_user_regs {
 uint16_t ss, _pad2[3];
 uint16_t es, _pad3[3];
 uint16_t ds, _pad4[3];
-uint16_t fs, _pad5[3]; /* Non-zero => takes precedence over fs_base. */
-uint16_t gs, _pad6[3]; /* Non-zero => takes precedence over gs_base_usr. */
+uint16_t fs, _pad5[3]; /* Non-nul => takes precedence over fs_base.  */
+uint16_t gs, _pad6[3]; /* Non-nul => takes precedence over gs_base_user. */
 };
 typedef struct cpu_user_regs cpu_user_regs_t;
 DEFINE_XEN_GUEST_HAN

Re: [Xen-devel] [PATCH] build: propagate runstatedir

2017-09-20 Thread Wei Liu
On Wed, Sep 13, 2017 at 10:11:25AM +0100, Wei Liu wrote:
> On Wed, Sep 13, 2017 at 11:07:15AM +0200, Juergen Gross wrote:
> > On 13/09/17 10:52, Wei Liu wrote:
> > > Debian Stretch's autoconf has a new variable called runstatedir. Add
> > > that to Paths.mk.in.
> > > 
> > > Signed-off-by: Wei Liu 
> > 
> > You did think of
> > 
> > https://lists.xen.org/archives/html/xen-devel/2017-02/msg01883.html
> > 
> > I suppose?
> > 
> 
> Yes. That was added at a time when autoconf didn't have runstatedir
> iirc. Now autoconf has that. There is no reason not to propagate it.

Thinking about this a bit more, exposing runstatedir like this doesn't
have much value. I want to make it work with the existing option that we
already have. 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.

2017-09-20 Thread Konrad Rzeszutek Wilk
On Tue, Sep 19, 2017 at 09:04:44AM -0600, Jan Beulich wrote:
> >>> On 18.09.17 at 21:37,  wrote:
> > On Tue, Sep 12, 2017 at 02:57:04AM -0600, Jan Beulich wrote:
> >> >>> On 12.09.17 at 02:22,  wrote:
> >> > If I compile the test-case under ARM32 it works OK (as the
> >> > .livepatch.depends ends up being aligned to four bytes).
> >> 
> >> So why is that? What entity is creating this section (or the
> >> directive(s) to create it)?
> > 
> > gcc
> > 
> > Looking at the xen_bye_world.o produced by cross-compiler:
> > 
> > xen_bye_world.o: file format elf32-littlearm
> > 
> > Contents of section .rodata:
> >   78656e5f 65787472 615f7665 7273696f  xen_extra_versio
> >  0010 6e00 n. 
> > 
> > And native:
> > 
> > xen_bye_world.o: file format elf32-littlearm
> > 
> > Contents of section .rodata:
> >   78656e5f 65787472 615f7665 7273696f  xen_extra_versio
> >  0010 6e00 n...  
> 
> This may rather be a gas than a gcc behavioral difference. What's
> the alignment of .rodata in both cases?

Cross:

* on the livepatch:
..snip..
  [ 4] .rodata   PROGBITS 74 12 00   A  0   0  4
  [ 5] .rodata.str1.4PROGBITS 88 0b 01 AMS  0   0  4
  [ 6] .livepatch.depend PROGBITS 93 24 00   A  0   0  1

* on the .o file:
Section Headers:
  [Nr] Name  TypeAddr OffSize   ES Flg Lk Inf Al
.. snip..
  [ 1] .text PROGBITS 34 00 00  AX  0   0  1
  [ 2] .data PROGBITS 34 00 00  WA  0   0  1
  [ 3] .bss  NOBITS   34 00 00  WA  0   0  1
  [ 4] .rodata   PROGBITS 34 14 00   A  0   0  4
  [ 5] .livepatch.funcs  PROGBITS 48 34 00  WA  0   0  4

Native:

 * on the livepatch:
..snip..
  [ 4] .rodata   PROGBITS 74 14 00   A  0   0  4
  [ 5] .rodata.str1.4PROGBITS 88 0c 01 AMS  0   0  4
  [ 6] .livepatch.depend PROGBITS 94 24 00   A  0   0  1

* on the .o file:
..snip..
  [ 1] .text PROGBITS 34 00 00  AX  0   0  1
  [ 2] .data PROGBITS 34 00 00  WA  0   0  1
  [ 3] .bss  NOBITS   34 00 00  WA  0   0  1
  [ 4] .rodata   PROGBITS 34 12 00   A  0   0  4
  [ 5] .livepatch.funcs  PROGBITS 48 34 00  WA  0   0  4



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH V3 1/3] Xen: Increase hap/shadow page pool size to support more vcpus support

2017-09-20 Thread Wei Liu
On Tue, Sep 19, 2017 at 11:06:26AM +0800, Lan Tianyu wrote:
> Hi Wei:
> 
> On 2017年09月18日 21:06, Wei Liu wrote:
> > On Wed, Sep 13, 2017 at 12:52:47AM -0400, Lan Tianyu wrote:
> >> This patch is to increase page pool size when max vcpu number is larger
> >> than 128.
> >>
> >> Signed-off-by: Lan Tianyu 
> >> ---
> >>  xen/arch/arm/domain.c|  5 +
> >>  xen/arch/x86/domain.c| 25 +
> >>  xen/common/domctl.c  |  3 +++
> >>  xen/include/xen/domain.h |  2 ++
> >>  4 files changed, 35 insertions(+)
> >>
> >> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> >> index 6512f01..94cf70b 100644
> >> --- a/xen/arch/arm/domain.c
> >> +++ b/xen/arch/arm/domain.c
> >> @@ -824,6 +824,11 @@ int arch_vcpu_reset(struct vcpu *v)
> >>  return 0;
> >>  }
> >>  
> >> +int arch_domain_set_max_vcpus(struct domain *d)
> >> +{
> >> +return 0;
> >> +}
> >> +
> >>  static int relinquish_memory(struct domain *d, struct page_list_head 
> >> *list)
> >>  {
> >>  struct page_info *page, *tmp;
> >> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> >> index dbddc53..0e230f9 100644
> >> --- a/xen/arch/x86/domain.c
> >> +++ b/xen/arch/x86/domain.c
> >> @@ -1161,6 +1161,31 @@ int arch_vcpu_reset(struct vcpu *v)
> >>  return 0;
> >>  }
> >>  
> >> +int arch_domain_set_max_vcpus(struct domain *d)
> > 
> > The name doesn't match what the function does.
> > 
> 
> I originally hoped to introduce a hook for each arch when set max vcpus.
> Each arch function can do customized thing and so named
> "arch_domain_set_max_vcpus".
> 
> How about "arch_domain_setup_vcpus_resource"?

Before you go away and do a lot of work, please let us think about if
this is the right approach first.

We are close to freeze, with the amount of patches we receive everyday
RFC patch like this one is low on my (can't speak for others) priority
list. I am not sure when I will be able to get back to this, but do ping
us if you want to know where things stand.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 113621: all pass - PUSHED

2017-09-20 Thread osstest service owner
flight 113621 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113621/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8
baseline version:
 ovmf 424a5ec33b3d5a842bff3f4695d0bd709c91a163

Last test of basis   113608  2017-09-19 16:50:44 Z0 days
Testing same since   113621  2017-09-20 06:49:55 Z0 days1 attempts


People who touched revisions under test:
  Jiewen Yao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
 export PERLLIB=.:.
 PERLLIB=.:.
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8
+ branch=ovmf
+ revision=70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
 export PERLLIB=.:.:.
 PERLLIB=.:.:.
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
+++ export PERLLIB=.:.:.:.
+++ PERLLIB=.:.:.:.
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.9-testing
+ '[' x70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++

Re: [Xen-devel] Booting signed xen.efi through shim

2017-09-20 Thread Tamas K Lengyel
On Wed, Sep 20, 2017 at 12:30 AM, Jan Beulich  wrote:
 On 20.09.17 at 00:23,  wrote:
>> On Mon, Sep 18, 2017 at 2:58 AM, Jan Beulich  wrote:
>> On 14.09.17 at 18:20,  wrote:
 Of course, you can grab them from here:
 https://drive.google.com/drive/folders/0B5duyI9SzNtWaXE0cjM1QzZJbVk?usp=shar
 ing
>>>
>>> So the dumps of the two (using my own tool) are identical except for
>>> the expected difference due to the certificate. In particular neither
>>> image has any strange relocation types afaics, and both have the
>>> sort of unexpected, but also supposedly benign
>>> IMAGE_SCN_LNK_NRELOC_OVFL flag set for .bss. Hence I'm afraid ...
>>>
 I've verified that xen-signed.efi boots with Secureboot enabled when
 booted directly but doesn't boot through the shim.
>>>
>>> ... you'll need to do some debugging in order to figure out what's
>>> going on here. With the above the prime suspect is the shim though,
>>> fiddling with the image after loading it into memory. So perhaps
>>> dumping the .reloc section contents in order to compare it with
>>> what's in the image may be a suitable approach.
>>
>> Yeap, the shim pretty simply removed the .reloc section as it was
>> marked discardable and did the relocations for Xen. So with that
>> removed from the shim I no longer get the error and I see that the
>> dom0 kernel gets verified using the shim lock protocol.
>
> So did you instead try whether simply clearing the discardable
> flag from the section also helps? The flag ought to matter in
> paged environments only (which EFI isn't despite paging being
> enabled), but as we see some people think otherwise.

Yes, that would work. Even if the shim does its relocations everything
"just works" as long as Xen can also find the .reloc section. For now
it was just simpler for me to patch the shim to copy the reloc section
but it would be ideal if the xen.efi that's produced during
compilation would not have that flag set to begin with. I did search
around briefly to see where that flag is coming from but the only
reference to it within Xen I found was arch/x86/efi/mkreloc.c. So sans
writing a separate tool that binary patches xen.efi after compilation
to remove that flag I'm not sure how to get that done.

>
>> I still didn't
>> get dom0 to boot for some reason but that might be an unrelated issue
>> (and I have no serial console right now). Nevertheless, progress!
>
> And it doesn't get far enough for you to see any output at all?
> Did you try "earlyprintk=xen" on the kernel command line and/or
> "vga=keep" on the Xen one?

I tried with both just now, no output at all from Xen on the screen
after it exits EFI boot. I also couldn't get any output from it on my
other laptop with Intel AMT.

I did manage to get another Linux kernel booting but my goal was to
get a shim verified dom0 kernel booting without an initrd image as
right now the ramdisk is not verified by the shim (also not sure how
that's supposed to work as sbsign/pesign can only deal with PE/COFF
binaries which the ramdisk isn't).

Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 10/22] xentoolcore_restrict_all: Implement for libxenforeignmemory

2017-09-20 Thread Wei Liu
On Tue, Sep 19, 2017 at 12:08:27PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH 10/22] xentoolcore_restrict_all: Implement for 
> libxenforeignmemory"):
> > Sure that's fine.
> 
> I have this now:
> 
>  * If called again with the same domid, it may succeed, or it may
>  * fail (even though such a call is potentially meaningful).
>  * (If called again with a different domid, it will necessariloy
>  * fail.)

Typo "necessariloy".

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/22] xentoolcore, _restrict_all: Introduce new library and implementation

2017-09-20 Thread Wei Liu
On Tue, Sep 19, 2017 at 12:04:42PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH 03/22] xentoolcore, _restrict_all: Introduce new 
> library and implementation"):
> > The impression I get from the name and parameter from this API is that
> > the following use case is allowed: a device model serving multiple
> > domains.
> 
> Such a device model is obviously, by necessity, unrestricted.
> 
> > The device model will open two sets of handlers of various
> > libraries. The device model will call restrict_all(domid) to restrict
> > its own privileges on certain domid when it sees fit.
> 
> After it has restricted its privileges to domid A, it is no longer
> permitted to do things to domid B.  Attempting to call restrict_all(B)
> will fail.
> 
> > Without filtering, the callbacks are called for all the domains at the
> > same time. The code as-is, when resctrict_all(dom1) is called, makes
> > privileges on dom2 are also dropped sometimes -- imagine a xenstore
> > callback registered for dom2 is called, which makes the connection
> > unusable for dom2.
> > 
> > If the aforementioned use case is not anticipated, I think we shouldn't
> > accept domid parameter for the resctrict_all function.
> 
> But the domid is precisely the scope of the intended restriction.
> After making the call, the malign influence of the calling process is
> limited to the specified domid (at least, insofar as the malign
> influence is exercised via already-open Xen library handles).
> 

Ah, I know where I got myself confused.

You can have my Ack on this patch.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 22/22] RFC: tools: xentoolcore_restrict_all: use domid_t

2017-09-20 Thread Wei Liu
On Tue, Sep 19, 2017 at 12:01:15PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH 22/22] RFC: tools: xentoolcore_restrict_all: use 
> domid_t"):
> > On Fri, Sep 15, 2017 at 07:48:59PM +0100, Ian Jackson wrote:
> > > This is an RFC because it does not currently compile, because not all
> > > the places that use xentoolcore have a definition of domid in scope!
> ...
> > If the places you mentioned are in xen.git we should probably fix them.
> > If they are external users like QEMU then we probably can't do much.
> 
> They are in xen.git.  The difficulty is that they don't currently
> include the right headers.  Also, in general, much of the code in
> xen.git doesn't always use domid_t.
> 
> (Partly because domid_t is a weird short type.)

I would very much like to have domid_t in the toolcore API, just to be
consistent with xendevicemodel APIs. I think we should fix xen.git --
including the right header in the places which use toolcore APIs doesn't
sound too hard for me.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


  1   2   3   >