[Xen-devel] minimum Python version
Hello, for quite a while (apparently as of Marek's series in February) I've been seeing xen/lowlevel/xc/xc.c: In function ‘pyxc_dom_extract_cpuid’: xen/lowlevel/xc/xc.c:692:13: error: implicit declaration of function ‘PyBytes_AS_STRING’ [-Werror=implicit-function-declaration] xen/lowlevel/xc/xc.c:692:21: error: assignment makes pointer from integer without a cast [-Werror] xen/lowlevel/xc/xc.c: In function ‘pyxc_create_cpuid_dict’: xen/lowlevel/xc/xc.c:707:29: error: implicit declaration of function ‘PyBytes_FromString’ [-Werror=implicit-function-declaration] xen/lowlevel/xc/xc.c:707:29: error: passing argument 3 of ‘PyDict_SetItemString’ makes pointer from integer without a cast [-Werror] In file included from /usr/include/python2.4/Python.h:94:0, from xen/lowlevel/xc/xc.c:7: /usr/include/python2.4/dictobject.h:128:17: note: expected ‘struct PyObject *’ but argument is of type ‘int’ xen/lowlevel/xc/xc.c: In function ‘pyxc_readconsolering’: xen/lowlevel/xc/xc.c:931:5: error: implicit declaration of function ‘PyBytes_FromStringAndSize’ [-Werror=implicit-function-declaration] xen/lowlevel/xc/xc.c:931:9: error: assignment makes pointer from integer without a cast [-Werror] xen/lowlevel/xc/xc.c: In function ‘PyXc_dealloc’: xen/lowlevel/xc/xc.c:2650:5: error: implicit declaration of function ‘Py_TYPE’ [-Werror=implicit-function-declaration] xen/lowlevel/xc/xc.c:2650:18: error: invalid type argument of ‘->’ (have ‘int’) cc1: all warnings being treated as errors on an older system having Python 2.4 on it. The minimum version configure checks for appears to be 2.3, matching up with what ./README says. Since the Python bindings aren't something one absolutely needs (afaict), wouldn't it be reasonable to check for a suitable version and if that fails simply disable their building (which currently is unconditional)? What I'd like to avoid is ending up with being unable to build the tools on that system - I know it's rather old, but there's a reason I'd like to keep it (including its old distro level). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 04/15] x86: implement data structure and CPU init flow for MBA
>>> On 20.09.17 at 05:22, wrote: > On 17-09-19 09:55:28, Roger Pau Monn wrote: >> On Tue, Sep 05, 2017 at 05:32:26PM +0800, Yi Sun wrote: >> > @@ -1389,6 +1480,7 @@ static void psr_cpu_init(void) >> > unsigned int socket, cpu = smp_processor_id(); >> > struct feat_node *feat; >> > struct cpuid_leaf regs; >> > +uint32_t reg_b; >> >> Not sure of the benefit between using regs.b or reg_b (it's only 1 >> char shorter). >> > You can see the 'regs' is overwritten in below codes so that the 'regs.b' is > not > kept. To add a new local variable 'reg_b' here, we can avoid calling > 'cpuid_count_leaf' for L2 CAT and MBA. In which case - wouldn't "ebx" be a better name for the variable? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] minimum Python version
On 20/09/17 09:00, Jan Beulich wrote: > Hello, > > for quite a while (apparently as of Marek's series in February) I've > been seeing > > xen/lowlevel/xc/xc.c: In function ‘pyxc_dom_extract_cpuid’: > xen/lowlevel/xc/xc.c:692:13: error: implicit declaration of function > ‘PyBytes_AS_STRING’ [-Werror=implicit-function-declaration] > xen/lowlevel/xc/xc.c:692:21: error: assignment makes pointer from integer > without a cast [-Werror] > xen/lowlevel/xc/xc.c: In function ‘pyxc_create_cpuid_dict’: > xen/lowlevel/xc/xc.c:707:29: error: implicit declaration of function > ‘PyBytes_FromString’ [-Werror=implicit-function-declaration] > xen/lowlevel/xc/xc.c:707:29: error: passing argument 3 of > ‘PyDict_SetItemString’ makes pointer from integer without a cast [-Werror] > In file included from /usr/include/python2.4/Python.h:94:0, > from xen/lowlevel/xc/xc.c:7: > /usr/include/python2.4/dictobject.h:128:17: note: expected ‘struct PyObject > *’ but argument is of type ‘int’ > xen/lowlevel/xc/xc.c: In function ‘pyxc_readconsolering’: > xen/lowlevel/xc/xc.c:931:5: error: implicit declaration of function > ‘PyBytes_FromStringAndSize’ [-Werror=implicit-function-declaration] > xen/lowlevel/xc/xc.c:931:9: error: assignment makes pointer from integer > without a cast [-Werror] > xen/lowlevel/xc/xc.c: In function ‘PyXc_dealloc’: > xen/lowlevel/xc/xc.c:2650:5: error: implicit declaration of function > ‘Py_TYPE’ [-Werror=implicit-function-declaration] > xen/lowlevel/xc/xc.c:2650:18: error: invalid type argument of ‘->’ (have > ‘int’) > cc1: all warnings being treated as errors > > on an older system having Python 2.4 on it. The minimum version > configure checks for appears to be 2.3, matching up with what > ./README says. Since the Python bindings aren't something one > absolutely needs (afaict), wouldn't it be reasonable to check for > a suitable version and if that fails simply disable their building > (which currently is unconditional)? What I'd like to avoid is ending > up with being unable to build the tools on that system - I know > it's rather old, but there's a reason I'd like to keep it (including its > old distro level). Hmm, remembers me to my attempt removing most of the python bindings in 2015. I think we should try to avoid such problems in the future by (at least) adding a comment to xen/lowlevel/xc/xc.c that any enhancement should be done only in case of a real need. New bindings should only be added with mentioning who will need this binding in the commit message. Existing bindings should rather be deleted than updated in case an underlying libxc function is being modified (unless there is a user of that binding, of course). For your case: what about adding a configure option to disable building the python bindings (and any dependencies like e.g. pygrub) instead of doing so in case of a version mismatch? This would avoid any surprises in case someone didn't notice that the bindings have been disabled. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 14/15] tools: implement new generic set value interface and MBA set value command
On 17-09-19 12:30:59, Roger Pau Monn� wrote: > > int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid, > > @@ -458,7 +432,33 @@ int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid, > >libxl_psr_type type, libxl_bitmap *target_map, > >uint64_t val) > > { > > -return ERROR_FAIL; > > +GC_INIT(ctx); > > +int rc; > > +int socketid, nr_sockets; > > You could fit them all in a single line. > Yes, thanks. > > +int main_psr_mba_set(int argc, char **argv) > > +{ > > +uint32_t domid; > > +libxl_psr_type type; > > +uint64_t thrtl; > > +int ret, opt = 0; > > +libxl_bitmap target_map; > > +char *value; > > +libxl_string_list socket_list; > > +unsigned long start, end; > > +unsigned int i, j, len; > > + > > +static const struct option opts[] = { > > +{"socket", 1, 0, 's'}, > > +COMMON_LONG_OPTS > > +}; > > + > > +libxl_socket_bitmap_alloc(ctx, &target_map, 0); > > +libxl_bitmap_set_none(&target_map); > > + > > +SWITCH_FOREACH_OPT(opt, "s:", opts, "psr-mba-set", 0) { > > +case 's': > > +trim(isspace, optarg, &value); > > +split_string_into_string_list(value, ",", &socket_list); > > +len = libxl_string_list_length(&socket_list); > > +for (i = 0; i < len; i++) { > > + parse_range(socket_list[i], &start, &end); > > Indentation. > Sorry. > > +for (j = start; j <= end; j++) > > +libxl_bitmap_set(&target_map, j); > > +} > > + > > +libxl_string_list_dispose(&socket_list); > > +free(value); > > +break; > > +} > > + > > +type = LIBXL_PSR_CBM_TYPE_MBA_THRTL; > > + > > +if (libxl_bitmap_is_empty(&target_map)) > > +libxl_bitmap_set_any(&target_map); > > + > > +if (argc != optind + 2) { > > +help("psr-mba-set"); > > +return 2; > > +} > > Can you do this check at the beginning of the function? Also why > return 2 instead of EXIT_FAILURE? > Yes, will move it to the beginning. Will return EXIT_FAILURE. > Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 15/15] docs: add MBA description in docs
On 17-09-19 12:37:24, Roger Pau Monn� wrote: > On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote: > > +Intel Skylake and later server platforms offer capabilities to configure > > and > > +make use of the Memory Bandwidth Allocation (MBA) mechanisms, which > > provides > > +OS/VMMs the ability to slow misbehaving apps/VMs or create advanced > > closed-loop > > I don't get the 'closed-loop' thing again, but that might just be me > since I'm not a native speaker. > Will modify this to be same as feature doc. [...] > > +In the linear mode the input precision is defined as 100-(THRTL_MAX). > > Values > > +not an even multiple of the precision (e.g., 12%) will be rounded down > > (e.g., > > +to 10% delay applied). >^ s/applied/by the hardware/ > Thanks! > Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 04/15] x86: implement data structure and CPU init flow for MBA
On 17-09-20 01:11:50, Jan Beulich wrote: > >>> On 20.09.17 at 05:22, wrote: > > On 17-09-19 09:55:28, Roger Pau Monn wrote: > >> On Tue, Sep 05, 2017 at 05:32:26PM +0800, Yi Sun wrote: > >> > @@ -1389,6 +1480,7 @@ static void psr_cpu_init(void) > >> > unsigned int socket, cpu = smp_processor_id(); > >> > struct feat_node *feat; > >> > struct cpuid_leaf regs; > >> > +uint32_t reg_b; > >> > >> Not sure of the benefit between using regs.b or reg_b (it's only 1 > >> char shorter). > >> > > You can see the 'regs' is overwritten in below codes so that the 'regs.b' > > is not > > kept. To add a new local variable 'reg_b' here, we can avoid calling > > 'cpuid_count_leaf' for L2 CAT and MBA. > > In which case - wouldn't "ebx" be a better name for the variable? > Thanks for the suggestion! > Jan > > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC] Unicore Subproject Proposal
Hi Lars, Great news, thanks! We sent v2 to the mailing list on the 18th, 2 days ago. When you say “post v2 first”, should we be posting it somewhere else too? Thanks, — Felipe Dr. Felipe Huici Chief Researcher, Networked Systems and Data Analytics Group NEC Laboratories Europe, Network Research Division Kurfuerstenanlage 36, D-69115 Heidelberg Tel. +49 (0)6221 4342-241 Fax: +49 (0)6221 4342-155 e-mail: felipe.hu...@neclab.eu NEC Europe Limited Registered Office: NEC House, 1 Victoria Road, London W3 6BL Registered in England 2832014 On 9/20/17, 2:20 AM, "Lars Kurth" wrote: >Felipe, Simon, >a quick note to let you know that the Advisory Board in today’s AB >meeting decided to endorse your proposal. >Let me know how you proceed: from my perspective, we can kick off a >formal vote before you make modifications to the proposal, but I think it >is better to post v2 first. >Lars > >On 07/09/2017, 05:26, "Felipe Huici" wrote: > >Dear all, > >Following up on discussions that Simon Kuenzer had with several of >you at >the last Xen summit, we’re now submitting a Xen subproject proposal >based >on our Unicore work. Could you please review it? > >Thanks, > >Felipe Huici & Simon Kuenzer - NEC Labs Heidelberg. > > >PROPOSAL: Unicore >= > >Roles >- >Project Leads: Simon Kuenzer (main lead) > Felipe Huici (co-lead) > Florian Schmidt (co-lead) >Project Mentor: Lars Kurth >Project Sponsor: -To be found- > >Background >-- >In recent years, several papers and projects dedicated to unikernels >have >shown the immense potential for performance gains that these have. By >leveraging specialization and the use of minimalistic OSes, >unikernels are >able to yield impressive numbers, including fast instantiation times >(tens >of milliseconds or less), tiny memory footprints (a few MBs or even >KBs), >high network throughput (10-40 Gb/s), and high consolidation (e.g., >being >able to run thousands of instances on a single commodity server), not >to >mention a reduced attack surface and the potential for easier >certification. Unikernel projects worthy of mention include MirageOS, >ClickOS, Erlang on Xen, OSv, HALVM, and Minicache, among others. > >The fundamental drawback of unikernels is that they require that >applications be manually ported to the underlying minimalistic OS >(e.g. >having to port nginx, snort, mysql or memcached to MiniOS or OSv); >this >requires both expert work and often considerable amount of time. In >essence, we need to pick between either high performance with >unikernels, >or no porting effort but decreased performance and decreased >efficiency >with standard OS/VM images. The goal of this proposal is to change >this >status quo by providing a highly configurable unikernel code base; we >call >this base Unicore. > >This project also aims to concentrate the various efforts currently >going >on in the Xen community regarding minimalistic OSes (essentially >different >variants of MiniOS). We think that splitting the community across >these >variants is counter-productive and hope that Unicore will provide a >common >place for all or most improvements and customizations of minimalistic >OSes. The long term goal is to replace something like MiniOS with a >tool >that can automatically build such a minimalistic OS. > >Unicore - The "Unikernel Core" >- >The high level goal of Unicore is to be able to build unikernels >targeted >at specific applications without requiring the time-consuming, expert >work >that building such a unikernel requires today. An additional goal (or >hope) of Unicore is that all developers interested in unikernel >development would contribute by supplying libraries rather than >working on >independent projects with different code bases as it is done now. The >main >idea behind Unicore is depicted in Figure 1 and consists of two basic >components: > > >[Attachment: unicore-oneslider.pdf] > > >Figure 1. Unicore architecture. > > >Library pools would contain libraries that the user of Unicore can >select >from to create the unikernel. From the bottom up, library pools are >organized into (1) the architecture library tool, containing libraries >specific to a computer architecture (e.g., x86_64, ARM32 or MIPS); >(2) the >platform tool, where target platforms can be Xen, KVM, bare metal >(i.e. no >virtualization) and user-space Linux; and (3) the main library pool, >containing a rich set of functionality to build the unikernel from. >This >last librar
Re: [Xen-devel] minimum Python version
>>> On 20.09.17 at 09:20, wrote: > For your case: what about adding a configure option to disable building > the python bindings (and any dependencies like e.g. pygrub) instead of > doing so in case of a version mismatch? This would avoid any surprises > in case someone didn't notice that the bindings have been disabled. That's certainly an option, but I'm generally not very happy with having to add such options. My script invoking configure becomes more and more unreadable because of such, since this way the requirement to do the version check gets moved from the build system (i.e. configure, whose job this is imo) to that script. And as you can probably understand I don't want to unilaterally disable the building of that code, to be able to build test possible changes I may end up making to e.g. libxc. IOW if such a configure option was added, I'd still think it should default to disabled if the Python in use is not new enough. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1] x86/vvmx: add hvm_intsrc_vector support to nvmx_intr_intercept()
> From: Sergey Dyasli [mailto:sergey.dya...@citrix.com] > Sent: Wednesday, September 13, 2017 9:01 PM > > Under the following circumstances: > > 1. L1 doesn't enable PAUSE exiting or PAUSE-loop exiting controls > 2. L2 executes PAUSE in a loop with RFLAGS.IE == 0 > > L1's PV IPI through event channel will never reach the target L1's vCPU > which runs L2 because nvmx_intr_intercept() doesn't know about > hvm_intsrc_vector. This leads to infinite L2 loop without nested > vmexits and can cause L1 to hang. > > The issue is easily reproduced with Qemu/KVM on CentOS-7-1611 as L1 > and an L2 guest with SMP. > > Fix nvmx_intr_intercept() by injecting hvm_intsrc_vector irq into L1 > which will cause nested vmexit. > > Signed-off-by: Sergey Dyasli Acked-by: Kevin Tian ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [distros-debian-squeeze test] 72130: tolerable trouble: broken/fail/pass
flight 72130 distros-debian-squeeze real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/72130/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: build-arm64-pvops 2 hosts-allocate broken like 72101 build-arm64 2 hosts-allocate broken like 72101 build-arm64 3 capture-logs broken like 72101 build-arm64-pvops 3 capture-logs broken like 72101 test-amd64-amd64-amd64-squeeze-netboot-pygrub 10 debian-di-install fail blocked in 72101 test-amd64-amd64-i386-squeeze-netboot-pygrub 10 debian-di-install fail blocked in 72101 test-amd64-i386-i386-squeeze-netboot-pygrub 10 debian-di-install fail like 72101 test-amd64-i386-amd64-squeeze-netboot-pygrub 10 debian-di-install fail like 72101 baseline version: flight 72101 jobs: build-amd64 pass build-arm64 broken build-armhf pass build-i386 pass build-amd64-pvopspass build-arm64-pvopsbroken build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-amd64-squeeze-netboot-pygrubfail test-amd64-i386-amd64-squeeze-netboot-pygrub fail test-amd64-amd64-i386-squeeze-netboot-pygrub fail test-amd64-i386-i386-squeeze-netboot-pygrub fail sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/1] public/io/netif.h: add gref mapping control messages
> -Original Message- > From: Joao Martins [mailto:joao.m.mart...@oracle.com] > Sent: 19 September 2017 20:08 > To: Xen-devel > Cc: Wei Liu ; Paul Durrant ; > Konrad Rzeszutek Wilk ; Joao Martins > > Subject: [PATCH v4 1/1] public/io/netif.h: add gref mapping control messages > > Adds 3 messages to allow guest to let backend keep grants mapped, > such that 1) guests allowing fast recycling of pages can avoid doing > grant ops for those cases, or otherwise 2) preferring copies over > grants and 3) always using a fixed set of pages for network I/O. > > The three control ring messages added are: > - Add grefs to be mapped by backend > - Remove grefs mappings (If they are not in use) > - Get maximum amount of grefs kept mapped. > > Signed-off-by: Joao Martins I think the text is clear enough now. Now for the netfront/netback patches :-) Reviewed-by: Paul Durrant > --- > v4: > * Declare xen_netif_gref parameters are input or output. > * Clarify status field and that it doesn't require to be set to zero > prior to its usage. > * Clarify on ADD_GREF_MAPPING is 'all or nothing' > * Improve last paragraph of DEL_GREF_MAPPING > > v3: > * Use DEL for unmapping grefs instead of PUT > * Rname from xen_netif_gref_alloc to xen_netif_gref > * Add 'status' field on xen_netif_gref > * Clarify what 'inflight' means > * Use "beginning of the page" instead of "beginning of the grant" > * Mention that page needs to be r/w (as it will have to modify \.status) > --- > xen/include/public/io/netif.h | 123 > ++ > 1 file changed, 123 insertions(+) > > diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h > index ca0061410d..2454448baa 100644 > --- a/xen/include/public/io/netif.h > +++ b/xen/include/public/io/netif.h > @@ -353,6 +353,9 @@ struct xen_netif_ctrl_request { > #define XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING_SIZE 5 > #define XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING 6 > #define XEN_NETIF_CTRL_TYPE_SET_HASH_ALGORITHM7 > +#define XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE 8 > +#define XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING 9 > +#define XEN_NETIF_CTRL_TYPE_DEL_GREF_MAPPING 10 > > uint32_t data[3]; > }; > @@ -391,6 +394,44 @@ struct xen_netif_ctrl_response { > }; > > /* > + * Static Grants (struct xen_netif_gref) > + * = > + * > + * A frontend may provide a fixed set of grant references to be mapped on > + * the backend. The message of type > XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING > + * prior its usage in the command ring allows for creation of these mappings. > + * The backend will maintain a fixed amount of these mappings. > + * > + * XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE lets a frontend > query how many > + * of these mappings can be kept. > + * > + * Each entry in the XEN_NETIF_CTRL_TYPE_{ADD,DEL}_GREF_MAPPING > input table has > + * the following format: > + * > + *0 1 2 3 4 5 6 7 octet > + * +-+-+-+-+-+-+-+-+ > + * | grant ref | flags| status | > + * +-+-+-+-+-+-+-+-+ > + * > + * grant ref: grant reference (IN) > + * flags: flags describing the control operation (IN) > + * status: XEN_NETIF_CTRL_STATUS_* (OUT) > + * > + * 'status' is an output parameter which does not require to be set to zero > + * prior to its usage in the corresponding control messages. > + */ > + > +struct xen_netif_gref { > + grant_ref_t ref; > + uint16_t flags; > + > +#define _XEN_NETIF_CTRLF_GREF_readonly0 > +#define XEN_NETIF_CTRLF_GREF_readonly > (1U<<_XEN_NETIF_CTRLF_GREF_readonly) > + > + uint16_t status; > +}; > + > +/* > * Control messages > * > * > @@ -609,6 +650,88 @@ struct xen_netif_ctrl_response { > * invalidate any table data outside that range. > * The grant reference may be read-only and must remain valid until > * the response has been processed. > + * > + * XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE > + * - > + * > + * This is sent by the frontend to fetch the number of grefs that can be kept > + * mapped in the backend. > + * > + * Request: > + * > + * type= XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE > + * data[0] = queue index (assumed 0 for single queue) > + * data[1] = 0 > + * data[2] = 0 > + * > + * Response: > + * > + * status = XEN_NETIF_CTRL_STATUS_NOT_SUPPORTED - Operation not > + * supported > + * XEN_NETIF_CTRL_STATUS_INVALID_PARAMETER - The queue index > is > + * out of range > + * XEN_NETIF_CTRL_STATUS_SUCCESS - Operation successful > + * data = maximum number of entries allowed in the gref mapping table > + * (if operation was successful) or zero if it is not supported. > + * > + * XEN_NETIF_CTRL_TYPE_
Re: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set grant table resource limits
> -Original Message- > From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of > Juergen Gross > Sent: 20 September 2017 07:34 > To: xen-de...@lists.xenproject.org > Cc: Juergen Gross ; sstabell...@kernel.org; Wei Liu > ; George Dunlap ; > Andrew Cooper ; Ian Jackson > ; Tim (Xen.org) ; > julien.gr...@arm.com; jbeul...@suse.com; dgde...@tycho.nsa.gov > Subject: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set > grant table resource limits > > Add a domctl hypercall to set the domain's resource limits regarding > grant tables. > > Signed-off-by: Juergen Gross > Acked-by: Daniel De Graaf Reviewed-by: Paul Durrant > --- > V8: > - remove stale #if 0, adjust comments (Paul Durrant) > > V6: > - moved earlier in series to support set_gnttab_limits being > mandatory for domain creation > > V5: > - add set_gnttab_limits to create_domain_common in xen.if > (Daniel De Graaf) > > V3: > - rename *gnttbl* to *gnttab* (Paul Durrant) > --- > tools/flask/policy/modules/dom0.te | 2 +- > tools/flask/policy/modules/xen.if | 2 +- > xen/common/domctl.c | 6 ++ > xen/common/grant_table.c| 19 +++ > xen/include/public/domctl.h | 7 +++ > xen/include/xen/grant_table.h | 2 ++ > xen/xsm/flask/hooks.c | 3 +++ > xen/xsm/flask/policy/access_vectors | 2 ++ > 8 files changed, 41 insertions(+), 2 deletions(-) > > diff --git a/tools/flask/policy/modules/dom0.te > b/tools/flask/policy/modules/dom0.te > index 338caaf41e..1643b400f0 100644 > --- a/tools/flask/policy/modules/dom0.te > +++ b/tools/flask/policy/modules/dom0.te > @@ -39,7 +39,7 @@ allow dom0_t dom0_t:domain { > }; > allow dom0_t dom0_t:domain2 { > set_cpuid gettsc settsc setscheduler set_max_evtchn > set_vnumainfo > - get_vnumainfo psr_cmt_op psr_cat_op > + get_vnumainfo psr_cmt_op psr_cat_op set_gnttab_limits > }; > allow dom0_t dom0_t:resource { add remove }; > > diff --git a/tools/flask/policy/modules/xen.if > b/tools/flask/policy/modules/xen.if > index 912640002e..55437496f6 100644 > --- a/tools/flask/policy/modules/xen.if > +++ b/tools/flask/policy/modules/xen.if > @@ -52,7 +52,7 @@ define(`create_domain_common', ` > settime setdomainhandle getvcpucontext > set_misc_info }; > allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim > set_max_evtchn set_vnumainfo get_vnumainfo > cacheflush > - psr_cmt_op psr_cat_op soft_reset }; > + psr_cmt_op psr_cat_op soft_reset set_gnttab_limits > }; > allow $1 $2:security check_context; > allow $1 $2:shadow enable; > allow $1 $2:mmu { map_read map_write adjust memorymap > physmap pinpage mmuext_op updatemp }; > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > index 42658e5744..58381f8fe9 100644 > --- a/xen/common/domctl.c > +++ b/xen/common/domctl.c > @@ -14,6 +14,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -1149,6 +1150,11 @@ long > do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > copyback = 1; > break; > > +case XEN_DOMCTL_set_gnttab_limits: > +ret = grant_table_set_limits(d, op->u.set_gnttab_limits.grant_frames, > + > op->u.set_gnttab_limits.maptrack_frames); > +break; > + > default: > ret = arch_do_domctl(op, d, u_domctl); > break; > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c > index ac845dbb35..f48eeff7ad 100644 > --- a/xen/common/grant_table.c > +++ b/xen/common/grant_table.c > @@ -3640,6 +3640,25 @@ void grant_table_init_vcpu(struct vcpu *v) > v->maptrack_tail = MAPTRACK_TAIL; > } > > +int grant_table_set_limits(struct domain *d, unsigned int grant_frames, > + unsigned int maptrack_frames) > +{ > +struct grant_table *gt = d->grant_table; > +int ret = -EBUSY; > + > +if ( !gt ) > +return -ENOENT; > + > +grant_write_lock(gt); > + > +ret = 0; > +/* Set limits, alloc needed arrays. */ > + > +grant_write_unlock(gt); > + > +return ret; > +} > + > #ifdef CONFIG_HAS_MEM_SHARING > int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref, > gfn_t *gfn, uint16_t *status) > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h > index 50ff58f5b9..167502c60b 100644 > --- a/xen/include/public/domctl.h > +++ b/xen/include/public/domctl.h > @@ -1163,6 +1163,11 @@ struct xen_domctl_psr_cat_op { > typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t; > DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t); > > +struct xen_domctl_set_gnttab_limits { > +uint32_t grant_frames; /* IN */ > +uint32_t maptrack_frames; /* IN */ > +}; > + > struct xen_domctl { > uint32_t cmd; > #define XEN_DOMCTL_createdomain 1 >
Re: [Xen-devel] [PATCH] vt-d: use two 32-bit writes to update DMAR fault address registers
On Mon, Sep 11, 2017 at 02:00:48PM +0800, Haozhong Zhang wrote: > The 64-bit DMAR fault address is composed of two 32 bits registers > DMAR_FEADDR_REG and DMAR_FEUADDR_REG. According to VT-d spec: > "Software is expected to access 32-bit registers as aligned doublewords", > a hypervisor should use two 32-bit writes to DMAR_FEADDR_REG and > DMAR_FEUADDR_REG separately in order to update a 64-bit fault address, > rather than a 64-bit write to DMAR_FEADDR_REG. I would add: "Note that when x2APIC is disabled DMAR_FEUADDR_REG is reserved and it's not necessary to update it." > Though I haven't seen any errors caused by such one 64-bit write on > real machines, it's still better to follow the specification. > > Signed-off-by: Haozhong Zhang Given the reply from Kevin: Reviewed-by: Roger Pau Monné Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy
> -Original Message- > From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of > Jennifer Herbert > Sent: 19 September 2017 19:06 > To: Ian Jackson ; Wei Liu ; > xen-de...@lists.xenproject.org; jto...@uwaterloo.ca > Cc: Jennifer Herbert > Subject: [Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy > > This Patch allows a migration precopy policy to be specified. > > The precopy phase of the xc_domain_save() live migration algorithm has > historically been implemented to run until either a) (almost) no pages > are dirty or b) some fixed, hard-coded maximum number of precopy > iterations has been exceeded. This policy and its implementation are > less than ideal for a few reasons: > - the logic of the policy is intertwined with the control flow of the > mechanism of the precopy stage > - it can't take into account facts external to the immediate > migration context, such external state transfer state, interactive > user input, or the passage of wall-clock time. > - it does not permit the user to change their mind, over time, about > what to do at the end of the precopy (they get an unconditional > transition into the stop-and-copy phase of the migration) > > To permit callers to implement arbitrary higher-level policies governing > when the live migration precopy phase should end, and what should be > done next: > - add a precopy_policy() callback to the xc_domain_save() user-supplied > callbacks > - during the precopy phase of live migrations, consult this policy after > each batch of pages transmitted and take the dictated action, which > may be to a) abort the migration entirely, b) continue with the > precopy, or c) proceed to the stop-and-copy phase. > - provide an implementation of the old policy, used when > precopy_policy callback is not provided. > > Signed-off-by: Jennifer Herbert > Signed-off-by: Joshua Otto > > --- > > v2: > > Have made a few formatting corrections, added typedef as suggested. > > v1: > > This is updated/modified subset of patch 7/20, part of > Joshua Otto's "Add postcopy live migration support." patch, > dated 27th March 2017. As indicated on the original thread, > I wish to make use of this this within the XenServer product. > I hope this will aid Josh in pushing the remainder of his series. > > --- > tools/libxc/include/xenguest.h | 31 -- > tools/libxc/xc_sr_common.h | 6 +-- > tools/libxc/xc_sr_save.c | 97 + > - > 3 files changed, 97 insertions(+), 37 deletions(-) > > diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h > index 6626f0c..a2a654c 100644 > --- a/tools/libxc/include/xenguest.h > +++ b/tools/libxc/include/xenguest.h > @@ -39,6 +39,16 @@ > */ > struct xenevtchn_handle; > > +/* For save's precopy_policy(). */ > +struct precopy_stats > +{ > +unsigned int iteration; > +unsigned int total_written; > +long dirty_count; /* -1 if unknown */ > +}; > + > +typedef int (*precopy_policy_t)(struct precopy_stats, void *); > + > /* callbacks provided by xc_domain_save */ > struct save_callbacks { > /* Called after expiration of checkpoint interval, > @@ -46,7 +56,20 @@ struct save_callbacks { > */ > int (*suspend)(void* data); > > -/* Called after the guest's dirty pages have been > +/* > + * Called after every batch of page data sent during the precopy > + * phase of a live migration to ask the caller what to do next > + * based on the current state of the precopy migration. > + */ > +#define XGS_POLICY_ABORT (-1) /* Abandon the migration entirely > +* and tidy up. */ > +#define XGS_POLICY_CONTINUE_PRECOPY 0 /* Remain in the precopy > phase. */ > +#define XGS_POLICY_STOP_AND_COPY1 /* Immediately suspend and > transmit the > +* remaining dirty pages. */ > +precopy_policy_t precopy_policy; > + > +/* > + * Called after the guest's dirty pages have been > * copied into an output buffer. > * Callback function resumes the guest & the device model, > * returns to xc_domain_save. > @@ -55,7 +78,8 @@ struct save_callbacks { > */ > int (*postcopy)(void* data); > > -/* Called after the memory checkpoint has been flushed > +/* > + * Called after the memory checkpoint has been flushed > * out into the network. Typical actions performed in this > * callback include: > * (a) send the saved device model state (for HVM guests), > @@ -65,7 +89,8 @@ struct save_callbacks { > * > * returns: > * 0: terminate checkpointing gracefully > - * 1: take another checkpoint */ > + * 1: take another checkpoint > + */ > int (*checkpoint)(void* data); > > /* > diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h > index a83f22a..3635704 100644 > --- a/tools/libxc/xc_sr_com
Re: [Xen-devel] [PATCH v3 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document
On Wed, Sep 20, 2017 at 11:06:57AM +0800, Yi Sun wrote: > On 17-09-18 18:16:40, Roger Pau Monn� wrote: > > On Tue, Sep 05, 2017 at 05:32:23PM +0800, Yi Sun wrote: > > > +* xl interfaces: > > > + > > > + 1. `psr-mba-show [domain-id]`: > > > > Is this limited to domain-id, or one can also use the domain name? > > Most of the xl commands accept either a domain-id or a domain-name. > > > Both domain-id and domain-name can show it. I thought this is by default and > no need to explicitly declare. If I am wrong, I will change it as below: > `psr-mba-show [domain-id/domain-name]` [domain-id|domain-name] Would be better IMHO. > > > + > > > + 3. DOMCTL: > > > + - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a domain. > > > + - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a domain. > > > + > > > +* xl interfaces: > > > + > > > + 1. psr-mba-show [domain-id] > > > + Show system/domain runtime MBA throttling value. For linear > > > mode, > > > + it shows the decimal value. For non-linear mode, it shows > > > hexadecimal > > > + value. > > > + => XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL > > > + > > > + 2. psr-mba-set [OPTIONS] > > > + Set bandwidth throttling for a domain. > > > + => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL > > > + > > > + 3. psr-hwinfo > > > + Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA. > > > + => XEN_SYSCTL_PSR_MBA_get_info > > > > 'psr-hwinfo' seems to be completely missing from the 'xl interfaces:' > > section above. > > > Because this is not a newly added interface, I do not describe it in 'xl > interfaces'. Is that necessary? Oh, OK, sorry for the noise. Then I guess it's not necessary to describe it here. Maybe a reference to where 'psr-hwinfo' is described would be nice (I assume there's a feature document somewhere that describes 'psr-hwinfo'). Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 06/15] x86: implement get value interface for MBA
On Wed, Sep 20, 2017 at 01:09:06PM +0800, Yi Sun wrote: > On 17-09-19 10:15:42, Roger Pau Monn� wrote: > > On Tue, Sep 05, 2017 at 05:32:28PM +0800, Yi Sun wrote: > > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > > > index 696eff2..7902af7 100644 > > > --- a/xen/arch/x86/domctl.c > > > +++ b/xen/arch/x86/domctl.c > > > @@ -1496,6 +1496,13 @@ long arch_do_domctl( > > > copyback = true; > > > break; > > > > > > +case XEN_DOMCTL_PSR_ALLOC_GET_MBA_THRTL: > > > +ret = psr_get_val(d, domctl->u.psr_alloc.target, > > > + &val32, PSR_TYPE_MBA_THRTL); > > > +domctl->u.psr_alloc.data = val32; > > > > Hm, why does psr_get_val take a uint32_t * instead of a uint64_t *? So > > that you can directly pass &domctl->u.psr_alloc.data. > > > > Or the other way around, why is domctl->u.psr_alloc.data a uint64_t > > instead of a uint32_t? > > > There is a historical reason. The COS MSR is 64bit. So, the original codes > in L3 CAT (submitted years ago) used uint64_t. > > But during L2 CAT review, per Jan's comment, the uint64_t is not necessary > in psr.c. So, we convert it to uint32_t in psr.c and make the codes you see > here. Since this is a DOMCTL, you can change the type of the structure to be an uint32_t, so that you can pass it directly (unless I'm missing something else that requires this to be uint64_t). Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] minimum Python version
On 20/09/17 09:55, Jan Beulich wrote: On 20.09.17 at 09:20, wrote: >> For your case: what about adding a configure option to disable building >> the python bindings (and any dependencies like e.g. pygrub) instead of >> doing so in case of a version mismatch? This would avoid any surprises >> in case someone didn't notice that the bindings have been disabled. > > That's certainly an option, but I'm generally not very happy with > having to add such options. My script invoking configure becomes > more and more unreadable because of such, since this way the > requirement to do the version check gets moved from the build > system (i.e. configure, whose job this is imo) to that script. And > as you can probably understand I don't want to unilaterally > disable the building of that code, to be able to build test possible > changes I may end up making to e.g. libxc. > > IOW if such a configure option was added, I'd still think it should > default to disabled if the Python in use is not new enough. I can understand your concerns. OTOH it is quite nasty to find out that due to a missing or too old package on the system some parts of Xen haven't been built without having them disabled via configure (I had this experience more than once, e.g. with a Mini-OS patch breaking vtpm stubdom which I didn't notice as it wasn't being built on my side). What about the following idea: maybe we could add an option to configure to make it fail instead of disabling components in case some requirements for that component are not met? The default could be to just disable the components and with the option specified all components not disabled explicitly must be buildable or configure will fail. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set grant table resource limits
>>> On 20.09.17 at 10:26, wrote: >> -Original Message- >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of >> Juergen Gross >> Sent: 20 September 2017 07:34 >> To: xen-de...@lists.xenproject.org >> Cc: Juergen Gross ; sstabell...@kernel.org; Wei Liu >> ; George Dunlap ; >> Andrew Cooper ; Ian Jackson >> ; Tim (Xen.org) ; >> julien.gr...@arm.com; jbeul...@suse.com; dgde...@tycho.nsa.gov >> Subject: [Xen-devel] [PATCH v8 03/15] xen: add new domctl hypercall to set >> grant table resource limits >> >> Add a domctl hypercall to set the domain's resource limits regarding >> grant tables. >> >> Signed-off-by: Juergen Gross >> Acked-by: Daniel De Graaf > > Reviewed-by: Paul Durrant Acked-by: Jan Beulich despite ... >> --- a/xen/common/grant_table.c >> +++ b/xen/common/grant_table.c >> @@ -3640,6 +3640,25 @@ void grant_table_init_vcpu(struct vcpu *v) >> v->maptrack_tail = MAPTRACK_TAIL; >> } >> >> +int grant_table_set_limits(struct domain *d, unsigned int grant_frames, >> + unsigned int maptrack_frames) >> +{ >> +struct grant_table *gt = d->grant_table; >> +int ret = -EBUSY; ... the initializer being pointless here (it becomes relevant only in patch 11). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 12/13] [RFC] iommu: VT-d: Squash map_pages/unmap_pages with map_page/unmap_page
this patch alone looks OK to me. but I haven't found time to review the whole series to judge whether below change is necessary. > From: Oleksandr Tyshchenko [mailto:olekst...@gmail.com] > Sent: Tuesday, September 12, 2017 10:44 PM > > Hi. > > Gentle reminder. > > On Mon, Aug 21, 2017 at 7:44 PM, Oleksandr Tyshchenko > wrote: > > Hi, all. > > > > Any comments? > > > > On Tue, Jul 25, 2017 at 8:26 PM, Oleksandr Tyshchenko > > wrote: > >> From: Oleksandr Tyshchenko > >> > >> Reduce the scope of the TODO by squashing single-page stuff with > >> multi-page one. Next target is to use large pages whenever possible > >> in the case that hardware supports them. > >> > >> Signed-off-by: Oleksandr Tyshchenko > > >> CC: Jan Beulich > >> CC: Kevin Tian > >> > >> --- > >>Changes in v1: > >> - > >> > >>Changes in v2: > >> - > >> --- > >> xen/drivers/passthrough/vtd/iommu.c | 138 + > --- > >> 1 file changed, 67 insertions(+), 71 deletions(-) > >> > >> diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > >> index 45d1f36..d20b2f9 100644 > >> --- a/xen/drivers/passthrough/vtd/iommu.c > >> +++ b/xen/drivers/passthrough/vtd/iommu.c > >> @@ -1750,15 +1750,24 @@ static void > iommu_domain_teardown(struct domain *d) > >> spin_unlock(&hd->arch.mapping_lock); > >> } > >> > >> -static int __must_check intel_iommu_map_page(struct domain *d, > >> - unsigned long gfn, > >> - unsigned long mfn, > >> - unsigned int flags) > >> +static int __must_check intel_iommu_unmap_pages(struct domain *d, > >> +unsigned long gfn, > >> +unsigned int order); > >> + > >> +/* > >> + * TODO: Optimize by using large pages whenever possible in the case > >> + * that hardware supports them. > >> + */ > >> +static int __must_check intel_iommu_map_pages(struct domain *d, > >> + unsigned long gfn, > >> + unsigned long mfn, > >> + unsigned int order, > >> + unsigned int flags) > >> { > >> struct domain_iommu *hd = dom_iommu(d); > >> -struct dma_pte *page = NULL, *pte = NULL, old, new = { 0 }; > >> -u64 pg_maddr; > >> int rc = 0; > >> +unsigned long orig_gfn = gfn; > >> +unsigned long i; > >> > >> /* Do nothing if VT-d shares EPT page table */ > >> if ( iommu_use_hap_pt(d) ) > >> @@ -1768,78 +1777,60 @@ static int __must_check > intel_iommu_map_page(struct domain *d, > >> if ( iommu_passthrough && is_hardware_domain(d) ) > >> return 0; > >> > >> -spin_lock(&hd->arch.mapping_lock); > >> - > >> -pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn << > PAGE_SHIFT_4K, 1); > >> -if ( pg_maddr == 0 ) > >> +for ( i = 0; i < (1UL << order); i++, gfn++, mfn++ ) > >> { > >> -spin_unlock(&hd->arch.mapping_lock); > >> -return -ENOMEM; > >> -} > >> -page = (struct dma_pte *)map_vtd_domain_page(pg_maddr); > >> -pte = page + (gfn & LEVEL_MASK); > >> -old = *pte; > >> -dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K); > >> -dma_set_pte_prot(new, > >> - ((flags & IOMMUF_readable) ? DMA_PTE_READ : 0) | > >> - ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0)); > >> +struct dma_pte *page = NULL, *pte = NULL, old, new = { 0 }; > >> +u64 pg_maddr; > >> > >> -/* Set the SNP on leaf page table if Snoop Control available */ > >> -if ( iommu_snoop ) > >> -dma_set_pte_snp(new); > >> +spin_lock(&hd->arch.mapping_lock); > >> > >> -if ( old.val == new.val ) > >> -{ > >> +pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn << > PAGE_SHIFT_4K, 1); > >> +if ( pg_maddr == 0 ) > >> +{ > >> +spin_unlock(&hd->arch.mapping_lock); > >> +rc = -ENOMEM; > >> +goto err; > >> +} > >> +page = (struct dma_pte *)map_vtd_domain_page(pg_maddr); > >> +pte = page + (gfn & LEVEL_MASK); > >> +old = *pte; > >> +dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K); > >> +dma_set_pte_prot(new, > >> + ((flags & IOMMUF_readable) ? DMA_PTE_READ : 0) | > >> + ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0)); > >> + > >> +/* Set the SNP on leaf page table if Snoop Control available */ > >> +if ( iommu_snoop ) > >> +dma_set_pte_snp(new); > >> + > >> +if ( old.val == new.val ) > >> +{ > >> +spin_unlock(&hd->arch.mapping_lock); > >> +unmap_vtd_domain_page(page); > >> +continue; > >> +
Re: [Xen-devel] [PATCH v3 13/15] tools: implement new generic get value interface and MBA get value command
On Wed, Sep 20, 2017 at 02:46:24PM +0800, Yi Sun wrote: > On 17-09-19 12:02:08, Roger Pau Monn� wrote: > > On Tue, Sep 05, 2017 at 05:32:35PM +0800, Yi Sun wrote: > > > +}; > > > char *msg; > > > > > > switch (err) { > > > case ENODEV: > > > -msg = "CAT is not supported in this system"; > > > +msg = "is not supported in this system"; > > > break; > > > case ENOENT: > > > -msg = "CAT is not enabled on the socket"; > > > +msg = "is not enabled on the socket"; > > > break; > > > case EOVERFLOW: > > > msg = "no free COS available"; > > > @@ -106,7 +120,7 @@ static void libxl__psr_cat_log_err_msg(libxl__gc *gc, > > > int err) > > > return; > > > } > > > > > > -LOGE(ERROR, "%s", msg); > > > +LOGE(ERROR, "%s: %s", feat_name[type], msg); > > > > I don't think you should use LOGE here, but rather LOG. LOGE should be > > used when errno is set, which I don't think is the case here. > > > But two places to call libxl__psr_cat_log_err_msg all set errno in 'xc_' > functions. But you already translate the error into a custom message ('msg' in the above context), and the error variable in this case is 'err' not 'errno', so LOGE is not appropriate here IMHO. > > > @@ -347,18 +361,7 @@ int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t > > > domid, > > >libxl_psr_cbm_type type, uint32_t target, > > >uint64_t *cbm_r) > > > { > > > -GC_INIT(ctx); > > > -int rc = 0; > > > -xc_psr_type xc_type = libxl__psr_cbm_type_to_libxc_psr_type(type); > > > - > > > -if (xc_psr_cat_get_domain_data(ctx->xch, domid, xc_type, > > > - target, cbm_r)) { > > > -libxl__psr_cat_log_err_msg(gc, errno); > > > -rc = ERROR_FAIL; > > > -} > > > - > > > -GC_FREE; > > > -return rc; > > > +return libxl_psr_get_val(ctx, domid, type, target, cbm_r); > > > } > > > > You could even move this to libxl.h as a static function IMHO. > > > Yes. But I prefer to keep it here with other interfaces together. Is that > acceptable to you? OK, as you wish. > > > diff --git a/tools/xl/xl_psr.c b/tools/xl/xl_psr.c > > > index 40269b4..46b7788 100644 > > > --- a/tools/xl/xl_psr.c > > > +++ b/tools/xl/xl_psr.c > > > -static int psr_cat_print_socket(uint32_t domid, libxl_psr_cat_info *info, > > > -unsigned int lvl) > > > +static int psr_print_socket(uint32_t domid, > > > +libxl_psr_hw_info *info, > > > +libxl_psr_feat_type type, > > > +unsigned int lvl) > > > { > > > -int rc; > > > -uint32_t l3_cache_size; > > > - > > > printf("%-16s: %u\n", "Socket ID", info->id); > > > > > > -/* So far, CMT only supports L3 cache. */ > > > -if (lvl == 3) { > > > -rc = libxl_psr_cmt_get_l3_cache_size(ctx, info->id, > > > &l3_cache_size); > > > -if (rc) { > > > -fprintf(stderr, "Failed to get l3 cache size for > > > socket:%d\n", > > > -info->id); > > > -return -1; > > > +switch (type) { > > > +case LIBXL_PSR_FEAT_TYPE_CAT: > > > +{ > > > +int rc; > > > +uint32_t l3_cache_size; > > > + > > > +/* So far, CMT only supports L3 cache. */ > > > +if (lvl == 3) { > > > > Shouldn't you print some kind of error message if lvl != 3? Or is it > > expected that this function will be called with lvl != 3 and it should > > be ignored? > > > We only get cache size for level 3 cache. So, if input is lvl=2, we print > nothing. My question would be, is it expected that this function is called with lvl == 2 as part of the normal operation with valid input values? Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
>>> On 20.09.17 at 08:34, wrote: > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct > cpu_user_regs *regs) > return 0; > } > > +unsigned long arch_get_upper_mfn_bound(void) > +{ > +unsigned long max_mfn; > + > +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page; Taking into account the code in the caller of this function as well as the ARM counterpart I find the use of max_page here odd. I'd prefer if get_upper_mfn_bound() went away altogether, and it's sole caller (which strangely enough doesn't get introduced here) called the arch function directly. Additionally, with the caller being a sysctl, how is that supposed to help a PV DomU kernel in their choice of grant table version? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
On 20/09/17 10:57, Jan Beulich wrote: On 20.09.17 at 08:34, wrote: >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct >> cpu_user_regs *regs) >> return 0; >> } >> >> +unsigned long arch_get_upper_mfn_bound(void) >> +{ >> +unsigned long max_mfn; >> + >> +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page; > > Taking into account the code in the caller of this function as well > as the ARM counterpart I find the use of max_page here odd. I'd > prefer if get_upper_mfn_bound() went away altogether, and it's > sole caller (which strangely enough doesn't get introduced here) > called the arch function directly. Additionally, with the caller being > a sysctl, how is that supposed to help a PV DomU kernel in their > choice of grant table version? Did you look at patch 15? Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo
>>> On 20.09.17 at 08:34, wrote: > Add the maximum possible mfn to struct xen_sysctl_physinfo in order to > enable Xen tools to size the grant table frame limits for a domU. > > Signed-off-by: Juergen Gross > --- > xen/common/sysctl.c | 1 + > xen/include/public/sysctl.h | 2 ++ > 2 files changed, 3 insertions(+) As indicated in reply to patch 4, I think the on here should be folded into that one, in order to not transiently leave around a dead function. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo
On 20/09/17 10:58, Jan Beulich wrote: On 20.09.17 at 08:34, wrote: >> Add the maximum possible mfn to struct xen_sysctl_physinfo in order to >> enable Xen tools to size the grant table frame limits for a domU. >> >> Signed-off-by: Juergen Gross >> --- >> xen/common/sysctl.c | 1 + >> xen/include/public/sysctl.h | 2 ++ >> 2 files changed, 3 insertions(+) > > As indicated in reply to patch 4, I think the on here should be folded > into that one, in order to not transiently leave around a dead function. Okay, I can do it. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document
On 17-09-20 09:36:13, Roger Pau Monn� wrote: > On Wed, Sep 20, 2017 at 11:06:57AM +0800, Yi Sun wrote: > > On 17-09-18 18:16:40, Roger Pau Monn� wrote: > > > On Tue, Sep 05, 2017 at 05:32:23PM +0800, Yi Sun wrote: > > > > +* xl interfaces: > > > > + > > > > + 1. `psr-mba-show [domain-id]`: > > > > > > Is this limited to domain-id, or one can also use the domain name? > > > Most of the xl commands accept either a domain-id or a domain-name. > > > > > Both domain-id and domain-name can show it. I thought this is by default and > > no need to explicitly declare. If I am wrong, I will change it as below: > > `psr-mba-show [domain-id/domain-name]` > > [domain-id|domain-name] > > Would be better IMHO. > Thanks! > > > > + > > > > + 3. DOMCTL: > > > > + - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a > > > > domain. > > > > + - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a > > > > domain. > > > > + > > > > +* xl interfaces: > > > > + > > > > + 1. psr-mba-show [domain-id] > > > > + Show system/domain runtime MBA throttling value. For linear > > > > mode, > > > > + it shows the decimal value. For non-linear mode, it shows > > > > hexadecimal > > > > + value. > > > > + => > > > > XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL > > > > + > > > > + 2. psr-mba-set [OPTIONS] > > > > + Set bandwidth throttling for a domain. > > > > + => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL > > > > + > > > > + 3. psr-hwinfo > > > > + Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA. > > > > + => XEN_SYSCTL_PSR_MBA_get_info > > > > > > 'psr-hwinfo' seems to be completely missing from the 'xl interfaces:' > > > section above. > > > > > Because this is not a newly added interface, I do not describe it in 'xl > > interfaces'. Is that necessary? > > Oh, OK, sorry for the noise. Then I guess it's not necessary to > describe it here. Maybe a reference to where 'psr-hwinfo' is described > would be nice (I assume there's a feature document somewhere that > describes 'psr-hwinfo'). > psr-hwinfo is firstly introduced in intel_psr_cat_cdp.pandoc. But MBA feature adds a new sysctl interface 'XEN_SYSCTL_PSR_MBA_get_info' which is used by psr-hwinfo. So, I describe it here again. > Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 13/15] tools: implement new generic get value interface and MBA get value command
On 17-09-20 09:57:59, Roger Pau Monn� wrote: > On Wed, Sep 20, 2017 at 02:46:24PM +0800, Yi Sun wrote: > > On 17-09-19 12:02:08, Roger Pau Monn� wrote: > > > On Tue, Sep 05, 2017 at 05:32:35PM +0800, Yi Sun wrote: > > > > +}; > > > > char *msg; > > > > > > > > switch (err) { > > > > case ENODEV: > > > > -msg = "CAT is not supported in this system"; > > > > +msg = "is not supported in this system"; > > > > break; > > > > case ENOENT: > > > > -msg = "CAT is not enabled on the socket"; > > > > +msg = "is not enabled on the socket"; > > > > break; > > > > case EOVERFLOW: > > > > msg = "no free COS available"; > > > > @@ -106,7 +120,7 @@ static void libxl__psr_cat_log_err_msg(libxl__gc > > > > *gc, int err) > > > > return; > > > > } > > > > > > > > -LOGE(ERROR, "%s", msg); > > > > +LOGE(ERROR, "%s: %s", feat_name[type], msg); > > > > > > I don't think you should use LOGE here, but rather LOG. LOGE should be > > > used when errno is set, which I don't think is the case here. > > > > > But two places to call libxl__psr_cat_log_err_msg all set errno in 'xc_' > > functions. > > But you already translate the error into a custom message ('msg' in the > above context), and the error variable in this case is 'err' not > 'errno', so LOGE is not appropriate here IMHO. > Ok. [...] > > > > diff --git a/tools/xl/xl_psr.c b/tools/xl/xl_psr.c > > > > index 40269b4..46b7788 100644 > > > > --- a/tools/xl/xl_psr.c > > > > +++ b/tools/xl/xl_psr.c > > > > -static int psr_cat_print_socket(uint32_t domid, libxl_psr_cat_info > > > > *info, > > > > -unsigned int lvl) > > > > +static int psr_print_socket(uint32_t domid, > > > > +libxl_psr_hw_info *info, > > > > +libxl_psr_feat_type type, > > > > +unsigned int lvl) > > > > { > > > > -int rc; > > > > -uint32_t l3_cache_size; > > > > - > > > > printf("%-16s: %u\n", "Socket ID", info->id); > > > > > > > > -/* So far, CMT only supports L3 cache. */ > > > > -if (lvl == 3) { > > > > -rc = libxl_psr_cmt_get_l3_cache_size(ctx, info->id, > > > > &l3_cache_size); > > > > -if (rc) { > > > > -fprintf(stderr, "Failed to get l3 cache size for > > > > socket:%d\n", > > > > -info->id); > > > > -return -1; > > > > +switch (type) { > > > > +case LIBXL_PSR_FEAT_TYPE_CAT: > > > > +{ > > > > +int rc; > > > > +uint32_t l3_cache_size; > > > > + > > > > +/* So far, CMT only supports L3 cache. */ > > > > +if (lvl == 3) { > > > > > > Shouldn't you print some kind of error message if lvl != 3? Or is it > > > expected that this function will be called with lvl != 3 and it should > > > be ignored? > > > > > We only get cache size for level 3 cache. So, if input is lvl=2, we print > > nothing. > > My question would be, is it expected that this function is called with > lvl == 2 as part of the normal operation with valid input values? > Yes, lvl == 2 is a normal operation. > Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [qemu-mainline test] 113613: regressions - FAIL
flight 113613 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/113613/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail REGR. vs. 113302 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 6 xen-install fail REGR. vs. 113302 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 113302 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 113302 test-amd64-amd64-xl-rtds 10 debian-install fail like 113302 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 113302 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass version targeted for testing: qemuuc51700273ad9802a21c19f8d2b4bcb67c38e74ac baseline version: qemuua6e8c1dacfd37d34542e33600dcc50b7683b735a Last test of basis 113302 2017-09-11 10:18:16 Z8 days Failing since113345 2017-09-12 00:21:07 Z8 days 16 attempts Testing same since 113613 2017-09-20 02:35:03 Z0 days1 attempts People who touched revisions under test: Alex Bennée Alexander Graf Alexey Kardashevskiy Alistair Francis Amador Pahim Benjamin Herrenschmidt Christian Borntraeger Cornelia Huck Cédric Le Goater Daniel Henrique Barboza Daniel P. Berrange David Gibson David Hildenbrand Dr. David Alan Gilbert Eduardo Habkost Eduardo Otubo Eduardo Otubo Eric Auger Eric Blake Fam Zheng Farhan Ali Feng Kan Gerd Hoffmann Gonglei Greg Kurz Halil Pasic Hannes Reinecke Hannes Reinecke Igor Mammedov Jaroslaw Pelczar John Snow Joseph Myers Kamil Rytarowski Kevin Wolf Ladi Prosek Laurent Vivier LluÃs Vilanova Lukáš Doktor Mao Zhongyi Mark Cave-Ayland Matt Parker Matthew Rosato Mohammed Gamal Paolo Bonzini Peter Maydell Peter Xu Philippe Mathieu-Daudé Philippe Mathieu-Daudé Pranavkumar Sawargaonkar Prasad J Pandit Richard Henderson Richard W.M. Jones Roman Kagan Stefan Hajnoczi Thomas Huth Tushar Jagad Vadim Galitsyn Vasilis Liaskovitis Viktor Mihajlovski Xiao Guangrong Yi Min Zhao jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm
Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation
On Tue, Sep 19, 2017 at 04:11:28AM +0200, Dario Faggioli wrote: > In commit ad4b3e1e9df34 ("xen: credit2: implement > utilization cap") xfree() was being called (for > deallocating the budget replenishment timer, during > domain destruction) inside an IRQ disabled critical > section. > > That must not happen, as it uses the mem-pool's lock, > which needs to be taken with IRQ enabled. And, in fact, > we crash (in debug builds): > > (XEN) > (XEN) Panic on CPU 0: > (XEN) Xen BUG at spinlock.c:47 > (XEN) > > Let's, therefore, kill and deallocate the timer outside of > the critical sections, when IRQs are enabled. > > Signed-off-by: Dario Faggioli Reviewed-by: Roger Pau Monné It would be good to get this committed sooner rather than later, so the push gate can be unblocked. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation
On 09/19/2017 03:11 AM, Dario Faggioli wrote: > In commit ad4b3e1e9df34 ("xen: credit2: implement > utilization cap") xfree() was being called (for > deallocating the budget replenishment timer, during > domain destruction) inside an IRQ disabled critical > section. > > That must not happen, as it uses the mem-pool's lock, > which needs to be taken with IRQ enabled. And, in fact, > we crash (in debug builds): > > (XEN) > (XEN) Panic on CPU 0: > (XEN) Xen BUG at spinlock.c:47 > (XEN) > > Let's, therefore, kill and deallocate the timer outside of > the critical sections, when IRQs are enabled. > > Signed-off-by: Dario Faggioli > --- > Cc: osstest service owner > Cc: George Dunlap > Cc: Wei Liu > --- > This was spotted by OSSTest's flight 113562: > > http://logs.test-lab.xenproject.org/osstest/logs/113562/ > > http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log > --- > xen/common/sched_credit2.c |6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c > index 32234ac..7a550db 100644 > --- a/xen/common/sched_credit2.c > +++ b/xen/common/sched_credit2.c > @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct scheduler *ops, > void *data) > > write_lock_irqsave(&prv->lock, flags); > > -kill_timer(sdom->repl_timer); > -xfree(sdom->repl_timer); > - > list_del_init(&sdom->sdom_elem); > > write_unlock_irqrestore(&prv->lock, flags); > > +kill_timer(sdom->repl_timer); > +xfree(sdom->repl_timer); Any particular reason for moving the kill_timer() as well as the xfree outside the lock? What happens if the timer goes off after the irqrestore but before the kill_timer? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 0/3] Various XSA followups
XSA-219 was discovered while trying to implement the bugfix in patch 3. Andrew Cooper (3): [RFC] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result x86/hvm: Break out __hvm_copy()'s translation logic x86/hvm: Implement hvmemul_write() using real mappings Alexandru Isaila (2): x86/hvm: Break out __hvm_copy()'s translation logic x86/hvm: Implement hvmemul_write() using real mappings ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 2/3] x86/hvm: Break out __hvm_copy()'s translation logic
From: Andrew Cooper It will be reused by later changes. Signed-off-by: Andrew Cooper Signed-off-by: Alexandru Isaila Reviewed-by: Paul Durrant Acked-by: Jan Beulich --- xen/arch/x86/hvm/hvm.c| 144 +++--- xen/include/asm-x86/hvm/support.h | 12 2 files changed, 98 insertions(+), 58 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 488acbf..93394c1 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3069,6 +3069,83 @@ void hvm_task_switch( hvm_unmap_entry(nptss_desc); } +enum hvm_translation_result hvm_translate_get_page( +struct vcpu *v, unsigned long addr, bool linear, uint32_t pfec, +pagefault_info_t *pfinfo, struct page_info **page_p, +gfn_t *gfn_p, p2m_type_t *p2mt_p) +{ +struct page_info *page; +p2m_type_t p2mt; +gfn_t gfn; + +if ( linear ) +{ +gfn = _gfn(paging_gva_to_gfn(v, addr, &pfec)); + +if ( gfn_eq(gfn, INVALID_GFN) ) +{ +if ( pfec & PFEC_page_paged ) +return HVMTRANS_gfn_paged_out; + +if ( pfec & PFEC_page_shared ) +return HVMTRANS_gfn_shared; + +if ( pfinfo ) +{ +pfinfo->linear = addr; +pfinfo->ec = pfec & ~PFEC_implicit; +} + +return HVMTRANS_bad_linear_to_gfn; +} +} +else +{ +gfn = gaddr_to_gfn(addr); +ASSERT(!pfinfo); +} + +/* + * No need to do the P2M lookup for internally handled MMIO, benefiting + * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses, + * - newer Windows (like Server 2012) for HPET accesses. + */ +if ( v == current + && !nestedhvm_vcpu_in_guestmode(v) + && hvm_mmio_internal(gfn_to_gaddr(gfn)) ) +return HVMTRANS_bad_gfn_to_mfn; + +page = get_page_from_gfn(v->domain, gfn_x(gfn), &p2mt, P2M_UNSHARE); + +if ( !page ) +return HVMTRANS_bad_gfn_to_mfn; + +if ( p2m_is_paging(p2mt) ) +{ +put_page(page); +p2m_mem_paging_populate(v->domain, gfn_x(gfn)); +return HVMTRANS_gfn_paged_out; +} +if ( p2m_is_shared(p2mt) ) +{ +put_page(page); +return HVMTRANS_gfn_shared; +} +if ( p2m_is_grant(p2mt) ) +{ +put_page(page); +return HVMTRANS_unhandleable; +} + +*page_p = page; +if ( gfn_p ) +*gfn_p = gfn; +if ( p2mt_p ) +*p2mt_p = p2mt; + +return HVMTRANS_okay; +} + #define HVMCOPY_from_guest (0u<<0) #define HVMCOPY_to_guest (1u<<0) #define HVMCOPY_phys (0u<<2) @@ -3077,7 +3154,7 @@ static enum hvm_translation_result __hvm_copy( void *buf, paddr_t addr, int size, struct vcpu *v, unsigned int flags, uint32_t pfec, pagefault_info_t *pfinfo) { -unsigned long gfn; +gfn_t gfn; struct page_info *page; p2m_type_t p2mt; char *p; @@ -3103,65 +3180,15 @@ static enum hvm_translation_result __hvm_copy( while ( todo > 0 ) { +enum hvm_translation_result res; paddr_t gpa = addr & ~PAGE_MASK; count = min_t(int, PAGE_SIZE - gpa, todo); -if ( flags & HVMCOPY_linear ) -{ -gfn = paging_gva_to_gfn(v, addr, &pfec); -if ( gfn == gfn_x(INVALID_GFN) ) -{ -if ( pfec & PFEC_page_paged ) -return HVMTRANS_gfn_paged_out; -if ( pfec & PFEC_page_shared ) -return HVMTRANS_gfn_shared; -if ( pfinfo ) -{ -pfinfo->linear = addr; -pfinfo->ec = pfec & ~PFEC_implicit; -} -return HVMTRANS_bad_linear_to_gfn; -} -gpa |= (paddr_t)gfn << PAGE_SHIFT; -} -else -{ -gfn = addr >> PAGE_SHIFT; -gpa = addr; -} - -/* - * No need to do the P2M lookup for internally handled MMIO, benefiting - * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses, - * - newer Windows (like Server 2012) for HPET accesses. - */ -if ( v == current - && !nestedhvm_vcpu_in_guestmode(v) - && hvm_mmio_internal(gpa) ) -return HVMTRANS_bad_gfn_to_mfn; - -page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_UNSHARE); - -if ( !page ) -return HVMTRANS_bad_gfn_to_mfn; - -if ( p2m_is_paging(p2mt) ) -{ -put_page(page); -p2m_mem_paging_populate(v->domain, gfn); -return HVMTRANS_gfn_paged_out; -} -if ( p2m_is_shared(p2mt) ) -{ -put_page(page); -return HVMTRANS_gfn_shared; -} -if ( p2m_is_grant(p2mt) ) -{ -put_page(page); -return HVMTRANS_unhandleable; -} +res = hvm_translate_ge
[Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings
From: Andrew Cooper An access which crosses a page boundary is performed atomically by x86 hardware, albeit with a severe performance penalty. An important corner case is when a straddled access hits two pages which differ in whether a translation exists, or in net access rights. The use of hvm_copy*() in hvmemul_write() is problematic, because it performs a translation then completes the partial write, before moving onto the next translation. If an individual emulated write straddles two pages, the first of which is writable, and the second of which is not, the first half of the write will complete before #PF is raised from the second half. This results in guest state corruption as a side effect of emulation, which has been observed to cause windows to crash while under introspection. Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an entire contents of a linear access, and vmap() the underlying frames to provide a contiguous virtual mapping for the emulator to use. This is the same mechanism as used by the shadow emulation code. This will catch any translation issues and abort the emulation before any modifications occur. Signed-off-by: Andrew Cooper Signed-off-by: Alexandru Isaila --- Changes since V3: - Changed the if condition form ... > ... -1 to ... >= ... - Removed long cast to the err var - Moved the mfn++ closer to the end of the loop - Integrated the ++frame in the while condition --- xen/arch/x86/hvm/emulate.c| 175 ++ xen/include/asm-x86/hvm/emulate.h | 7 ++ 2 files changed, 165 insertions(+), 17 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index cc874ce..9866d64 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -498,6 +498,157 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa, } /* + * Map the frame(s) covering an individual linear access, for writeable + * access. May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings. + * + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is + * clean before use, and poisions unused slots with INVALID_MFN. + */ +static void *hvmemul_map_linear_addr( +unsigned long linear, unsigned int bytes, uint32_t pfec, +struct hvm_emulate_ctxt *hvmemul_ctxt) +{ +struct vcpu *curr = current; +void *err, *mapping; + +/* First and final gfns which need mapping. */ +unsigned long frame = linear >> PAGE_SHIFT, first = frame; +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT; + +/* + * mfn points to the next free slot. All used slots have a page reference + * held on them. + */ +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; + +/* + * The caller has no legitimate reason for trying a zero-byte write, but + * final is calculate to fail safe in release builds. + * + * The maximum write size depends on the number of adjacent mfns[] which + * can be vmap()'d, accouting for possible misalignment within the region. + * The higher level emulation callers are responsible for ensuring that + * mfns[] is large enough for the requested write size. + */ +if ( bytes == 0 || + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) ) +{ +ASSERT_UNREACHABLE(); +goto unhandleable; +} + +do { +enum hvm_translation_result res; +struct page_info *page; +pagefault_info_t pfinfo; +p2m_type_t p2mt; + +/* Error checking. Confirm that the current slot is clean. */ +ASSERT(mfn_x(*mfn) == 0); + +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec, + &pfinfo, &page, NULL, &p2mt); + +switch ( res ) +{ +case HVMTRANS_okay: +break; + +case HVMTRANS_bad_linear_to_gfn: +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); +err = ERR_PTR(~X86EMUL_EXCEPTION); +goto out; + +case HVMTRANS_bad_gfn_to_mfn: +err = NULL; +goto out; + +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: +err = ERR_PTR(~X86EMUL_RETRY); +goto out; + +default: +goto unhandleable; +} + +if ( p2m_is_discard_write(p2mt) ) +{ +err = ERR_PTR(~X86EMUL_OKAY); +goto out; +} + +*mfn++ = _mfn(page_to_mfn(page)); + +} while ( ++frame < final ); + +/* Entire access within a single frame? */ +if ( first == final ) +mapping = map_domain_page(hvmemul_ctxt->mfn[0]); +/* Multiple frames? Need to vmap(). */ +else if ( (mapping = vmap(hvmemul_ctxt->mfn, + final - first + 1)) == NULL ) +goto unhandleable; + +#ifndef NDEBUG /* Poision unused mfn
[Xen-devel] [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
From: Andrew Cooper Signed-off-by: Andrew Cooper Acked-by: Tim Deegan Acked-by: Jan Beulich Reviewed-by: Kevin Tian Acked-by: George Dunlap --- xen/arch/x86/hvm/dom0_build.c | 2 +- xen/arch/x86/hvm/emulate.c| 40 ++-- xen/arch/x86/hvm/hvm.c| 56 +++ xen/arch/x86/hvm/intercept.c | 20 +++--- xen/arch/x86/hvm/svm/nestedsvm.c | 5 ++-- xen/arch/x86/hvm/svm/svm.c| 2 +- xen/arch/x86/hvm/viridian.c | 2 +- xen/arch/x86/hvm/vmsi.c | 2 +- xen/arch/x86/hvm/vmx/realmode.c | 2 +- xen/arch/x86/hvm/vmx/vvmx.c | 14 +- xen/arch/x86/mm/shadow/common.c | 12 - xen/common/libelf/libelf-loader.c | 4 +-- xen/include/asm-x86/hvm/support.h | 40 ++-- 13 files changed, 101 insertions(+), 100 deletions(-) diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index 020c355..e8f746c 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -238,7 +238,7 @@ static int __init pvh_setup_vmx_realmode_helpers(struct domain *d) if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), &gaddr) ) { if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) != - HVMCOPY_okay ) + HVMTRANS_okay ) printk("Unable to zero VM86 TSS area\n"); d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] = VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr; diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 54811c1..cc874ce 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -100,7 +100,7 @@ static int ioreq_server_read(const struct hvm_io_handler *io_handler, uint32_t size, uint64_t *data) { -if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay ) +if ( hvm_copy_from_guest_phys(data, addr, size) != HVMTRANS_okay ) return X86EMUL_UNHANDLEABLE; return X86EMUL_OKAY; @@ -893,18 +893,18 @@ static int __hvmemul_read( switch ( rc ) { -case HVMCOPY_okay: +case HVMTRANS_okay: break; -case HVMCOPY_bad_gva_to_gfn: +case HVMTRANS_bad_linear_to_gfn: x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); return X86EMUL_EXCEPTION; -case HVMCOPY_bad_gfn_to_mfn: +case HVMTRANS_bad_gfn_to_mfn: if ( access_type == hvm_access_insn_fetch ) return X86EMUL_UNHANDLEABLE; return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 0); -case HVMCOPY_gfn_paged_out: -case HVMCOPY_gfn_shared: +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: return X86EMUL_RETRY; default: return X86EMUL_UNHANDLEABLE; @@ -1012,15 +1012,15 @@ static int hvmemul_write( switch ( rc ) { -case HVMCOPY_okay: +case HVMTRANS_okay: break; -case HVMCOPY_bad_gva_to_gfn: +case HVMTRANS_bad_linear_to_gfn: x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); return X86EMUL_EXCEPTION; -case HVMCOPY_bad_gfn_to_mfn: +case HVMTRANS_bad_gfn_to_mfn: return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0); -case HVMCOPY_gfn_paged_out: -case HVMCOPY_gfn_shared: +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: return X86EMUL_RETRY; default: return X86EMUL_UNHANDLEABLE; @@ -1384,7 +1384,7 @@ static int hvmemul_rep_movs( return rc; } -rc = HVMCOPY_okay; +rc = HVMTRANS_okay; } else /* @@ -1394,16 +1394,16 @@ static int hvmemul_rep_movs( */ rc = hvm_copy_from_guest_phys(buf, sgpa, bytes); -if ( rc == HVMCOPY_okay ) +if ( rc == HVMTRANS_okay ) rc = hvm_copy_to_guest_phys(dgpa, buf, bytes, current); xfree(buf); -if ( rc == HVMCOPY_gfn_paged_out ) +if ( rc == HVMTRANS_gfn_paged_out ) return X86EMUL_RETRY; -if ( rc == HVMCOPY_gfn_shared ) +if ( rc == HVMTRANS_gfn_shared ) return X86EMUL_RETRY; -if ( rc != HVMCOPY_okay ) +if ( rc != HVMTRANS_okay ) { gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%" PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n", @@ -1513,10 +1513,10 @@ static int hvmemul_rep_stos( switch ( rc ) { -case HVMCOPY_gfn_paged_out: -case HVMCOPY_gfn_shared: +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: return X86EMUL_RETRY; -case HVMCOPY_okay: +case HVMTRANS_okay: return X86EMUL_OKAY; } @@ -2172,7 +2172,7 @@ void hvm_emulate_init_per_insn( &addr) && hvm_fetch_from_guest_linear(hvmemul_ctxt
Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures
>>> On 20.09.17 at 08:34, wrote: > @@ -3381,75 +3425,21 @@ grant_table_create( > struct domain *d) > { > struct grant_table *t; > -unsigned int i, j; > > if ( (t = xzalloc(struct grant_table)) == NULL ) > -goto no_mem_0; > +return -ENOMEM; > > /* Simple stuff. */ > percpu_rwlock_resource_init(&t->lock, grant_rwlock); > spin_lock_init(&t->maptrack_lock); > -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES; > - > -/* Active grant table. */ > -if ( (t->active = xzalloc_array(struct active_grant_entry *, > -max_nr_active_grant_frames)) == NULL ) > -goto no_mem_1; > -for ( i = 0; > - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ ) > -{ > -if ( (t->active[i] = alloc_xenheap_page()) == NULL ) > -goto no_mem_2; > -clear_page(t->active[i]); > -for ( j = 0; j < ACGNT_PER_PAGE; j++ ) > -spin_lock_init(&t->active[i][j].lock); > -} > - > -/* Tracking of mapped foreign frames table */ > -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack)); > -if ( t->maptrack == NULL ) > -goto no_mem_2; > - > -/* Shared grant table. */ > -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == NULL ) > -goto no_mem_3; > -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) > -{ > -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL ) > -goto no_mem_4; > -clear_page(t->shared_raw[i]); > -} > - > -/* Status pages for grant table - for version 2 */ > -t->status = xzalloc_array(grant_status_t *, > - grant_to_status_frames(max_grant_frames)); > -if ( t->status == NULL ) > -goto no_mem_4; > - > -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) > -gnttab_create_shared_page(d, t, i); > - > -t->nr_status_frames = 0; > > /* Okay, install the structure. */ > d->grant_table = t; > -return 0; > > - no_mem_4: > -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) > -free_xenheap_page(t->shared_raw[i]); > -xfree(t->shared_raw); > - no_mem_3: > -vfree(t->maptrack); > - no_mem_2: > -for ( i = 0; > - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ ) > -free_xenheap_page(t->active[i]); > -xfree(t->active); > - no_mem_1: > -xfree(t); > - no_mem_0: > -return -ENOMEM; > +if ( d->domain_id == 0 ) > +return grant_table_init(t); > + > +return 0; > } > > void > @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, unsigned > int grant_frames, > > grant_write_lock(gt); > > -ret = 0; > -/* Set limits, alloc needed arrays. */ > +/* Set limits. */ > +if ( !gt->active ) > +ret = grant_table_init(gt); > > grant_write_unlock(gt); These changes don't leave the domains in a state similar to that before the change - I'm missing calls to gnttab_grow_table() to establish the minimal sizes. Aiui so far there has been no requirement for a domain to invoke GNTTABOP_setup_table if it is happy with v1 and the minimum size. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 06/15] x86: implement get value interface for MBA
On 17-09-20 09:43:40, Roger Pau Monn� wrote: > On Wed, Sep 20, 2017 at 01:09:06PM +0800, Yi Sun wrote: > > On 17-09-19 10:15:42, Roger Pau Monn� wrote: > > > On Tue, Sep 05, 2017 at 05:32:28PM +0800, Yi Sun wrote: > > > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > > > > index 696eff2..7902af7 100644 > > > > --- a/xen/arch/x86/domctl.c > > > > +++ b/xen/arch/x86/domctl.c > > > > @@ -1496,6 +1496,13 @@ long arch_do_domctl( > > > > copyback = true; > > > > break; > > > > > > > > +case XEN_DOMCTL_PSR_ALLOC_GET_MBA_THRTL: > > > > +ret = psr_get_val(d, domctl->u.psr_alloc.target, > > > > + &val32, PSR_TYPE_MBA_THRTL); > > > > +domctl->u.psr_alloc.data = val32; > > > > > > Hm, why does psr_get_val take a uint32_t * instead of a uint64_t *? So > > > that you can directly pass &domctl->u.psr_alloc.data. > > > > > > Or the other way around, why is domctl->u.psr_alloc.data a uint64_t > > > instead of a uint32_t? > > > > > There is a historical reason. The COS MSR is 64bit. So, the original codes > > in L3 CAT (submitted years ago) used uint64_t. > > > > But during L2 CAT review, per Jan's comment, the uint64_t is not necessary > > in psr.c. So, we convert it to uint32_t in psr.c and make the codes you see > > here. > > Since this is a DOMCTL, you can change the type of the structure to be > an uint32_t, so that you can pass it directly (unless I'm missing > something else that requires this to be uint64_t). > The tools layer implementation uses uint64_t according to SDM definition. So, we have to do a convertion here or in tools/. Or, we need modify interfaces in tools layer. As CAT interfaces in tools layer have been released and it is a natural choice to make it same as SDM, I think should keep uint64_t in tools at least. If so, we must do the convertion somewhere. Then, I think it is not so necessary to change current codes. Is that good to you? > Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 12/15] xen/arm: move arch specific grant table bits into grant_table.c
>>> On 20.09.17 at 08:34, wrote: > Instead of attaching the ARM specific grant table data to the domain > structure add it to struct grant_table. Add the needed arch functions > to the asm-*/grant_table.h includes. > > Signed-off-by: Juergen Gross > Reviewed-by: Paul Durrant Non-ARM parts Acked-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] minimum Python version
>>> On 20.09.17 at 10:49, wrote: > What about the following idea: maybe we could add an option to configure > to make it fail instead of disabling components in case some > requirements for that component are not met? The default could be to > just disable the components and with the option specified all components > not disabled explicitly must be buildable or configure will fail. If that's something that (a) can be reasonably expressed in configure.ac and (b) the tools maintainers can live with, that's certainly an option. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings
> -Original Message- > From: Alexandru Isaila [mailto:aisa...@bitdefender.com] > Sent: 20 September 2017 10:23 > To: xen-devel@lists.xen.org > Cc: Tim (Xen.org) ; George Dunlap > ; jbeul...@suse.com; Andrew Cooper > ; Ian Jackson ; > konrad.w...@oracle.com; sstabell...@kernel.org; Wei Liu > ; Paul Durrant ; > boris.ostrov...@oracle.com; suravee.suthikulpa...@amd.com; > jun.nakaj...@intel.com; Kevin Tian ; Alexandru Isaila > > Subject: [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real > mappings > > From: Andrew Cooper > > An access which crosses a page boundary is performed atomically by x86 > hardware, albeit with a severe performance penalty. An important corner > case > is when a straddled access hits two pages which differ in whether a > translation exists, or in net access rights. > > The use of hvm_copy*() in hvmemul_write() is problematic, because it > performs > a translation then completes the partial write, before moving onto the next > translation. > > If an individual emulated write straddles two pages, the first of which is > writable, and the second of which is not, the first half of the write will > complete before #PF is raised from the second half. > > This results in guest state corruption as a side effect of emulation, which > has been observed to cause windows to crash while under introspection. > > Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an > entire contents of a linear access, and vmap() the underlying frames to > provide a contiguous virtual mapping for the emulator to use. This is the > same mechanism as used by the shadow emulation code. > > This will catch any translation issues and abort the emulation before any > modifications occur. > > Signed-off-by: Andrew Cooper > Signed-off-by: Alexandru Isaila > > --- > Changes since V3: > - Changed the if condition form ... > ... -1 to ... >= ... > - Removed long cast to the err var > - Moved the mfn++ closer to the end of the loop > - Integrated the ++frame in the while condition > --- > xen/arch/x86/hvm/emulate.c| 175 > ++ > xen/include/asm-x86/hvm/emulate.h | 7 ++ > 2 files changed, 165 insertions(+), 17 deletions(-) > > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > index cc874ce..9866d64 100644 > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -498,6 +498,157 @@ static int hvmemul_do_mmio_addr(paddr_t > mmio_gpa, > } > > /* > + * Map the frame(s) covering an individual linear access, for writeable > + * access. May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other > errors > + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings. > + * > + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is > + * clean before use, and poisions unused slots with INVALID_MFN. > + */ > +static void *hvmemul_map_linear_addr( > +unsigned long linear, unsigned int bytes, uint32_t pfec, > +struct hvm_emulate_ctxt *hvmemul_ctxt) > +{ > +struct vcpu *curr = current; > +void *err, *mapping; > + > +/* First and final gfns which need mapping. */ > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT; > + > +/* > + * mfn points to the next free slot. All used slots have a page > reference > + * held on them. > + */ > +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > + > +/* > + * The caller has no legitimate reason for trying a zero-byte write, but > + * final is calculate to fail safe in release builds. > + * > + * The maximum write size depends on the number of adjacent mfns[] > which > + * can be vmap()'d, accouting for possible misalignment within the > region. > + * The higher level emulation callers are responsible for ensuring that > + * mfns[] is large enough for the requested write size. > + */ > +if ( bytes == 0 || > + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) ) > +{ > +ASSERT_UNREACHABLE(); > +goto unhandleable; > +} > + > +do { > +enum hvm_translation_result res; > +struct page_info *page; > +pagefault_info_t pfinfo; > +p2m_type_t p2mt; > + > +/* Error checking. Confirm that the current slot is clean. */ > +ASSERT(mfn_x(*mfn) == 0); > + > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec, > + &pfinfo, &page, NULL, &p2mt); > + > +switch ( res ) > +{ > +case HVMTRANS_okay: > +break; > + > +case HVMTRANS_bad_linear_to_gfn: > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, > &hvmemul_ctxt->ctxt); > +err = ERR_PTR(~X86EMUL_EXCEPTION); > +goto out; > + > +case HVMTRANS_bad_gfn_to_mfn: > +err = NULL; > +goto out; > + > +
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
>>> On 20.09.17 at 10:58, wrote: > On 20/09/17 10:57, Jan Beulich wrote: > On 20.09.17 at 08:34, wrote: >>> --- a/xen/arch/x86/mm.c >>> +++ b/xen/arch/x86/mm.c >>> @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct > cpu_user_regs *regs) >>> return 0; >>> } >>> >>> +unsigned long arch_get_upper_mfn_bound(void) >>> +{ >>> +unsigned long max_mfn; >>> + >>> +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page; >> >> Taking into account the code in the caller of this function as well >> as the ARM counterpart I find the use of max_page here odd. I'd >> prefer if get_upper_mfn_bound() went away altogether, and it's >> sole caller (which strangely enough doesn't get introduced here) >> called the arch function directly. Additionally, with the caller being >> a sysctl, how is that supposed to help a PV DomU kernel in their >> choice of grant table version? > > Did you look at patch 15? Not yet, no (I had looked over the titles, but this one's didn't make me make the connection). So yes, that addresses the PV DomU concern. Still I'd like to get away without the thin common wrapper around the arch specific actual implementation (and I don't care much whether the resulting function has an arch_ prefix). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
On 20/09/17 11:32, Jan Beulich wrote: On 20.09.17 at 10:58, wrote: >> On 20/09/17 10:57, Jan Beulich wrote: >> On 20.09.17 at 08:34, wrote: --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -6312,6 +6312,17 @@ int pv_ro_page_fault(unsigned long addr, struct >> cpu_user_regs *regs) return 0; } +unsigned long arch_get_upper_mfn_bound(void) +{ +unsigned long max_mfn; + +max_mfn = mem_hotplug ? PFN_DOWN(mem_hotplug) : max_page; >>> >>> Taking into account the code in the caller of this function as well >>> as the ARM counterpart I find the use of max_page here odd. I'd >>> prefer if get_upper_mfn_bound() went away altogether, and it's >>> sole caller (which strangely enough doesn't get introduced here) >>> called the arch function directly. Additionally, with the caller being >>> a sysctl, how is that supposed to help a PV DomU kernel in their >>> choice of grant table version? >> >> Did you look at patch 15? > > Not yet, no (I had looked over the titles, but this one's didn't > make me make the connection). So yes, that addresses the PV > DomU concern. Still I'd like to get away without the thin common > wrapper around the arch specific actual implementation (and I > don't care much whether the resulting function has an arch_ > prefix). Okay, I'll remove the common wrapper and drop the arch_ from the x86 and arm variants. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
>>> On 20.09.17 at 11:22, wrote: > From: Andrew Cooper > > Signed-off-by: Andrew Cooper > Acked-by: Tim Deegan > Acked-by: Jan Beulich > Reviewed-by: Kevin Tian > Acked-by: George Dunlap I'm sorry to say that, but this time you've lost Paul's R-b. Please pay more attention to such aspects in the future. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures
On 20/09/17 11:22, Jan Beulich wrote: On 20.09.17 at 08:34, wrote: >> @@ -3381,75 +3425,21 @@ grant_table_create( >> struct domain *d) >> { >> struct grant_table *t; >> -unsigned int i, j; >> >> if ( (t = xzalloc(struct grant_table)) == NULL ) >> -goto no_mem_0; >> +return -ENOMEM; >> >> /* Simple stuff. */ >> percpu_rwlock_resource_init(&t->lock, grant_rwlock); >> spin_lock_init(&t->maptrack_lock); >> -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES; >> - >> -/* Active grant table. */ >> -if ( (t->active = xzalloc_array(struct active_grant_entry *, >> -max_nr_active_grant_frames)) == NULL ) >> -goto no_mem_1; >> -for ( i = 0; >> - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ ) >> -{ >> -if ( (t->active[i] = alloc_xenheap_page()) == NULL ) >> -goto no_mem_2; >> -clear_page(t->active[i]); >> -for ( j = 0; j < ACGNT_PER_PAGE; j++ ) >> -spin_lock_init(&t->active[i][j].lock); >> -} >> - >> -/* Tracking of mapped foreign frames table */ >> -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack)); >> -if ( t->maptrack == NULL ) >> -goto no_mem_2; >> - >> -/* Shared grant table. */ >> -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == NULL ) >> -goto no_mem_3; >> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) >> -{ >> -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL ) >> -goto no_mem_4; >> -clear_page(t->shared_raw[i]); >> -} >> - >> -/* Status pages for grant table - for version 2 */ >> -t->status = xzalloc_array(grant_status_t *, >> - grant_to_status_frames(max_grant_frames)); >> -if ( t->status == NULL ) >> -goto no_mem_4; >> - >> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) >> -gnttab_create_shared_page(d, t, i); >> - >> -t->nr_status_frames = 0; >> >> /* Okay, install the structure. */ >> d->grant_table = t; >> -return 0; >> >> - no_mem_4: >> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) >> -free_xenheap_page(t->shared_raw[i]); >> -xfree(t->shared_raw); >> - no_mem_3: >> -vfree(t->maptrack); >> - no_mem_2: >> -for ( i = 0; >> - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ ) >> -free_xenheap_page(t->active[i]); >> -xfree(t->active); >> - no_mem_1: >> -xfree(t); >> - no_mem_0: >> -return -ENOMEM; >> +if ( d->domain_id == 0 ) >> +return grant_table_init(t); >> + >> +return 0; >> } >> >> void >> @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, unsigned >> int grant_frames, >> >> grant_write_lock(gt); >> >> -ret = 0; >> -/* Set limits, alloc needed arrays. */ >> +/* Set limits. */ >> +if ( !gt->active ) >> +ret = grant_table_init(gt); >> >> grant_write_unlock(gt); > > These changes don't leave the domains in a state similar to that > before the change - I'm missing calls to gnttab_grow_table() to > establish the minimal sizes. Aiui so far there has been no > requirement for a domain to invoke GNTTABOP_setup_table if it > is happy with v1 and the minimum size. So you don't like gnttab_grow_table() being called when the guest tries to map the grant frames in this case? I can add the call of gnttab_grow_table() to grant_table_set_limits() in case you like that better. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures
>>> On 20.09.17 at 11:44, wrote: > On 20/09/17 11:22, Jan Beulich wrote: > On 20.09.17 at 08:34, wrote: >>> @@ -3381,75 +3425,21 @@ grant_table_create( >>> struct domain *d) >>> { >>> struct grant_table *t; >>> -unsigned int i, j; >>> >>> if ( (t = xzalloc(struct grant_table)) == NULL ) >>> -goto no_mem_0; >>> +return -ENOMEM; >>> >>> /* Simple stuff. */ >>> percpu_rwlock_resource_init(&t->lock, grant_rwlock); >>> spin_lock_init(&t->maptrack_lock); >>> -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES; >>> - >>> -/* Active grant table. */ >>> -if ( (t->active = xzalloc_array(struct active_grant_entry *, >>> -max_nr_active_grant_frames)) == NULL ) >>> -goto no_mem_1; >>> -for ( i = 0; >>> - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ >>> ) >>> -{ >>> -if ( (t->active[i] = alloc_xenheap_page()) == NULL ) >>> -goto no_mem_2; >>> -clear_page(t->active[i]); >>> -for ( j = 0; j < ACGNT_PER_PAGE; j++ ) >>> -spin_lock_init(&t->active[i][j].lock); >>> -} >>> - >>> -/* Tracking of mapped foreign frames table */ >>> -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack)); >>> -if ( t->maptrack == NULL ) >>> -goto no_mem_2; >>> - >>> -/* Shared grant table. */ >>> -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == NULL >>> ) >>> -goto no_mem_3; >>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) >>> -{ >>> -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL ) >>> -goto no_mem_4; >>> -clear_page(t->shared_raw[i]); >>> -} >>> - >>> -/* Status pages for grant table - for version 2 */ >>> -t->status = xzalloc_array(grant_status_t *, >>> - grant_to_status_frames(max_grant_frames)); >>> -if ( t->status == NULL ) >>> -goto no_mem_4; >>> - >>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) >>> -gnttab_create_shared_page(d, t, i); >>> - >>> -t->nr_status_frames = 0; >>> >>> /* Okay, install the structure. */ >>> d->grant_table = t; >>> -return 0; >>> >>> - no_mem_4: >>> -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) >>> -free_xenheap_page(t->shared_raw[i]); >>> -xfree(t->shared_raw); >>> - no_mem_3: >>> -vfree(t->maptrack); >>> - no_mem_2: >>> -for ( i = 0; >>> - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ >>> ) >>> -free_xenheap_page(t->active[i]); >>> -xfree(t->active); >>> - no_mem_1: >>> -xfree(t); >>> - no_mem_0: >>> -return -ENOMEM; >>> +if ( d->domain_id == 0 ) >>> +return grant_table_init(t); >>> + >>> +return 0; >>> } >>> >>> void >>> @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, unsigned >>> int grant_frames, >>> >>> grant_write_lock(gt); >>> >>> -ret = 0; >>> -/* Set limits, alloc needed arrays. */ >>> +/* Set limits. */ >>> +if ( !gt->active ) >>> +ret = grant_table_init(gt); >>> >>> grant_write_unlock(gt); >> >> These changes don't leave the domains in a state similar to that >> before the change - I'm missing calls to gnttab_grow_table() to >> establish the minimal sizes. Aiui so far there has been no >> requirement for a domain to invoke GNTTABOP_setup_table if it >> is happy with v1 and the minimum size. > > So you don't like gnttab_grow_table() being called when the guest tries > to map the grant frames in this case? I don't like it being called _only_ in that case. > I can add the call of gnttab_grow_table() to grant_table_set_limits() in > case you like that better. And to grant_table_create() for the Dom0 case. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 11/15] xen: delay allocation of grant table sub structures
On 20/09/17 12:02, Jan Beulich wrote: On 20.09.17 at 11:44, wrote: >> On 20/09/17 11:22, Jan Beulich wrote: >> On 20.09.17 at 08:34, wrote: @@ -3381,75 +3425,21 @@ grant_table_create( struct domain *d) { struct grant_table *t; -unsigned int i, j; if ( (t = xzalloc(struct grant_table)) == NULL ) -goto no_mem_0; +return -ENOMEM; /* Simple stuff. */ percpu_rwlock_resource_init(&t->lock, grant_rwlock); spin_lock_init(&t->maptrack_lock); -t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES; - -/* Active grant table. */ -if ( (t->active = xzalloc_array(struct active_grant_entry *, -max_nr_active_grant_frames)) == NULL ) -goto no_mem_1; -for ( i = 0; - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ ) -{ -if ( (t->active[i] = alloc_xenheap_page()) == NULL ) -goto no_mem_2; -clear_page(t->active[i]); -for ( j = 0; j < ACGNT_PER_PAGE; j++ ) -spin_lock_init(&t->active[i][j].lock); -} - -/* Tracking of mapped foreign frames table */ -t->maptrack = vzalloc(max_maptrack_frames * sizeof(*t->maptrack)); -if ( t->maptrack == NULL ) -goto no_mem_2; - -/* Shared grant table. */ -if ( (t->shared_raw = xzalloc_array(void *, max_grant_frames)) == NULL ) -goto no_mem_3; -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) -{ -if ( (t->shared_raw[i] = alloc_xenheap_page()) == NULL ) -goto no_mem_4; -clear_page(t->shared_raw[i]); -} - -/* Status pages for grant table - for version 2 */ -t->status = xzalloc_array(grant_status_t *, - grant_to_status_frames(max_grant_frames)); -if ( t->status == NULL ) -goto no_mem_4; - -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) -gnttab_create_shared_page(d, t, i); - -t->nr_status_frames = 0; /* Okay, install the structure. */ d->grant_table = t; -return 0; - no_mem_4: -for ( i = 0; i < INITIAL_NR_GRANT_FRAMES; i++ ) -free_xenheap_page(t->shared_raw[i]); -xfree(t->shared_raw); - no_mem_3: -vfree(t->maptrack); - no_mem_2: -for ( i = 0; - i < num_act_frames_from_sha_frames(INITIAL_NR_GRANT_FRAMES); i++ ) -free_xenheap_page(t->active[i]); -xfree(t->active); - no_mem_1: -xfree(t); - no_mem_0: -return -ENOMEM; +if ( d->domain_id == 0 ) +return grant_table_init(t); + +return 0; } void @@ -3651,8 +3641,9 @@ int grant_table_set_limits(struct domain *d, unsigned int grant_frames, grant_write_lock(gt); -ret = 0; -/* Set limits, alloc needed arrays. */ +/* Set limits. */ +if ( !gt->active ) +ret = grant_table_init(gt); grant_write_unlock(gt); >>> >>> These changes don't leave the domains in a state similar to that >>> before the change - I'm missing calls to gnttab_grow_table() to >>> establish the minimal sizes. Aiui so far there has been no >>> requirement for a domain to invoke GNTTABOP_setup_table if it >>> is happy with v1 and the minimum size. >> >> So you don't like gnttab_grow_table() being called when the guest tries >> to map the grant frames in this case? > > I don't like it being called _only_ in that case. > >> I can add the call of gnttab_grow_table() to grant_table_set_limits() in >> case you like that better. > > And to grant_table_create() for the Dom0 case. So in the end: to grant_table_init(). Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation
On 09/20/2017 10:19 AM, George Dunlap wrote: > On 09/19/2017 03:11 AM, Dario Faggioli wrote: >> In commit ad4b3e1e9df34 ("xen: credit2: implement >> utilization cap") xfree() was being called (for >> deallocating the budget replenishment timer, during >> domain destruction) inside an IRQ disabled critical >> section. >> >> That must not happen, as it uses the mem-pool's lock, >> which needs to be taken with IRQ enabled. And, in fact, >> we crash (in debug builds): >> >> (XEN) >> (XEN) Panic on CPU 0: >> (XEN) Xen BUG at spinlock.c:47 >> (XEN) >> >> Let's, therefore, kill and deallocate the timer outside of >> the critical sections, when IRQs are enabled. >> >> Signed-off-by: Dario Faggioli >> --- >> Cc: osstest service owner >> Cc: George Dunlap >> Cc: Wei Liu >> --- >> This was spotted by OSSTest's flight 113562: >> >> http://logs.test-lab.xenproject.org/osstest/logs/113562/ >> >> http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log >> --- >> xen/common/sched_credit2.c |6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c >> index 32234ac..7a550db 100644 >> --- a/xen/common/sched_credit2.c >> +++ b/xen/common/sched_credit2.c >> @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct scheduler *ops, >> void *data) >> >> write_lock_irqsave(&prv->lock, flags); >> >> -kill_timer(sdom->repl_timer); >> -xfree(sdom->repl_timer); >> - >> list_del_init(&sdom->sdom_elem); >> >> write_unlock_irqrestore(&prv->lock, flags); >> >> +kill_timer(sdom->repl_timer); >> +xfree(sdom->repl_timer); > > Any particular reason for moving the kill_timer() as well as the xfree > outside the lock? What happens if the timer goes off after the > irqrestore but before the kill_timer? Looks like if the timer fires, nothing terribly bad will happen; it will just do a useless budget replenishment. It looks like kill_timer() disables irqs, so it could be moved inside the critical section. I'm inclined to say we should do so. I don't anticipate the budget replenishment ever to need to walk the domain list, but should that change, this would be a bear of a bug to find. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain
>>> On 20.09.17 at 08:34, wrote: > --- a/xen/common/compat/grant_table.c > +++ b/xen/common/compat/grant_table.c > @@ -157,21 +157,14 @@ int compat_grant_table_op(unsigned int cmd, > unsigned int max_frame_list_size_in_page = > (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.setup)) / > sizeof(*nat.setup->frame_list.p); > -if ( max_frame_list_size_in_page < max_grant_frames ) The latest here, but perhaps even earlier I think max_grant_frames should become static, so one can be reasonably certain that all other references are gone. > @@ -290,8 +293,8 @@ num_act_frames_from_sha_frames(const unsigned int num) > return DIV_ROUND_UP(num * sha_per_page, ACGNT_PER_PAGE); > } > > -#define max_nr_active_grant_frames \ > -num_act_frames_from_sha_frames(max_grant_frames) > +#define max_nr_active_grant_frames(gt) \ > +num_act_frames_from_sha_frames(gt->max_grant_frames) Parentheses around gt please. > @@ -1718,8 +1724,9 @@ gnttab_grow_table(struct domain *d, unsigned int > req_nr_frames) > ASSERT(gt->active); > > if ( req_nr_frames < INITIAL_NR_GRANT_FRAMES ) > -req_nr_frames = INITIAL_NR_GRANT_FRAMES; > -ASSERT(req_nr_frames <= max_grant_frames); > +req_nr_frames = min_t(unsigned int, INITIAL_NR_GRANT_FRAMES, > +gt->max_grant_frames); Perhaps give the INITIAL_NR_GRANT_FRAMES value a U suffix instead of using min_t() here? > @@ -1777,13 +1784,15 @@ active_alloc_failed: > > static long > gnttab_setup_table( > -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count) > +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count, > +unsigned int limit_max) > { > struct vcpu *curr = current; > struct gnttab_setup_table op; > struct domain *d = NULL; > struct grant_table *gt; > unsigned int i; > +long ret = 0; Wouldn't int suffice here? > @@ -1819,6 +1819,21 @@ gnttab_setup_table( > gt = d->grant_table; > grant_write_lock(gt); > > +if ( unlikely(op.nr_frames > gt->max_grant_frames) ) > +{ > +gdprintk(XENLOG_INFO, "Domain is limited to %d grant-table > frames.\n", %u and you also want to log the subject domain ID (which may not be current's; same for the other log message below as well as the similar one in gnttab_get_status_frames()). > +gt->max_grant_frames); > +op.status = GNTST_general_error; > +goto unlock; > +} > +if ( unlikely(limit_max < gt->max_grant_frames) ) With the check moved here it can be relaxed afaict: Code below only writes op.nr_frames entries (same then again for gnttab_get_status_frames()). > @@ -1852,10 +1867,10 @@ gnttab_setup_table( > if ( d ) > rcu_unlock_domain(d); > > -if ( unlikely(__copy_field_to_guest(uop, &op, status)) ) > +if ( !ret && unlikely(__copy_field_to_guest(uop, &op, status)) ) I wonder whether it wouldn't be better to switch that check producing -EINVAL to also report the failure in op.status, now that it lives here (same then for gnttab_get_status_frames() once again). > static long > gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) > uop, > - int count) > + int count, unsigned int limit_max) Take the opportunity and switch count to unsigned int? > @@ -3320,7 +3347,7 @@ do_grant_table_op( > > case GNTTABOP_setup_table: > rc = gnttab_setup_table( > -guest_handle_cast(uop, gnttab_setup_table_t), count); > +guest_handle_cast(uop, gnttab_setup_table_t), count, ~0); UINT_MAX? > @@ -3442,6 +3469,8 @@ grant_table_create( > /* Simple stuff. */ > percpu_rwlock_resource_init(&t->lock, grant_rwlock); > spin_lock_init(&t->maptrack_lock); > +t->max_grant_frames = max_grant_frames; > +t->max_maptrack_frames = max_maptrack_frames; This together with ... > @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, unsigned > int grant_frames, > > /* Set limits. */ > if ( !gt->active ) > +{ > +gt->max_grant_frames = grant_frames; > +gt->max_maptrack_frames = maptrack_frames; > ret = grant_table_init(gt); > +} .. this raises the question of whether it is legal to decrease the limits. There may be code depending on it only ever growing. Additionally to take the input values without applying some upper cap - while we have XSA-77, we still shouldn't introduce new issues making disaggregation more unsafe. Perhaps the global limits could serve as a cap here? > --- a/xen/include/asm-arm/grant_table.h > +++ b/xen/include/asm-arm/grant_table.h > @@ -26,8 +26,8 @@ static inline int replace_grant_supported(void) > return 1; > } > > -#define gnttab_init_arch(gt) \ > -( ((gt)->arch.gfn = xzalloc_array(gfn_t, max_grant_frames)) =
Re: [Xen-devel] [PATCH 1/2] public/domctl: drop unnecessary typedefs and handles
On Tue, Sep 12, 2017 at 4:08 PM, Jan Beulich wrote: > By virtue of the struct xen_domctl container structure, most of them > are really just cluttering the name space. > > While doing so, > - convert an enum typed (pt_irq_type_t) structure field to a fixed > width type, > - make x86's paging_domctl() and descendants take a properly typed > handle, > - add const in a few places. > > Signed-off-by: Jan Beulich Acked-by: George Dunlap > > --- a/tools/libxc/include/xenctrl.h > +++ b/tools/libxc/include/xenctrl.h > @@ -903,7 +903,7 @@ int xc_vcpu_get_extstate(xc_interface *x > uint32_t vcpu, > xc_vcpu_extstate_t *extstate); > > -typedef xen_domctl_getvcpuinfo_t xc_vcpuinfo_t; > +typedef struct xen_domctl_getvcpuinfo xc_vcpuinfo_t; > int xc_vcpu_getinfo(xc_interface *xch, > uint32_t domid, > uint32_t vcpu, > @@ -916,7 +916,7 @@ long long xc_domain_get_cpu_usage(xc_int > int xc_domain_sethandle(xc_interface *xch, uint32_t domid, > xen_domain_handle_t handle); > > -typedef xen_domctl_shadow_op_stats_t xc_shadow_op_stats_t; > +typedef struct xen_domctl_shadow_op_stats xc_shadow_op_stats_t; > int xc_shadow_control(xc_interface *xch, >uint32_t domid, >unsigned int sop, > --- a/tools/libxc/xc_domain.c > +++ b/tools/libxc/xc_domain.c > @@ -1714,8 +1714,7 @@ int xc_domain_update_msi_irq( > uint64_t gtable) > { > int rc; > -xen_domctl_bind_pt_irq_t *bind; > - > +struct xen_domctl_bind_pt_irq *bind; > DECLARE_DOMCTL; > > domctl.cmd = XEN_DOMCTL_bind_pt_irq; > @@ -1740,8 +1739,7 @@ int xc_domain_unbind_msi_irq( > uint32_t gflags) > { > int rc; > -xen_domctl_bind_pt_irq_t *bind; > - > +struct xen_domctl_bind_pt_irq *bind; > DECLARE_DOMCTL; > > domctl.cmd = XEN_DOMCTL_unbind_pt_irq; > @@ -1770,7 +1768,7 @@ static int xc_domain_bind_pt_irq_int( > uint16_t spi) > { > int rc; > -xen_domctl_bind_pt_irq_t * bind; > +struct xen_domctl_bind_pt_irq *bind; > DECLARE_DOMCTL; > > domctl.cmd = XEN_DOMCTL_bind_pt_irq; > @@ -1828,7 +1826,7 @@ static int xc_domain_unbind_pt_irq_int( > uint8_t spi) > { > int rc; > -xen_domctl_bind_pt_irq_t * bind; > +struct xen_domctl_bind_pt_irq *bind; > DECLARE_DOMCTL; > > domctl.cmd = XEN_DOMCTL_unbind_pt_irq; > --- a/xen/arch/arm/domctl.c > +++ b/xen/arch/arm/domctl.c > @@ -41,7 +41,7 @@ long arch_do_domctl(struct xen_domctl *d > case XEN_DOMCTL_bind_pt_irq: > { > int rc; > -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq; > +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq; > uint32_t irq = bind->u.spi.spi; > uint32_t virq = bind->machine_irq; > > @@ -87,7 +87,7 @@ long arch_do_domctl(struct xen_domctl *d > case XEN_DOMCTL_unbind_pt_irq: > { > int rc; > -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq; > +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq; > uint32_t irq = bind->u.spi.spi; > uint32_t virq = bind->machine_irq; > > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -48,7 +48,7 @@ static int gdbsx_guest_mem_io(domid_t do > } > > static int update_domain_cpuid_info(struct domain *d, > -const xen_domctl_cpuid_t *ctl) > +const struct xen_domctl_cpuid *ctl) > { > struct cpuid_policy *p = d->arch.cpuid; > const struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx > }; > @@ -363,8 +363,7 @@ long arch_do_domctl( > { > > case XEN_DOMCTL_shadow_op: > -ret = paging_domctl(d, &domctl->u.shadow_op, > -guest_handle_cast(u_domctl, void), 0); > +ret = paging_domctl(d, &domctl->u.shadow_op, u_domctl, 0); > if ( ret == -ERESTART ) > return hypercall_create_continuation(__HYPERVISOR_arch_1, > "h", u_domctl); > @@ -707,7 +706,7 @@ long arch_do_domctl( > > case XEN_DOMCTL_bind_pt_irq: > { > -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq; > +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq; > int irq; > > ret = -EINVAL; > @@ -738,7 +737,7 @@ long arch_do_domctl( > > case XEN_DOMCTL_unbind_pt_irq: > { > -xen_domctl_bind_pt_irq_t *bind = &domctl->u.bind_pt_irq; > +struct xen_domctl_bind_pt_irq *bind = &domctl->u.bind_pt_irq; > int irq = domain_pirq_to_irq(d, bind->machine_irq); > > ret = -EPERM; > --- a/xen/arch/x86/hvm/vioapic.c > +++ b/xen/arch/x86/hvm/vioapic.c > @@ -162,7 +162,7 @@ static int vioapic_hwdom_map_gsi(unsigne > unsigned int pol) > { > struct domain *currd = current->domain; > -xen
Re: [Xen-devel] [PATCH 2/2] public/sysctl: drop unnecessary typedefs and handles
On Tue, Sep 12, 2017 at 4:10 PM, Jan Beulich wrote: > By virtue of the struct xen_sysctl container structure, most of them > are really just cluttering the name space. > > Signed-off-by: Jan Beulich Acked-by: George Dunlap > > --- a/tools/libxc/include/xenctrl.h > +++ b/tools/libxc/include/xenctrl.h > @@ -1212,11 +1212,11 @@ int xc_readconsolering(xc_interface *xch > int xc_send_debug_keys(xc_interface *xch, char *keys); > int xc_set_parameters(xc_interface *xch, char *params); > > -typedef xen_sysctl_physinfo_t xc_physinfo_t; > -typedef xen_sysctl_cputopo_t xc_cputopo_t; > -typedef xen_sysctl_numainfo_t xc_numainfo_t; > -typedef xen_sysctl_meminfo_t xc_meminfo_t; > -typedef xen_sysctl_pcitopoinfo_t xc_pcitopoinfo_t; > +typedef struct xen_sysctl_physinfo xc_physinfo_t; > +typedef struct xen_sysctl_cputopo xc_cputopo_t; > +typedef struct xen_sysctl_numainfo xc_numainfo_t; > +typedef struct xen_sysctl_meminfo xc_meminfo_t; > +typedef struct xen_sysctl_pcitopoinfo xc_pcitopoinfo_t; > > typedef uint32_t xc_cpu_to_node_t; > typedef uint32_t xc_cpu_to_socket_t; > @@ -1240,7 +1240,7 @@ int xc_machphys_mfn_list(xc_interface *x > unsigned long max_extents, > xen_pfn_t *extent_start); > > -typedef xen_sysctl_cpuinfo_t xc_cpuinfo_t; > +typedef struct xen_sysctl_cpuinfo xc_cpuinfo_t; > int xc_getcpuinfo(xc_interface *xch, int max_cpus, >xc_cpuinfo_t *info, int *nr_cpus); > > @@ -1853,8 +1853,8 @@ int xc_cpu_offline(xc_interface *xch, in > * cpufreq para name of this structure named > * same as sysfs file name of native linux > */ > -typedef xen_userspace_t xc_userspace_t; > -typedef xen_ondemand_t xc_ondemand_t; > +typedef struct xen_userspace xc_userspace_t; > +typedef struct xen_ondemand xc_ondemand_t; > > struct xc_get_cpufreq_para { > /* IN/OUT variable */ > --- a/tools/libxc/xc_misc.c > +++ b/tools/libxc/xc_misc.c > @@ -547,7 +547,7 @@ int xc_livepatch_upload(xc_interface *xc > DECLARE_SYSCTL; > DECLARE_HYPERCALL_BUFFER(char, local); > DECLARE_HYPERCALL_BOUNCE(name, 0 /* later */, > XC_HYPERCALL_BUFFER_BOUNCE_IN); > -xen_livepatch_name_t def_name = { .pad = { 0, 0, 0 } }; > +struct xen_livepatch_name def_name = { }; > > if ( !name || !payload ) > { > @@ -594,12 +594,12 @@ int xc_livepatch_upload(xc_interface *xc > > int xc_livepatch_get(xc_interface *xch, > char *name, > - xen_livepatch_status_t *status) > + struct xen_livepatch_status *status) > { > int rc; > DECLARE_SYSCTL; > DECLARE_HYPERCALL_BOUNCE(name, 0 /*adjust later */, > XC_HYPERCALL_BUFFER_BOUNCE_IN); > -xen_livepatch_name_t def_name = { .pad = { 0, 0, 0 } }; > +struct xen_livepatch_name def_name = { }; > > if ( !name ) > { > @@ -677,7 +677,7 @@ int xc_livepatch_get(xc_interface *xch, > * retrieved (if any). > */ > int xc_livepatch_list(xc_interface *xch, unsigned int max, unsigned int > start, > - xen_livepatch_status_t *info, > + struct xen_livepatch_status *info, >char *name, uint32_t *len, >unsigned int *done, >unsigned int *left) > @@ -837,7 +837,7 @@ static int _xc_livepatch_action(xc_inter > DECLARE_SYSCTL; > /* The size is figured out when we strlen(name) */ > DECLARE_HYPERCALL_BOUNCE(name, 0, XC_HYPERCALL_BUFFER_BOUNCE_IN); > -xen_livepatch_name_t def_name = { .pad = { 0, 0, 0 } }; > +struct xen_livepatch_name def_name = { }; > > def_name.size = strlen(name) + 1; > > --- a/xen/arch/arm/sysctl.c > +++ b/xen/arch/arm/sysctl.c > @@ -12,7 +12,7 @@ > #include > #include > > -void arch_do_physinfo(xen_sysctl_physinfo_t *pi) { } > +void arch_do_physinfo(struct xen_sysctl_physinfo *pi) { } > > long arch_do_sysctl(struct xen_sysctl *sysctl, > XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) > --- a/xen/arch/x86/sysctl.c > +++ b/xen/arch/x86/sysctl.c > @@ -72,7 +72,7 @@ long cpu_down_helper(void *data) > return ret; > } > > -void arch_do_physinfo(xen_sysctl_physinfo_t *pi) > +void arch_do_physinfo(struct xen_sysctl_physinfo *pi) > { > memcpy(pi->hw_cap, boot_cpu_data.x86_capability, > min(sizeof(pi->hw_cap), sizeof(boot_cpu_data.x86_capability))); > --- a/xen/common/gcov/gcov.c > +++ b/xen/common/gcov/gcov.c > @@ -209,7 +209,7 @@ static int gcov_dump_all(XEN_GUEST_HANDL > return ret; > } > > -int sysctl_gcov_op(xen_sysctl_gcov_op_t *op) > +int sysctl_gcov_op(struct xen_sysctl_gcov_op *op) > { > int ret; > > --- a/xen/common/livepatch.c > +++ b/xen/common/livepatch.c > @@ -104,7 +104,7 @@ static struct livepatch_work livepatch_w > */ > static DEFINE_PER_CPU(bool_t, work_to_do); > > -static int get_name(const xen_livepatch_name_t *name, char *n) > +static int get_name(const struct xen_livepatch_name *name, char *n) > { >
[Xen-devel] [xen-unstable-coverity test] 113625: all pass - PUSHED
flight 113625 xen-unstable-coverity real [real] http://logs.test-lab.xenproject.org/osstest/logs/113625/ Perfect :-) All tests in this flight passed as required version targeted for testing: xen 64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 baseline version: xen abd91b2a2bcd05618a71f7e5fe571dd10a5727bc Last test of basis 113540 2017-09-17 09:39:30 Z3 days Testing same since 113625 2017-09-20 09:22:12 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Boris Ostrovsky George Dunlap Ian Jackson Jan Beulich Juergen Gross Julien Grall Kevin Tian Konrad Rzeszutek Wilk Stefano Stabellini Wei Liu jobs: coverity-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-coverity + revision=64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:. PERLLIB=.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-coverity 64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 + branch=xen-unstable-coverity + revision=64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:.:. PERLLIB=.:.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig +++ export PERLLIB=.:.:.:. +++ PERLLIB=.:.:.:. ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-coverity + qemuubranch=qemu-upstream-unstable-coverity + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-coverity + prevxenbranch=xen-4.9-testing + '[' x64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git ++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/xen/git/linux-pvops.git ++ : git://xenbits.xen.org/linux-pvops.git ++ : tested/linux-4.9 ++ : tested/linux-arm-xen ++ '[' xgit://xenbits.xen.org/linux-pvops.git = x ']' ++ '[' x = x ']' ++ : git://xenb
Re: [Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy
On Tue, Sep 19, 2017 at 07:06:26PM +0100, Jennifer Herbert wrote: > This Patch allows a migration precopy policy to be specified. > > The precopy phase of the xc_domain_save() live migration algorithm has > historically been implemented to run until either a) (almost) no pages > are dirty or b) some fixed, hard-coded maximum number of precopy > iterations has been exceeded. This policy and its implementation are > less than ideal for a few reasons: > - the logic of the policy is intertwined with the control flow of the > mechanism of the precopy stage > - it can't take into account facts external to the immediate > migration context, such external state transfer state, interactive > user input, or the passage of wall-clock time. > - it does not permit the user to change their mind, over time, about > what to do at the end of the precopy (they get an unconditional > transition into the stop-and-copy phase of the migration) > > To permit callers to implement arbitrary higher-level policies governing > when the live migration precopy phase should end, and what should be > done next: > - add a precopy_policy() callback to the xc_domain_save() user-supplied > callbacks > - during the precopy phase of live migrations, consult this policy after > each batch of pages transmitted and take the dictated action, which > may be to a) abort the migration entirely, b) continue with the > precopy, or c) proceed to the stop-and-copy phase. > - provide an implementation of the old policy, used when > precopy_policy callback is not provided. > > Signed-off-by: Jennifer Herbert > Signed-off-by: Joshua Otto > > --- > > v2: > > Have made a few formatting corrections, added typedef as suggested. > > v1: > > This is updated/modified subset of patch 7/20, part of > Joshua Otto's "Add postcopy live migration support." patch, > dated 27th March 2017. As indicated on the original thread, > I wish to make use of this this within the XenServer product. > I hope this will aid Josh in pushing the remainder of his series. > > --- > tools/libxc/include/xenguest.h | 31 -- > tools/libxc/xc_sr_common.h | 6 +-- > tools/libxc/xc_sr_save.c | 97 > +- > 3 files changed, 97 insertions(+), 37 deletions(-) > > diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h > index 6626f0c..a2a654c 100644 > --- a/tools/libxc/include/xenguest.h > +++ b/tools/libxc/include/xenguest.h > @@ -39,6 +39,16 @@ > */ > struct xenevtchn_handle; > > +/* For save's precopy_policy(). */ > +struct precopy_stats > +{ > +unsigned int iteration; > +unsigned int total_written; > +long dirty_count; /* -1 if unknown */ > +}; > + > +typedef int (*precopy_policy_t)(struct precopy_stats, void *); Shouldn't precopy_stats be a pointer (const pointer probably seeing it's usage)? > + > /* callbacks provided by xc_domain_save */ > struct save_callbacks { > /* Called after expiration of checkpoint interval, > @@ -46,7 +56,20 @@ struct save_callbacks { > */ > int (*suspend)(void* data); > > -/* Called after the guest's dirty pages have been > +/* > + * Called after every batch of page data sent during the precopy > + * phase of a live migration to ask the caller what to do next > + * based on the current state of the precopy migration. I would add: "Should return one of the values listed below:" > + */ > +#define XGS_POLICY_ABORT (-1) /* Abandon the migration entirely > +* and tidy up. */ > +#define XGS_POLICY_CONTINUE_PRECOPY 0 /* Remain in the precopy phase. */ > +#define XGS_POLICY_STOP_AND_COPY1 /* Immediately suspend and transmit > the > +* remaining dirty pages. */ > +precopy_policy_t precopy_policy; > + > +/* > + * Called after the guest's dirty pages have been > * copied into an output buffer. > * Callback function resumes the guest & the device model, > * returns to xc_domain_save. > @@ -55,7 +78,8 @@ struct save_callbacks { > */ > int (*postcopy)(void* data); > > -/* Called after the memory checkpoint has been flushed > +/* > + * Called after the memory checkpoint has been flushed > * out into the network. Typical actions performed in this > * callback include: > * (a) send the saved device model state (for HVM guests), > @@ -65,7 +89,8 @@ struct save_callbacks { > * > * returns: > * 0: terminate checkpointing gracefully > - * 1: take another checkpoint */ > + * 1: take another checkpoint ^ trailing space > + */ > int (*checkpoint)(void* data); > > /* > diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h > index a83f22a..3635704 100644 > --- a/tools/libxc/xc_sr_common.h > +++ b/tools/libxc/xc_sr_common.h > @@ -198,12 +198,10 @@ s
Re: [Xen-devel] [PATCH 0/2] public/*ctl: drop unnecessary typedefs and handles
Hi, On 12/09/17 15:25, Jan Beulich wrote: 1: public/domctl: drop unnecessary typedefs and handles 2: public/sysctl: drop unnecessary typedefs and handles Signed-off-by: Jan Beulich For the ARM changes: Acked-by: Julien Grall Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain
On 20/09/17 12:34, Jan Beulich wrote: On 20.09.17 at 08:34, wrote: >> --- a/xen/common/compat/grant_table.c >> +++ b/xen/common/compat/grant_table.c >> @@ -157,21 +157,14 @@ int compat_grant_table_op(unsigned int cmd, >> unsigned int max_frame_list_size_in_page = >> (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.setup)) / >> sizeof(*nat.setup->frame_list.p); >> -if ( max_frame_list_size_in_page < max_grant_frames ) > > The latest here, but perhaps even earlier I think max_grant_frames > should become static, so one can be reasonably certain that all other > references are gone. Patch 14 removes the last references, so I can't do it earlier. >> @@ -290,8 +293,8 @@ num_act_frames_from_sha_frames(const unsigned int num) >> return DIV_ROUND_UP(num * sha_per_page, ACGNT_PER_PAGE); >> } >> >> -#define max_nr_active_grant_frames \ >> -num_act_frames_from_sha_frames(max_grant_frames) >> +#define max_nr_active_grant_frames(gt) \ >> +num_act_frames_from_sha_frames(gt->max_grant_frames) > > Parentheses around gt please. Okay. >> @@ -1718,8 +1724,9 @@ gnttab_grow_table(struct domain *d, unsigned int >> req_nr_frames) >> ASSERT(gt->active); >> >> if ( req_nr_frames < INITIAL_NR_GRANT_FRAMES ) >> -req_nr_frames = INITIAL_NR_GRANT_FRAMES; >> -ASSERT(req_nr_frames <= max_grant_frames); >> +req_nr_frames = min_t(unsigned int, INITIAL_NR_GRANT_FRAMES, >> +gt->max_grant_frames); > > Perhaps give the INITIAL_NR_GRANT_FRAMES value a U suffix > instead of using min_t() here? Okay. >> @@ -1777,13 +1784,15 @@ active_alloc_failed: >> >> static long >> gnttab_setup_table( >> -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count) >> +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count, >> +unsigned int limit_max) >> { >> struct vcpu *curr = current; >> struct gnttab_setup_table op; >> struct domain *d = NULL; >> struct grant_table *gt; >> unsigned int i; >> +long ret = 0; > > Wouldn't int suffice here? I just followed the return type of the function. I think this way is cleaner, but in case you like int better I can change it. >> @@ -1819,6 +1819,21 @@ gnttab_setup_table( >> gt = d->grant_table; >> grant_write_lock(gt); >> >> +if ( unlikely(op.nr_frames > gt->max_grant_frames) ) >> +{ >> +gdprintk(XENLOG_INFO, "Domain is limited to %d grant-table >> frames.\n", > > %u and you also want to log the subject domain ID (which may not > be current's; same for the other log message below as well as the > similar one in gnttab_get_status_frames()). Okay. >> +gt->max_grant_frames); >> +op.status = GNTST_general_error; >> +goto unlock; >> +} >> +if ( unlikely(limit_max < gt->max_grant_frames) ) > > With the check moved here it can be relaxed afaict: Code below > only writes op.nr_frames entries (same then again for > gnttab_get_status_frames()). Okay. >> @@ -1852,10 +1867,10 @@ gnttab_setup_table( >> if ( d ) >> rcu_unlock_domain(d); >> >> -if ( unlikely(__copy_field_to_guest(uop, &op, status)) ) >> +if ( !ret && unlikely(__copy_field_to_guest(uop, &op, status)) ) > > I wonder whether it wouldn't be better to switch that check > producing -EINVAL to also report the failure in op.status, now > that it lives here (same then for gnttab_get_status_frames() > once again). Okay. >> static long >> gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) >> uop, >> - int count) >> + int count, unsigned int limit_max) > > Take the opportunity and switch count to unsigned int? Okay. >> @@ -3320,7 +3347,7 @@ do_grant_table_op( >> >> case GNTTABOP_setup_table: >> rc = gnttab_setup_table( >> -guest_handle_cast(uop, gnttab_setup_table_t), count); >> +guest_handle_cast(uop, gnttab_setup_table_t), count, ~0); > > UINT_MAX? Yes. >> @@ -3442,6 +3469,8 @@ grant_table_create( >> /* Simple stuff. */ >> percpu_rwlock_resource_init(&t->lock, grant_rwlock); >> spin_lock_init(&t->maptrack_lock); >> +t->max_grant_frames = max_grant_frames; >> +t->max_maptrack_frames = max_maptrack_frames; > > This together with ... > >> @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, unsigned >> int grant_frames, >> >> /* Set limits. */ >> if ( !gt->active ) >> +{ >> +gt->max_grant_frames = grant_frames; >> +gt->max_maptrack_frames = maptrack_frames; >> ret = grant_table_init(gt); >> +} > > .. this raises the question of whether it is legal to decrease the > limits. There may be code depending on it only ever growing. Before grant_table_init() has been called there is no problem decreasing the limits, as nothing depending on them
Re: [Xen-devel] minimum Python version
Jan Beulich writes ("minimum Python version"): > for quite a while (apparently as of Marek's series in February) I've > been seeing > > xen/lowlevel/xc/xc.c: In function ‘pyxc_dom_extract_cpuid’: > xen/lowlevel/xc/xc.c:692:13: error: implicit declaration of function > ‘PyBytes_AS_STRING’ [-Werror=implicit-function-declaration] "PyBytes_AS_STRING" is a compatibility alias to help porting to Python 3. They are provided in Python 2.6.0 and later. https://docs.python.org/release/2.6/c-api/string.html Python 2.6.0 was released on 1st October 2008. In a week and a half's time that will be 10 years ago. Under the circumstances we should clearly keep the Python-3-compatible naming. > on an older system having Python 2.4 on it. The minimum version > configure checks for appears to be 2.3, matching up with what > ./README says. Since the Python bindings aren't something one > absolutely needs (afaict), wouldn't it be reasonable to check for > a suitable version and if that fails simply disable their building > (which currently is unconditional)? What I'd like to avoid is ending > up with being unable to build the tools on that system - I know > it's rather old, but there's a reason I'd like to keep it (including its > old distro level). I would welcome a patch to change the minimum python version to 2.6. I would also welcome a patch to make a too-old python version disable the python build rather than causing the whole build to break. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] minimum Python version
Jan Beulich writes ("Re: [Xen-devel] minimum Python version"): > On 20.09.17 at 10:49, wrote: > > What about the following idea: maybe we could add an option to configure > > to make it fail instead of disabling components in case some > > requirements for that component are not met? The default could be to > > just disable the components and with the option specified all components > > not disabled explicitly must be buildable or configure will fail. > > If that's something that (a) can be reasonably expressed in > configure.ac and (b) the tools maintainers can live with, that's > certainly an option. I'd could live with that. Personally I think the whole purpose of configure is to allow automatic building of what's possible, leaving out anything else. So I think the default for this new option should be to suppress parts of the build whose dependencies are not satisfied, rather than to bomb out. (I wrote "nonessential parts" but I thinkthere are no essential parts. The hypervisor itself is already "nonessential" by this defininion - i386 builds skip it. And the tools are nonessential since you might be building on amd64 just for the hypervisor...) And, yes, configure can express that. It is an m4-generated shell script. My reservation is that this proposed new option would make every dependency for every feature more complicated. So to avoid that it would be necessary to invent new macros to generate the formulaic enable/disable code. I doubt anyone has the effort for that. Jan, I suggest you submit at the very least a patch to change the minimum to 2.6. Personally I would accept one to have it disable python instead of bombing out, if you would like to write it. That is the way we deal with many other nonessential components already. I think that if we want something more sophisticated, involving optionally bombing out on dependency trouble, it's up to people who want that to do the work. (As an aside, I don't think this is a distro/upstream division. With my Debian maintainer had on I actually prefer the automatic fallback. Our build-dependency system avoids accidental creation of broken packages in normal situations, and having graceful degradation makes it easier to do unusual things eg in derivatives, when cross-building, or whatever.) Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/3] RFC: migration: defer precopy policy to libxl
On Tue, Sep 19, 2017 at 07:06:27PM +0100, Jennifer Herbert wrote: > Provide an implementation of the old policy as a callback in > libxl and plumb it through the IPC machinery to libxc. > > This serves as an example for defining a libxl policy, > and provides no advantage over the default policy in libxc. > > Signed-off-by: Joshua Otto LGTM: Reviewed-by: Roger Pau Monné Have we ever thought of providing libxl consumers the ability to set these hooks? Would that be interesting? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] xen: grant-table: Simplify get_paged_frame
Hi Roger, On 19/09/17 12:40, Roger Pau Monné wrote: On Tue, Sep 19, 2017 at 10:44:53AM +0100, Julien Grall wrote: Hi, On 19/09/17 08:13, Roger Pau Monné wrote: On Mon, Sep 18, 2017 at 06:32:22PM +0100, Julien Grall wrote: Hi Roger, On 18/09/17 17:58, Roger Pau Monné wrote: On Mon, Sep 18, 2017 at 05:27:52PM +0100, Julien Grall wrote: The implementation of get_paged_frame is currently different whether the architecture support sharing memory or paging memory. Both version are extremely similar so it is possible to consolidate in a single implementation. The main difference is the x86 version will allow grant on foreign page when using HVM/PVH whilst Arm does not. At the moment, on x86 foreign pages are only allowed for PVH Dom0. It seems that foreign pages should never be granted so deny them The check for shared/paged memory are now gated with the respective ifdef. Potentially, dummy p2m_is_shared/p2m_is_paging could be implemented for Arm. Signed-off-by: Julien Grall --- Cc: Andrew Cooper Cc: George Dunlap Cc: Ian Jackson Cc: Jan Beulich Cc: Konrad Rzeszutek Wilk Cc: Stefano Stabellini Cc: Tim Deegan Cc: Wei Liu Changes in v2: - Deny grant on foreign page (aligned with the ARM code) - Use #ifdef rather than #if defined - Update commit message - Fix typo in the title get_page_from_gfn will be able to get reference on foreign page and as per my understanding will allow to grant page on foreign memory. This was not allowed with a simple get_page(...) on the ARM implementation (no sharing nor paging supprot) but is allowed on the x86 implementation due to get_page_from_gfn. On x86, foreign pages are currently only allowed for PVH dom0, so I think it is not a big deal for now. On Arm, foreign pages can be present on any domain. So this patch would permit grant on foreing pages. This patch will deny granting foreign pages. Jan Beulich is happy with it. Any other opinions? Won't this break QEMU running in stub domains? I haven't tested it, but I'm afraid QEMU running in a stub domain might try to grant a foreign frame. Ie: the emulated network code in QEMU might try to grant a foreign frame in order to forward operations from emulated devices to PV frontends. I don't think it will break any existing setup because foreign mapping are only allowed for the hardware domain (see p2m_add_foreign). IIRC this only applies to auto-translated (HVM) domains, not to PV domains. QEMU stubdomains are PV guests. Still, p2m_map_foreign can only be used for auto-translated domains. Oh, so you are limiting this only for auto-translated guests? Should have looked more closely at the commit message. That's right. I am happy to update the commit message if it is not clear enough. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/2] public/*ctl: drop unnecessary typedefs and handles
>>> On 20.09.17 at 13:07, wrote: > On 12/09/17 15:25, Jan Beulich wrote: >> 1: public/domctl: drop unnecessary typedefs and handles >> 2: public/sysctl: drop unnecessary typedefs and handles >> >> Signed-off-by: Jan Beulich > > For the ARM changes: > > Acked-by: Julien Grall Thanks, but I have this on record already. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation
On Wed, 2017-09-20 at 11:04 +0100, George Dunlap wrote: > On 09/20/2017 10:19 AM, George Dunlap wrote: > > > diff --git a/xen/common/sched_credit2.c > > > b/xen/common/sched_credit2.c > > > index 32234ac..7a550db 100644 > > > --- a/xen/common/sched_credit2.c > > > +++ b/xen/common/sched_credit2.c > > > @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct > > > scheduler *ops, void *data) > > > > > > write_lock_irqsave(&prv->lock, flags); > > > > > > -kill_timer(sdom->repl_timer); > > > -xfree(sdom->repl_timer); > > > - > > > list_del_init(&sdom->sdom_elem); > > > > > > write_unlock_irqrestore(&prv->lock, flags); > > > > > > +kill_timer(sdom->repl_timer); > > > +xfree(sdom->repl_timer); > > > > Any particular reason for moving the kill_timer() as well as the > > xfree > > outside the lock? What happens if the timer goes off after the > > irqrestore but before the kill_timer? > > Looks like if the timer fires, nothing terribly bad will happen; it > will > just do a useless budget replenishment. > It's just that it has not reason to be there, as nothing that it does is serialized by prv->lock, so it only makes the critical section (with IRQ disabled) longer, for no reason, which is bad, as this being a write_lock(), it'll stop readers too. I think it was (my) mistake to put it there in the first place. > It looks like kill_timer() disables irqs, so it could be moved inside > the critical section. I'm inclined to say we should do so. I don't > anticipate the budget replenishment ever to need to walk the domain > list, but should that change, this would be a bear of a bug to find. > Indeed it's rather unlikely for the replenishment handler to have to use sdom_elem to go through the list of domains. IAC, if you're concerned about that, I'd much rather put both kill_timer() and xfree() before the critical section, rather than after, like in the attached patch. Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)--- Begin Message --- [PATCH v2] xen: credit2: fix spinlock irq-safety violation In commit ad4b3e1e9df34 ("xen: credit2: implement utilization cap") xfree() was being called (for deallocating the budget replenishment timer, during domain destruction) inside an IRQ disabled critical section. That must not happen, as it uses the mem-pool's lock, which needs to be taken with IRQ enabled. And, in fact, we crash (in debug builds): (XEN) (XEN) Panic on CPU 0: (XEN) Xen BUG at spinlock.c:47 (XEN) Let's, therefore, kill and deallocate the timer outside of the critical sections, when IRQs are enabled. Signed-off-by: Dario Faggioli --- Cc: osstest service owner Cc: George Dunlap Cc: Wei Liu --- This was spotted by OSSTest's flight 113562: http://logs.test-lab.xenproject.org/osstest/logs/113562/ http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log --- Changes from v1: - kill_timer() and xfree() moved above critical section, instead than below. diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index 32234ac..32b0363 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -2921,11 +2921,11 @@ csched2_free_domdata(const struct scheduler *ops, void *data) struct csched2_dom *sdom = data; struct csched2_private *prv = csched2_priv(ops); -write_lock_irqsave(&prv->lock, flags); - kill_timer(sdom->repl_timer); xfree(sdom->repl_timer); +write_lock_irqsave(&prv->lock, flags); + list_del_init(&sdom->sdom_elem); write_unlock_irqrestore(&prv->lock, flags); --- End Message --- signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain
>>> On 20.09.17 at 13:10, wrote: > On 20/09/17 12:34, Jan Beulich wrote: > On 20.09.17 at 08:34, wrote: >>> --- a/xen/common/compat/grant_table.c >>> +++ b/xen/common/compat/grant_table.c >>> @@ -157,21 +157,14 @@ int compat_grant_table_op(unsigned int cmd, >>> unsigned int max_frame_list_size_in_page = >>> (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.setup)) / >>> sizeof(*nat.setup->frame_list.p); >>> -if ( max_frame_list_size_in_page < max_grant_frames ) >> >> The latest here, but perhaps even earlier I think max_grant_frames >> should become static, so one can be reasonably certain that all other >> references are gone. > > Patch 14 removes the last references, so I can't do it earlier. That's unfortunate, but okay. >>> @@ -1777,13 +1784,15 @@ active_alloc_failed: >>> >>> static long >>> gnttab_setup_table( >>> -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count) >>> +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count, >>> +unsigned int limit_max) >>> { >>> struct vcpu *curr = current; >>> struct gnttab_setup_table op; >>> struct domain *d = NULL; >>> struct grant_table *gt; >>> unsigned int i; >>> +long ret = 0; >> >> Wouldn't int suffice here? > > I just followed the return type of the function. I think this way is > cleaner, but in case you like int better I can change it. I sort of expected this reply, but that's not in line with what you did in gnttab_get_status_frames() then. I think we will want to eventually change all function return types to int where the wider type isn't needed. >>> @@ -3442,6 +3469,8 @@ grant_table_create( >>> /* Simple stuff. */ >>> percpu_rwlock_resource_init(&t->lock, grant_rwlock); >>> spin_lock_init(&t->maptrack_lock); >>> +t->max_grant_frames = max_grant_frames; >>> +t->max_maptrack_frames = max_maptrack_frames; >> >> This together with ... >> >>> @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, >>> unsigned int grant_frames, >>> >>> /* Set limits. */ >>> if ( !gt->active ) >>> +{ >>> +gt->max_grant_frames = grant_frames; >>> +gt->max_maptrack_frames = maptrack_frames; >>> ret = grant_table_init(gt); >>> +} >> >> .. this raises the question of whether it is legal to decrease the >> limits. There may be code depending on it only ever growing. > > Before grant_table_init() has been called there is no problem > decreasing the limits, as nothing depending on them has been setup > yet. Oh, right, I didn't pay attention to this being a one-time action. >> Additionally to take the input values without applying some >> upper cap - while we have XSA-77, we still shouldn't introduce >> new issues making disaggregation more unsafe. Perhaps the >> global limits could serve as a cap here? > > I thought about a cap and TBH I'm not sure which would be sane to > apply. The global limits seem wrong, especially looking at patch 14: > those limits will be for dom0 only then. And dom0 won't need many > grant frames in the normal case... I've been thinking about this Dom0 aspect too over lunch. What about allowing the hardware domain to set its limit (only upwards of course) in setup_table(), without any upper bound enforced? This would free up the globals to be used as system wide limits again. > So I could make up a cap in form of either a configurable constant > (CONFIG_* or boot parameter?) or as a fraction of domain memory. Any > preferences here? A config constant as well as a fraction of domain memory might lock out special purpose guests. Which would leave command line options - as per above perhaps the ones we already have. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [libvirt test] 113616: tolerable all pass - PUSHED
flight 113616 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/113616/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 113592 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 113592 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 113592 test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass version targeted for testing: libvirt dbd98380b9180afab0a9149a2ba4a58bf06d0e02 baseline version: libvirt 3faf3ca60af342ac74a17d9d6decc70a947d Last test of basis 113592 2017-09-19 04:20:58 Z1 days Testing same since 113616 2017-09-20 04:20:57 Z0 days1 attempts People who touched revisions under test: Andrea Bolognani Ashish Mittal Erik Skultety John Ferlan Ján Tomko Laine Stump Michal Privoznik Nikolay Shirokovskiy Peter Krempa jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-libvirt-xsm pass test-armhf-armhf-libvirt-xsm pass test-amd64-i386-libvirt-xsm pass test-amd64-amd64-libvirt pass test-armhf-armhf-libvirt pass test-amd64-i386-libvirt pass test-amd64-amd64-libvirt-pairpass test-amd64-i386-libvirt-pair pass test-amd64-i386-libvirt-qcow2pass test-armhf-armhf-libvirt-raw pass test-amd64-amd64-libvirt-vhd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=libvirt + revision=dbd98380b9180afab0a9149a2ba4a58bf06d0e02 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:. PERLLIB=.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push libvirt
Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation
>>> On 20.09.17 at 13:44, wrote: > IAC, if you're concerned about that, I'd much rather put both > kill_timer() and xfree() before the critical section, rather than > after, like in the attached patch. Hmm, killing the timer upfront is certainly fine, but is freeing the data before removing the element from the list safe not only currently, but also going forward? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 14/15] xen: make grant table limits boot parameters dom0 only
>>> On 20.09.17 at 08:34, wrote: > --- a/xen/arch/arm/domain_build.c > +++ b/xen/arch/arm/domain_build.c > @@ -2098,7 +2098,7 @@ static void __init find_gnttab_region(struct domain *d, > kinfo->gnttab_size = (_etext - _stext) & PAGE_MASK; > > /* Make sure the grant table will fit in the region */ > -if ( (kinfo->gnttab_size >> PAGE_SHIFT) < max_grant_frames ) > +if ( grant_table_verify_size(d, kinfo->gnttab_size >> PAGE_SHIFT) ) > panic("Cannot find a space for the grant table region\n"); I can see how this may cause complications with my alternative proposal for the meaning of the command line options, but that's solvable for sure. > @@ -3462,6 +3449,10 @@ grant_table_create( > struct domain *d) > { > struct grant_table *t; > +static unsigned int max_grant_frames; > +static unsigned int max_maptrack_frames; > +integer_param("gnttab_max_frames", max_grant_frames); > +integer_param("gnttab_max_maptrack_frames", max_maptrack_frames); > > if ( (t = xzalloc(struct grant_table)) == NULL ) > return -ENOMEM; > @@ -3469,14 +3460,17 @@ grant_table_create( > /* Simple stuff. */ > percpu_rwlock_resource_init(&t->lock, grant_rwlock); > spin_lock_init(&t->maptrack_lock); > -t->max_grant_frames = max_grant_frames; > -t->max_maptrack_frames = max_maptrack_frames; > > /* Okay, install the structure. */ > d->grant_table = t; > > if ( d->domain_id == 0 ) > +{ > +t->max_grant_frames = max_grant_frames ? : > DEFAULT_MAX_NR_GRANT_FRAMES; > +t->max_maptrack_frames = > + max_maptrack_frames ? : > DEFAULT_MAX_MAPTRACK_FRAMES; > return grant_table_init(t); > +} The use here makes it that in effect the two variables could then become __initdata. Maybe their definition should then even move here (if the other proposal would be discarded). > +bool __init grant_table_verify_size(struct domain *d, unsigned int frames) const Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 15/15] xen: add new Xen cpuid node for max address width info
>>> On 20.09.17 at 08:34, wrote: > On very large hosts a guest needs to know whether it will have to ... a PV guest ... > handle frame numbers larger than 32 bits in order to select the > appropriate grant interface version. > > Add a new Xen specific CPUID node to contain the maximum guest address > width "guest address width" is ambiguous here, the more when looking at what you actually return. We should no longer allow ourselves to mix up the different address spaces. The limit you want to report here is that in MFN space, which ought to be of no relevance to HVM guests. Therefore I'm against uniformly exposing this (as much as almost no other host property should have any relevance for HVM guests), and would instead like to see a PV-only leaf just like we already have a HVM only one. > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -929,6 +929,10 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, > uint32_t leaf, > res->b = v->vcpu_id; > break; > > +case 5: /* Host specific parameters */ > +res->a = generic_flsl(get_upper_mfn_bound() - 1) + PAGE_SHIFT; > +break; Already when looking at the patch introducing the function I was wondering whether the function wouldn't better return the highest frame number instead of the first invalid one. From an abstract perspective this would allow an arch (e.g. ARM32) to report that all addresses are valid. > --- a/xen/include/public/arch-x86/cpuid.h > +++ b/xen/include/public/arch-x86/cpuid.h > @@ -85,6 +85,15 @@ > #define XEN_HVM_CPUID_IOMMU_MAPPINGS (1u << 2) > #define XEN_HVM_CPUID_VCPU_ID_PRESENT (1u << 3) /* vcpu id is present in > EBX > */ > > -#define XEN_CPUID_MAX_NUM_LEAVES 4 > +/* > + * Leaf 6 (0x4x05) > + * Host specific parameters > + * EAX: bits 0-7: max guest address width > + */ > + > +/* Max. address width in bits taking memory hotplug into account. */ > +#define XEN_CPUID_GUEST_ADDRESS_WIDTH_MASK (255u << 0) Please use hex numbers for multi-digit masks, and please make explicit (at least in the comment, perhaps also in the constant's name) which "address" is being talked about here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [linux-linus test] 113615: regressions - FAIL
flight 113615 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/113615/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-vhd 6 xen-install fail REGR. vs. 113605 Regressions which are regarded as allowable (not blocking): test-amd64-i386-rumprun-i386 17 rumprun-demo-xenstorels/xenstorels.repeat fail REGR. vs. 113605 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail REGR. vs. 113605 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stopfail REGR. vs. 113605 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail blocked in 113605 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 113605 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 113605 test-amd64-amd64-xl-rtds 10 debian-install fail like 113605 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 113605 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass version targeted for testing: linux820bf5c419e4b85298e5c3001bd1b5be46d60765 baseline version: linux12fcf66e74b16b96e57fc1ce32bdf27b3a426fd0 Last test of basis 113605 2017-09-19 15:52:58 Z0 days Testing same since 113615 2017-09-20 03:19:38 Z0 days1 attempts People who touched revisions under test: "Eric W. Biederman" Arnd Bergmann Christoph Hellwig Eric W. Biederman Hannes Reinecke Hannes Reinecke Haozhong Zhang Jan H. Schönherr Linus Torvalds Lukas Czerner Martin K. Petersen Radim KrÄmáŠYu Zhang jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt
Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings
>>> On 20.09.17 at 11:22, wrote: > +static void *hvmemul_map_linear_addr( > +unsigned long linear, unsigned int bytes, uint32_t pfec, > +struct hvm_emulate_ctxt *hvmemul_ctxt) > +{ > +struct vcpu *curr = current; > +void *err, *mapping; > + > +/* First and final gfns which need mapping. */ > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT; > + > +/* > + * mfn points to the next free slot. All used slots have a page > reference > + * held on them. > + */ > +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > + > +/* > + * The caller has no legitimate reason for trying a zero-byte write, but > + * final is calculate to fail safe in release builds. > + * > + * The maximum write size depends on the number of adjacent mfns[] which > + * can be vmap()'d, accouting for possible misalignment within the > region. > + * The higher level emulation callers are responsible for ensuring that > + * mfns[] is large enough for the requested write size. > + */ > +if ( bytes == 0 || > + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) ) > +{ > +ASSERT_UNREACHABLE(); > +goto unhandleable; > +} > + > +do { > +enum hvm_translation_result res; > +struct page_info *page; > +pagefault_info_t pfinfo; > +p2m_type_t p2mt; > + > +/* Error checking. Confirm that the current slot is clean. */ > +ASSERT(mfn_x(*mfn) == 0); > + > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec, > + &pfinfo, &page, NULL, &p2mt); > + > +switch ( res ) > +{ > +case HVMTRANS_okay: > +break; > + > +case HVMTRANS_bad_linear_to_gfn: > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, > &hvmemul_ctxt->ctxt); > +err = ERR_PTR(~X86EMUL_EXCEPTION); > +goto out; > + > +case HVMTRANS_bad_gfn_to_mfn: > +err = NULL; > +goto out; > + > +case HVMTRANS_gfn_paged_out: > +case HVMTRANS_gfn_shared: > +err = ERR_PTR(~X86EMUL_RETRY); > +goto out; > + > +default: > +goto unhandleable; > +} > + > +if ( p2m_is_discard_write(p2mt) ) > +{ > +err = ERR_PTR(~X86EMUL_OKAY); > +goto out; > +} > + > +*mfn++ = _mfn(page_to_mfn(page)); > + > +} while ( ++frame < final ); Interesting - I had specifically pointed out in a reply to v3 that the increment of mfn _cannot_ be moved down here: You're now leaking a page ref on the p2m_is_discard_write() error path afaict. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] Ping: [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
>>> On 20.09.17 at 11:22, wrote: > From: Andrew Cooper > > Signed-off-by: Andrew Cooper > Acked-by: Tim Deegan > Acked-by: Jan Beulich > Reviewed-by: Kevin Tian > Acked-by: George Dunlap > --- > xen/arch/x86/hvm/dom0_build.c | 2 +- > xen/arch/x86/hvm/emulate.c| 40 ++-- > xen/arch/x86/hvm/hvm.c| 56 > +++ > xen/arch/x86/hvm/intercept.c | 20 +++--- > xen/arch/x86/hvm/svm/nestedsvm.c | 5 ++-- > xen/arch/x86/hvm/svm/svm.c| 2 +- Boris, Suravee? The missing ack is the only thing keeping this and patch 2 from going in. Thanks, Jan > xen/arch/x86/hvm/viridian.c | 2 +- > xen/arch/x86/hvm/vmsi.c | 2 +- > xen/arch/x86/hvm/vmx/realmode.c | 2 +- > xen/arch/x86/hvm/vmx/vvmx.c | 14 +- > xen/arch/x86/mm/shadow/common.c | 12 - > xen/common/libelf/libelf-loader.c | 4 +-- > xen/include/asm-x86/hvm/support.h | 40 ++-- > 13 files changed, 101 insertions(+), 100 deletions(-) > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > index 020c355..e8f746c 100644 > --- a/xen/arch/x86/hvm/dom0_build.c > +++ b/xen/arch/x86/hvm/dom0_build.c > @@ -238,7 +238,7 @@ static int __init pvh_setup_vmx_realmode_helpers(struct > domain *d) > if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), &gaddr) ) > { > if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) != > - HVMCOPY_okay ) > + HVMTRANS_okay ) > printk("Unable to zero VM86 TSS area\n"); > d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] = > VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr; > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > index 54811c1..cc874ce 100644 > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -100,7 +100,7 @@ static int ioreq_server_read(const struct hvm_io_handler > *io_handler, > uint32_t size, > uint64_t *data) > { > -if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay ) > +if ( hvm_copy_from_guest_phys(data, addr, size) != HVMTRANS_okay ) > return X86EMUL_UNHANDLEABLE; > > return X86EMUL_OKAY; > @@ -893,18 +893,18 @@ static int __hvmemul_read( > > switch ( rc ) > { > -case HVMCOPY_okay: > +case HVMTRANS_okay: > break; > -case HVMCOPY_bad_gva_to_gfn: > +case HVMTRANS_bad_linear_to_gfn: > x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); > return X86EMUL_EXCEPTION; > -case HVMCOPY_bad_gfn_to_mfn: > +case HVMTRANS_bad_gfn_to_mfn: > if ( access_type == hvm_access_insn_fetch ) > return X86EMUL_UNHANDLEABLE; > > return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, > hvmemul_ctxt, 0); > -case HVMCOPY_gfn_paged_out: > -case HVMCOPY_gfn_shared: > +case HVMTRANS_gfn_paged_out: > +case HVMTRANS_gfn_shared: > return X86EMUL_RETRY; > default: > return X86EMUL_UNHANDLEABLE; > @@ -1012,15 +1012,15 @@ static int hvmemul_write( > > switch ( rc ) > { > -case HVMCOPY_okay: > +case HVMTRANS_okay: > break; > -case HVMCOPY_bad_gva_to_gfn: > +case HVMTRANS_bad_linear_to_gfn: > x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); > return X86EMUL_EXCEPTION; > -case HVMCOPY_bad_gfn_to_mfn: > +case HVMTRANS_bad_gfn_to_mfn: > return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, > hvmemul_ctxt, 0); > -case HVMCOPY_gfn_paged_out: > -case HVMCOPY_gfn_shared: > +case HVMTRANS_gfn_paged_out: > +case HVMTRANS_gfn_shared: > return X86EMUL_RETRY; > default: > return X86EMUL_UNHANDLEABLE; > @@ -1384,7 +1384,7 @@ static int hvmemul_rep_movs( > return rc; > } > > -rc = HVMCOPY_okay; > +rc = HVMTRANS_okay; > } > else > /* > @@ -1394,16 +1394,16 @@ static int hvmemul_rep_movs( > */ > rc = hvm_copy_from_guest_phys(buf, sgpa, bytes); > > -if ( rc == HVMCOPY_okay ) > +if ( rc == HVMTRANS_okay ) > rc = hvm_copy_to_guest_phys(dgpa, buf, bytes, current); > > xfree(buf); > > -if ( rc == HVMCOPY_gfn_paged_out ) > +if ( rc == HVMTRANS_gfn_paged_out ) > return X86EMUL_RETRY; > -if ( rc == HVMCOPY_gfn_shared ) > +if ( rc == HVMTRANS_gfn_shared ) > return X86EMUL_RETRY; > -if ( rc != HVMCOPY_okay ) > +if ( rc != HVMTRANS_okay ) > { > gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%" > PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n", > @@ -1513,10 +1513,10 @@ static int hvmemul_rep_stos( > > switch ( rc ) > { > -case HVMCOPY_gfn_paged_ou
Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings
On Mi, 2017-09-20 at 06:24 -0600, Jan Beulich wrote: > > > > > > > > > > > > > On 20.09.17 at 11:22, wrote: > > +static void *hvmemul_map_linear_addr( > > +unsigned long linear, unsigned int bytes, uint32_t pfec, > > +struct hvm_emulate_ctxt *hvmemul_ctxt) > > +{ > > +struct vcpu *curr = current; > > +void *err, *mapping; > > + > > +/* First and final gfns which need mapping. */ > > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > > +unsigned long final = (linear + bytes - !!bytes) >> > > PAGE_SHIFT; > > + > > +/* > > + * mfn points to the next free slot. All used slots have a > > page reference > > + * held on them. > > + */ > > +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > + > > +/* > > + * The caller has no legitimate reason for trying a zero-byte > > write, but > > + * final is calculate to fail safe in release builds. > > + * > > + * The maximum write size depends on the number of adjacent > > mfns[] which > > + * can be vmap()'d, accouting for possible misalignment within > > the region. > > + * The higher level emulation callers are responsible for > > ensuring that > > + * mfns[] is large enough for the requested write size. > > + */ > > +if ( bytes == 0 || > > + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) ) > > +{ > > +ASSERT_UNREACHABLE(); > > +goto unhandleable; > > +} > > + > > +do { > > +enum hvm_translation_result res; > > +struct page_info *page; > > +pagefault_info_t pfinfo; > > +p2m_type_t p2mt; > > + > > +/* Error checking. Confirm that the current slot is > > clean. */ > > +ASSERT(mfn_x(*mfn) == 0); > > + > > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, > > true, pfec, > > + &pfinfo, &page, NULL, &p2mt); > > + > > +switch ( res ) > > +{ > > +case HVMTRANS_okay: > > +break; > > + > > +case HVMTRANS_bad_linear_to_gfn: > > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, > > &hvmemul_ctxt->ctxt); > > +err = ERR_PTR(~X86EMUL_EXCEPTION); > > +goto out; > > + > > +case HVMTRANS_bad_gfn_to_mfn: > > +err = NULL; > > +goto out; > > + > > +case HVMTRANS_gfn_paged_out: > > +case HVMTRANS_gfn_shared: > > +err = ERR_PTR(~X86EMUL_RETRY); > > +goto out; > > + > > +default: > > +goto unhandleable; > > +} > > + > > +if ( p2m_is_discard_write(p2mt) ) > > +{ > > +err = ERR_PTR(~X86EMUL_OKAY); > > +goto out; > > +} > > + > > +*mfn++ = _mfn(page_to_mfn(page)); > > + > > +} while ( ++frame < final ); > Interesting - I had specifically pointed out in a reply to v3 that > the > increment of mfn _cannot_ be moved down here: You're now > leaking a page ref on the p2m_is_discard_write() error path afaict. > Sorry about that, I realized the error after reading it again. I'll wait for all the comments until doing the final version. Alex This email was scanned by Bitdefender ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 01/15] xen: move XENMAPSPACE_grant_table code into grant_table.c
Hi Juergen, On 20/09/17 07:34, Juergen Gross wrote: The x86 and arm versions of XENMAPSPACE_grant_table handling are nearly identical. Move the code into a function in grant_table.c and add an architecture dependant hook to handle the differences. Switch to mfn_t in order to be more type safe. Signed-off-by: Juergen Gross Reviewed-by: Paul Durrant Acked-by: Jan Beulich Acked-by: Julien Grall Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain
On 20/09/17 13:48, Jan Beulich wrote: On 20.09.17 at 13:10, wrote: >> On 20/09/17 12:34, Jan Beulich wrote: >> On 20.09.17 at 08:34, wrote: --- a/xen/common/compat/grant_table.c +++ b/xen/common/compat/grant_table.c @@ -1777,13 +1784,15 @@ active_alloc_failed: static long gnttab_setup_table( -XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count) +XEN_GUEST_HANDLE_PARAM(gnttab_setup_table_t) uop, unsigned int count, +unsigned int limit_max) { struct vcpu *curr = current; struct gnttab_setup_table op; struct domain *d = NULL; struct grant_table *gt; unsigned int i; +long ret = 0; >>> >>> Wouldn't int suffice here? >> >> I just followed the return type of the function. I think this way is >> cleaner, but in case you like int better I can change it. > > I sort of expected this reply, but that's not in line with what you > did in gnttab_get_status_frames() then. I think we will want to > eventually change all function return types to int where the wider > type isn't needed. Okay. Should I include a patch doing that in this series or would you prefer this cleanup being delayed to 4.11? @@ -3442,6 +3469,8 @@ grant_table_create( /* Simple stuff. */ percpu_rwlock_resource_init(&t->lock, grant_rwlock); spin_lock_init(&t->maptrack_lock); +t->max_grant_frames = max_grant_frames; +t->max_maptrack_frames = max_maptrack_frames; >>> >>> This together with ... >>> @@ -3655,7 +3686,11 @@ int grant_table_set_limits(struct domain *d, unsigned int grant_frames, /* Set limits. */ if ( !gt->active ) +{ +gt->max_grant_frames = grant_frames; +gt->max_maptrack_frames = maptrack_frames; ret = grant_table_init(gt); +} >>> >>> .. this raises the question of whether it is legal to decrease the >>> limits. There may be code depending on it only ever growing. >> >> Before grant_table_init() has been called there is no problem >> decreasing the limits, as nothing depending on them has been setup >> yet. > > Oh, right, I didn't pay attention to this being a one-time action. > >>> Additionally to take the input values without applying some >>> upper cap - while we have XSA-77, we still shouldn't introduce >>> new issues making disaggregation more unsafe. Perhaps the >>> global limits could serve as a cap here? Thinking more about it: what can happen in worst case when no cap is being enforced? A domain with the privilege to create another domain with arbitrary amounts of memory (we have no cap here) might go crazy and give the created domain an arbitrary amount of grant frames or maptrack frames. So what is the difference whether the memory is spent directly for the domain or via grant frames? In both cases there will be no memory left for other purposes. I can't see how this would be worse than allocating the same amount of memory directly for the new domain. >> I thought about a cap and TBH I'm not sure which would be sane to >> apply. The global limits seem wrong, especially looking at patch 14: >> those limits will be for dom0 only then. And dom0 won't need many >> grant frames in the normal case... > > I've been thinking about this Dom0 aspect too over lunch. What > about allowing the hardware domain to set its limit (only upwards > of course) in setup_table(), without any upper bound enforced? > This would free up the globals to be used as system wide limits > again. This would be possible, of course. The question is whether the need to re-allocate the frame pointer arrays is it worth. >> So I could make up a cap in form of either a configurable constant >> (CONFIG_* or boot parameter?) or as a fraction of domain memory. Any >> preferences here? > > A config constant as well as a fraction of domain memory might lock > out special purpose guests. Which would leave command line options > - as per above perhaps the ones we already have. In case we really need the cap parameters I'd prefer distinct ones from the dom0 initial values. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 14/15] xen: make grant table limits boot parameters dom0 only
On 20/09/17 14:07, Jan Beulich wrote: On 20.09.17 at 08:34, wrote: >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -2098,7 +2098,7 @@ static void __init find_gnttab_region(struct domain *d, >> kinfo->gnttab_size = (_etext - _stext) & PAGE_MASK; >> >> /* Make sure the grant table will fit in the region */ >> -if ( (kinfo->gnttab_size >> PAGE_SHIFT) < max_grant_frames ) >> +if ( grant_table_verify_size(d, kinfo->gnttab_size >> PAGE_SHIFT) ) >> panic("Cannot find a space for the grant table region\n"); > > I can see how this may cause complications with my alternative > proposal for the meaning of the command line options, but that's > solvable for sure. > >> @@ -3462,6 +3449,10 @@ grant_table_create( >> struct domain *d) >> { >> struct grant_table *t; >> +static unsigned int max_grant_frames; >> +static unsigned int max_maptrack_frames; >> +integer_param("gnttab_max_frames", max_grant_frames); >> +integer_param("gnttab_max_maptrack_frames", max_maptrack_frames); >> >> if ( (t = xzalloc(struct grant_table)) == NULL ) >> return -ENOMEM; >> @@ -3469,14 +3460,17 @@ grant_table_create( >> /* Simple stuff. */ >> percpu_rwlock_resource_init(&t->lock, grant_rwlock); >> spin_lock_init(&t->maptrack_lock); >> -t->max_grant_frames = max_grant_frames; >> -t->max_maptrack_frames = max_maptrack_frames; >> >> /* Okay, install the structure. */ >> d->grant_table = t; >> >> if ( d->domain_id == 0 ) >> +{ >> +t->max_grant_frames = max_grant_frames ? : >> DEFAULT_MAX_NR_GRANT_FRAMES; >> +t->max_maptrack_frames = >> + max_maptrack_frames ? : >> DEFAULT_MAX_MAPTRACK_FRAMES; >> return grant_table_init(t); >> +} > > The use here makes it that in effect the two variables could then > become __initdata. Maybe their definition should then even move > here (if the other proposal would be discarded). Is it possible in the hypervisor to access __initdata variables from non __init functions? In the Linux kernel this is forbidden. >> +bool __init grant_table_verify_size(struct domain *d, unsigned int frames) > > const Yes. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
Hi Juergen, Sorry for the late comment. On 20/09/17 07:34, Juergen Gross wrote: Add a function for obtaining the highest possible physical memory address of the system. This value is influenced by: - hypervisor configuration (CONFIG_BIGMEM) - processor capability (max. addressable physical memory) - memory map at boot time - memory hotplug capability The value is especially needed for dom0 to decide sizing of grant frame limits of guests and for pv domains for selecting the grant interface Why limiting to PV domain? Arm domain may also need to switch to another interface between v1 only support 32-bit GFN. version to use. Signed-off-by: Juergen Gross [...] diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h index cd6dfb54b9..6aa8cba5e0 100644 --- a/xen/include/asm-arm/mm.h +++ b/xen/include/asm-arm/mm.h @@ -376,6 +376,11 @@ static inline void put_page_and_type(struct page_info *page) void clear_and_clean_page(struct page_info *page); +static inline unsigned long arch_get_upper_mfn_bound(void) +{ +return 0; +} I am not sure to understand the Arm implementation given the description of the commit message. The guest layout is completely separate from the host layout. It might be possible to have all the memory below 40 bits on the host, but this does not preclude the guest to have all memory below 40 bits (the hardware might support, for instance, up to 48 bits). Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo
Hi Juergen, On 20/09/17 07:34, Juergen Gross wrote: Add the maximum possible mfn to struct xen_sysctl_physinfo in order to enable Xen tools to size the grant table frame limits for a domU Signed-off-by: Juergen Gross --- xen/common/sysctl.c | 1 + xen/include/public/sysctl.h | 2 ++ 2 files changed, 3 insertions(+) diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c index a6882d1c9d..22f5d991f6 100644 --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -266,6 +266,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) get_outstanding_claims(&pi->free_pages, &pi->outstanding_pages); pi->scrub_pages = 0; pi->cpu_khz = cpu_khz; +pi->max_mfn = get_upper_mfn_bound() - 1; arch_do_physinfo(pi); if ( copy_to_guest(u_sysctl, op, 1) ) diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h index 7830b987da..86b9ced86b 100644 --- a/xen/include/public/sysctl.h +++ b/xen/include/public/sysctl.h @@ -108,6 +108,8 @@ struct xen_sysctl_physinfo { /* XEN_SYSCTL_PHYSCAP_??? */ uint32_t capabilities; + +uint64_t max_mfn; /* Largest possible MFN on this host */ Don't you need to bump XEN_SYSCTL_INTERFACE_VERSION because of this change? }; typedef struct xen_sysctl_physinfo xen_sysctl_physinfo_t; DEFINE_XEN_GUEST_HANDLE(xen_sysctl_physinfo_t); Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v11 3/5] x86emul: Add return code information to error messages
On Ma, 2017-09-19 at 09:22 -0600, Jan Beulich wrote: > > > > > > > > > > > > > On 12.09.17 at 16:32, wrote: > > --- a/xen/arch/x86/hvm/emulate.c > > +++ b/xen/arch/x86/hvm/emulate.c > > @@ -2055,7 +2055,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, > > unsigned long gla) > > { > > case X86EMUL_UNHANDLEABLE: > > case X86EMUL_UNIMPLEMENTED: > > -hvm_dump_emulation_state(XENLOG_G_WARNING, "MMCFG", > > &ctxt); > > +hvm_dump_emulation_state(XENLOG_G_WARNING, "MMCFG", &ctxt, > > rc); > > break; > At the example of this one I think it is pretty clear that the order > of patches would be the other way around. But I won't insist. > > > > > @@ -2242,16 +2242,17 @@ static const char > > *guest_x86_mode_to_str(int mode) > > } > > > > void hvm_dump_emulation_state(const char *loglvl, const char > > *prefix, > > - struct hvm_emulate_ctxt > > *hvmemul_ctxt) > > + struct hvm_emulate_ctxt > > *hvmemul_ctxt, int rc) > > { > > struct vcpu *curr = current; > > const char *mode_str = > > guest_x86_mode_to_str(hvm_guest_x86_mode(curr)); > > const struct segment_register *cs = > > hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt); > > > > -printk("%s%s emulation failed: %pv %s @ %04x:%08lx -> %*ph\n", > > - loglvl, prefix, curr, mode_str, cs->sel, hvmemul_ctxt- > > >insn_buf_eip, > > - hvmemul_ctxt->insn_buf_bytes, hvmemul_ctxt->insn_buf); > > +printk("%s%s emulation failed (rc=%d): %pv %s @ %04x:%08lx -> > > %*ph\n", > Please try to keep log messages short (but without losing relevant > information). In the case here the "rc=" is unnecessary. With it > dropped I added the "rc=" to mark the distinction between "unimplemented" and "unhandleable", as requested by Andrew Cooper for v10 "Please modify hvm_dump_emulation_state to pass rc in, and distinguish UNHANDLEABLE vs UNIMPLEMENTED in the printed message." > Reviewed-by: Jan Beulich > > Jan > > > > This email was scanned by Bitdefender ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 15/15] xen: add new Xen cpuid node for max address width info
On 20/09/17 14:18, Jan Beulich wrote: On 20.09.17 at 08:34, wrote: >> On very large hosts a guest needs to know whether it will have to > > ... a PV guest ... What about a HVM guest with (potentially) more than 16TB? >> handle frame numbers larger than 32 bits in order to select the >> appropriate grant interface version. >> >> Add a new Xen specific CPUID node to contain the maximum guest address >> width > > "guest address width" is ambiguous here, the more when looking at > what you actually return. We should no longer allow ourselves to > mix up the different address spaces. I've chosen "guest address width" similar to the "guest frame number" we already have: it is MFN based for PV and PFN based for HVM (and ARM). I'm open for a better name. > The limit you want to report > here is that in MFN space, which ought to be of no relevance to > HVM guests. Therefore I'm against uniformly exposing this (as much > as almost no other host property should have any relevance for > HVM guests), and would instead like to see a PV-only leaf just like > we already have a HVM only one. As said above: a HVM guest needs to know whether it will have to deal with frame numbers >32 bits, too. For HVM guests this would just be a hint that the host might be large enough for this to happen, as even today a HVM guest could in theory reorganize its memory map to have parts of the memory above the 16TB border even with only rather small amounts of memory. But this would be the problem of the guest then. For the future I could envision a per-domain parameter setting the upper limit where a guest could put its memory. This parameter would then influence the new CPUID value fro HVM domains, of course. > >> --- a/xen/arch/x86/traps.c >> +++ b/xen/arch/x86/traps.c >> @@ -929,6 +929,10 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, >> uint32_t leaf, >> res->b = v->vcpu_id; >> break; >> >> +case 5: /* Host specific parameters */ >> +res->a = generic_flsl(get_upper_mfn_bound() - 1) + PAGE_SHIFT; >> +break; > > Already when looking at the patch introducing the function I was > wondering whether the function wouldn't better return the highest > frame number instead of the first invalid one. From an abstract > perspective this would allow an arch (e.g. ARM32) to report that > all addresses are valid. That's fine with me. >> --- a/xen/include/public/arch-x86/cpuid.h >> +++ b/xen/include/public/arch-x86/cpuid.h >> @@ -85,6 +85,15 @@ >> #define XEN_HVM_CPUID_IOMMU_MAPPINGS (1u << 2) >> #define XEN_HVM_CPUID_VCPU_ID_PRESENT (1u << 3) /* vcpu id is present in >> EBX >> */ >> >> -#define XEN_CPUID_MAX_NUM_LEAVES 4 >> +/* >> + * Leaf 6 (0x4x05) >> + * Host specific parameters >> + * EAX: bits 0-7: max guest address width >> + */ >> + >> +/* Max. address width in bits taking memory hotplug into account. */ >> +#define XEN_CPUID_GUEST_ADDRESS_WIDTH_MASK (255u << 0) > > Please use hex numbers for multi-digit masks, and please make > explicit (at least in the comment, perhaps also in the constant's > name) which "address" is being talked about here. Okay. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 15/15] xen: add new Xen cpuid node for max address width info
Hi Juergen, On 20/09/17 07:34, Juergen Gross wrote: On very large hosts a guest needs to know whether it will have to handle frame numbers larger than 32 bits in order to select the appropriate grant interface version. Can you explain why you decided to have an arch specific interface rather than generic one? On Arm, we would likely need to make same decision for large hosts. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Ping: [PATCH v4 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
On 09/20/2017 08:29 AM, Jan Beulich wrote: On 20.09.17 at 11:22, wrote: >> From: Andrew Cooper >> >> Signed-off-by: Andrew Cooper >> Acked-by: Tim Deegan >> Acked-by: Jan Beulich >> Reviewed-by: Kevin Tian >> Acked-by: George Dunlap >> --- >> xen/arch/x86/hvm/dom0_build.c | 2 +- >> xen/arch/x86/hvm/emulate.c| 40 ++-- >> xen/arch/x86/hvm/hvm.c| 56 >> +++ >> xen/arch/x86/hvm/intercept.c | 20 +++--- >> xen/arch/x86/hvm/svm/nestedsvm.c | 5 ++-- >> xen/arch/x86/hvm/svm/svm.c| 2 +- > Boris, Suravee? The missing ack is the only thing keeping this and > patch 2 from going in. Sorry, I thought I responded and obviously I haven't. Reviewed-by: Boris Ostrovsky ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 05/15] xen: add max possible mfn to struct xen_sysctl_physinfo
On 20/09/17 14:53, Julien Grall wrote: > Hi Juergen, > > On 20/09/17 07:34, Juergen Gross wrote: >> Add the maximum possible mfn to struct xen_sysctl_physinfo in order to >> enable Xen tools to size the grant table frame limits for a domU >> >> Signed-off-by: Juergen Gross >> --- >> xen/common/sysctl.c | 1 + >> xen/include/public/sysctl.h | 2 ++ >> 2 files changed, 3 insertions(+) >> >> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c >> index a6882d1c9d..22f5d991f6 100644 >> --- a/xen/common/sysctl.c >> +++ b/xen/common/sysctl.c >> @@ -266,6 +266,7 @@ long >> do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) >> get_outstanding_claims(&pi->free_pages, >> &pi->outstanding_pages); >> pi->scrub_pages = 0; >> pi->cpu_khz = cpu_khz; >> + pi->max_mfn = get_upper_mfn_bound() - 1; >> arch_do_physinfo(pi); >> if ( copy_to_guest(u_sysctl, op, 1) ) >> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h >> index 7830b987da..86b9ced86b 100644 >> --- a/xen/include/public/sysctl.h >> +++ b/xen/include/public/sysctl.h >> @@ -108,6 +108,8 @@ struct xen_sysctl_physinfo { >> /* XEN_SYSCTL_PHYSCAP_??? */ >> uint32_t capabilities; >> + >> + uint64_t max_mfn; /* Largest possible MFN on this host */ > > Don't you need to bump XEN_SYSCTL_INTERFACE_VERSION because of this change? Thanks for noticing. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
On 20/09/17 14:51, Julien Grall wrote: > Hi Juergen, > > Sorry for the late comment. > > On 20/09/17 07:34, Juergen Gross wrote: >> Add a function for obtaining the highest possible physical memory >> address of the system. This value is influenced by: >> >> - hypervisor configuration (CONFIG_BIGMEM) >> - processor capability (max. addressable physical memory) >> - memory map at boot time >> - memory hotplug capability >> >> The value is especially needed for dom0 to decide sizing of grant frame >> limits of guests and for pv domains for selecting the grant interface > > Why limiting to PV domain? Arm domain may also need to switch to another > interface between v1 only support 32-bit GFN. Right. And I just used that reasoning for an answer to Jan. :-) > >> version to use. >> >> Signed-off-by: Juergen Gross > > [...] > >> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h >> index cd6dfb54b9..6aa8cba5e0 100644 >> --- a/xen/include/asm-arm/mm.h >> +++ b/xen/include/asm-arm/mm.h >> @@ -376,6 +376,11 @@ static inline void put_page_and_type(struct >> page_info *page) >> void clear_and_clean_page(struct page_info *page); >> +static inline unsigned long arch_get_upper_mfn_bound(void) >> +{ >> + return 0; >> +} > > I am not sure to understand the Arm implementation given the description > of the commit message. > > The guest layout is completely separate from the host layout. It might > be possible to have all the memory below 40 bits on the host, but this > does not preclude the guest to have all memory below 40 bits (the > hardware might support, for instance, up to 48 bits). Who is setting up the memory map for the guest then? Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation
On 09/20/2017 12:59 PM, Jan Beulich wrote: On 20.09.17 at 13:44, wrote: >> IAC, if you're concerned about that, I'd much rather put both >> kill_timer() and xfree() before the critical section, rather than >> after, like in the attached patch. > > Hmm, killing the timer upfront is certainly fine, but is freeing the > data before removing the element from the list safe not only > currently, but also going forward? I agree with Jan -- if you don't want to put the kill_timer() in the critical section, put it beforehand; but don't free the structure until after the sdom struct has been removed from the list. Sorry to be picky, but I'm positive I'm not going to remember this in six months' time, and I'd just rather be safe. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 06/17] xen/livepatch/x86/arm32: Force .livepatch.depends section to be uint32_t aligned.
On Tue, Sep 19, 2017 at 12:05:16PM +0100, Julien Grall wrote: > Hi Konrad, > > On 19/09/17 01:32, Konrad Rzeszutek Wilk wrote: > > > > +.PHONY: livepatch_depends.h > > > > +livepatch_depends.h: note.bin > > > > + $(shell (../../../tools/firmware/hvmloader/mkhex > > > > $(NOTE_DEPENDS) $^ > $@)) > > > > > > It looks quite odd to use a file in firmware/hvmloader for livepatch. > > > Would > > > it be possible to move mkhex to a generic place? > > > > Like so? > > It is what I had in mind. I CCed Ian and Wei to get feedback from them. > Just move it to a directory called tools/scripts? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
Hi Juergen, On 20/09/17 14:08, Juergen Gross wrote: On 20/09/17 14:51, Julien Grall wrote: Hi Juergen, Sorry for the late comment. On 20/09/17 07:34, Juergen Gross wrote: Add a function for obtaining the highest possible physical memory address of the system. This value is influenced by: - hypervisor configuration (CONFIG_BIGMEM) - processor capability (max. addressable physical memory) - memory map at boot time - memory hotplug capability The value is especially needed for dom0 to decide sizing of grant frame limits of guests and for pv domains for selecting the grant interface Why limiting to PV domain? Arm domain may also need to switch to another interface between v1 only support 32-bit GFN. Right. And I just used that reasoning for an answer to Jan. :-) version to use. Signed-off-by: Juergen Gross [...] diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h index cd6dfb54b9..6aa8cba5e0 100644 --- a/xen/include/asm-arm/mm.h +++ b/xen/include/asm-arm/mm.h @@ -376,6 +376,11 @@ static inline void put_page_and_type(struct page_info *page) void clear_and_clean_page(struct page_info *page); +static inline unsigned long arch_get_upper_mfn_bound(void) +{ +return 0; +} I am not sure to understand the Arm implementation given the description of the commit message. The guest layout is completely separate from the host layout. It might be possible to have all the memory below 40 bits on the host, but this does not preclude the guest to have all memory below 40 bits (the hardware might support, for instance, up to 48 bits). Who is setting up the memory map for the guest then? The memory map is at the moment static and described in public/arch-arm.h. The guest is not allowed to assume it and should discover it through ACPI/DT. There are 2 banks of memory for the guest (it depends on the amount of memory requested by the user): - 3GB @ 1GB - 1016GB @ 8GB But the guest would be free to use the populate memory hypercall to allocate memory anywhere in the address space. For Arm32, the maximum IPA (Intermediate Physical Address aka guest physical address on Xen) we currently support is always 40 bits. For Arm64, this range from 32 bits to 48 bits. New hardware can support up to 52 bits. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 12/15] xen/arm: move arch specific grant table bits into grant_table.c
Hi Juergen, On 20/09/17 07:34, Juergen Gross wrote: diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h index 0a248a765a..0870b5b782 100644 --- a/xen/include/asm-arm/grant_table.h +++ b/xen/include/asm-arm/grant_table.h @@ -6,6 +6,10 @@ #define INITIAL_NR_GRANT_FRAMES 4 +struct grant_table_arch { +gfn_t *gfn; +}; + void gnttab_clear_flag(unsigned long nr, uint16_t *addr); int create_grant_host_mapping(unsigned long gpaddr, unsigned long mfn, unsigned int flags, unsigned int @@ -22,11 +26,19 @@ static inline int replace_grant_supported(void) return 1; } -static inline void gnttab_set_frame_gfn(struct domain *d, unsigned long idx, -gfn_t gfn) -{ -d->arch.grant_table_gfn[idx] = gfn; -} > +#define gnttab_init_arch(gt) \ +( ((gt)->arch.gfn = xzalloc_array(gfn_t, max_grant_frames)) == 0 \ I am not sure to understand the 0 here. Don't you check the return of xzalloc_array? If so it should be NULL if it failed. And therefore the return error looks inverted below. + ? 0 : -ENOMEM ) I admit I would much prefer to see static inline rather than define. More typesafe and usually easier to read. You cannot do that because the type of gt is only defined in grant-table? Nonetheless, I think you could clarify this code by doing: ({ (gt)->arch.gfn = xzalloc_array(); ( g->arch.gfn ? 0 : -ENOMEM ); )} Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 04/15] xen: add function for obtaining highest possible memory address
On 20/09/17 16:24, Julien Grall wrote: > Hi Juergen, > > On 20/09/17 14:08, Juergen Gross wrote: >> On 20/09/17 14:51, Julien Grall wrote: >>> Hi Juergen, >>> >>> Sorry for the late comment. >>> >>> On 20/09/17 07:34, Juergen Gross wrote: Add a function for obtaining the highest possible physical memory address of the system. This value is influenced by: - hypervisor configuration (CONFIG_BIGMEM) - processor capability (max. addressable physical memory) - memory map at boot time - memory hotplug capability The value is especially needed for dom0 to decide sizing of grant frame limits of guests and for pv domains for selecting the grant interface >>> >>> Why limiting to PV domain? Arm domain may also need to switch to another >>> interface between v1 only support 32-bit GFN. >> >> Right. And I just used that reasoning for an answer to Jan. :-) >> >>> version to use. Signed-off-by: Juergen Gross >>> >>> [...] >>> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h index cd6dfb54b9..6aa8cba5e0 100644 --- a/xen/include/asm-arm/mm.h +++ b/xen/include/asm-arm/mm.h @@ -376,6 +376,11 @@ static inline void put_page_and_type(struct page_info *page) void clear_and_clean_page(struct page_info *page); +static inline unsigned long arch_get_upper_mfn_bound(void) +{ + return 0; +} >>> >>> I am not sure to understand the Arm implementation given the description >>> of the commit message. >>> >>> The guest layout is completely separate from the host layout. It might >>> be possible to have all the memory below 40 bits on the host, but this >>> does not preclude the guest to have all memory below 40 bits (the >>> hardware might support, for instance, up to 48 bits). >> >> Who is setting up the memory map for the guest then? > > The memory map is at the moment static and described in > public/arch-arm.h. The guest is not allowed to assume it and should > discover it through ACPI/DT. Is there any memory hotplug possible (host level, guest level)? > There are 2 banks of memory for the guest (it depends on the amount of > memory requested by the user): > - 3GB @ 1GB > - 1016GB @ 8GB > > But the guest would be free to use the populate memory hypercall to > allocate memory anywhere in the address space. Okay, so this is similar to x86 HVM then. > For Arm32, the maximum IPA (Intermediate Physical Address aka guest > physical address on Xen) we currently support is always 40 bits. > > For Arm64, this range from 32 bits to 48 bits. New hardware can support > up to 52 bits. I guess this information is included in some tables like ACPI or DT? Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 13/15] xen: make grant resource limits per domain
Hi Jan, On 20/09/17 11:34, Jan Beulich wrote: On 20.09.17 at 08:34, wrote: --- a/xen/include/asm-arm/grant_table.h +++ b/xen/include/asm-arm/grant_table.h @@ -26,8 +26,8 @@ static inline int replace_grant_supported(void) return 1; } -#define gnttab_init_arch(gt) \ -( ((gt)->arch.gfn = xzalloc_array(gfn_t, max_grant_frames)) == 0 \ +#define gnttab_init_arch(gt) \ +( ((gt)->arch.gfn = xzalloc_array(gfn_t, (gt)->max_grant_frames)) == 0 \ Mind switching to use NULL at this occasion? This would belong to the patch #12 where it as been introduced. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] public/domctl: drop unnecessary typedefs and handles
On Tue, Sep 12, 2017 at 9:08 AM, Jan Beulich wrote: > By virtue of the struct xen_domctl container structure, most of them > are really just cluttering the name space. > > While doing so, > - convert an enum typed (pt_irq_type_t) structure field to a fixed > width type, > - make x86's paging_domctl() and descendants take a properly typed > handle, > - add const in a few places. > > Signed-off-by: Jan Beulich Acked-by: Tamas K Lengyel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 03/21] libxl/xl: use the new domain_build_info fields position
Roger Pau Monne writes ("[PATCH v2 03/21] libxl/xl: use the new domain_build_info fields position"): > This is required because those options will be used by the new PVH > guest type, and thus need to be shared between PV and HVM. ... > diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c > index 02ddd2e90d..61f9a38573 100644 > --- a/tools/xl/xl_parse.c > +++ b/tools/xl/xl_parse.c > @@ -1037,6 +1037,65 @@ void parse_config_data(const char *config_source, > xlu_cfg_get_defbool(config, "driver_domain", &c_info->driver_domain, 0); > xlu_cfg_get_defbool(config, "acpi", &b_info->acpi, 0); > > +xlu_cfg_replace_string (config, "bootloader", &b_info->bootloader, 0); > +switch (xlu_cfg_get_list_as_string_list(config, "bootloader_args", > +&b_info->bootloader_args, 1)) { The code motion would benefit from being moved into a pre-patch. Is that easily possible ? Obviously in the new location, in the pre-patch it would still need to be in an if () to check the guest type. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 04/21] xl: introduce a domain type option
Roger Pau Monne writes ("[PATCH v2 04/21] xl: introduce a domain type option"): > Introduce a new type option to xl configuration files in order to > specify the domain type. This supersedes the current builder option. ...> > Th > libxl_defbool_set(&c_info->run_hotplug_scripts, run_hotplug_scripts); > c_info->type = LIBXL_DOMAIN_TYPE_PV; > -if (!xlu_cfg_get_string (config, "builder", &buf, 0) && > -!strncmp(buf, "hvm", strlen(buf))) > -c_info->type = LIBXL_DOMAIN_TYPE_HVM; > +if (!xlu_cfg_get_string (config, "builder", &buf, 0)) { > +fprintf(stderr, > +"The builder option is being deprecated, please use type > instead.\n"); Line length. Probably best to shuffle the message to the left, rather than linewrapping, for ease of grepping. The error message would benefit from some `quotes' around the parameter names. It would be nice if you did this in a way that meant that a config file which specified both `type="hvm"' and `builder="hvm"' did not generate a warning. And it ought to be an error to specify `type="pv"' with `builder="hvm"' surely. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 05/21] xl: introduce a firmware option
Roger Pau Monne writes ("[PATCH v2 05/21] xl: introduce a firmware option"): > The new firmware option aims to provide a coherent way to set the > firmware for the different kind of guests Xen supports. ... > +=head3 Non direct Kernel Boot > + > +Non direct kernel boot allows booting guests with a firmware. This can be \ used > +by all types of guests, although the selection of options is different > +depending on the guest type. > + > +This option provides the flexibly of letting the guest decide which kernel\ they > +want to boot, while preventing having to poke at the guest file system for\ m the Line length. Observe the (simulated) wrap damage. You should to consider using semantic linefeeds (semantic newlines). That is, break the lines in the .pod file at sentence or phrase boundaries. I see the code has line length problems too (it wraps when I look at it in my MUA and even more so when I try to quote it when replying). After we've chatted about the idl compiler changes, can you fix the long lines everywhere and resend ? Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen: add a global indicator for grant copy being available
On Tue, Sep 19, 2017 at 01:50:54PM +0200, Juergen Gross wrote: > The Xen qdisk backend needs to test whether grant copy operations is > available in the kernel. Unfortunately this collides with using > xengnttab_set_max_grants() on some kernels as this operation has to > be the first one after opening the gnttab device. > > In order to solve this problem test for the availability of grant copy > in xen_be_init() opening the gnttab device just for that purpose and > closing it again afterwards. Advertise the availability via a global > flag and use that flag in the qdisk backend. > > Signed-off-by: Juergen Gross > --- > hw/block/xen_disk.c | 18 ++ > hw/xen/xen_backend.c | 11 +++ > include/hw/xen/xen_backend.h | 1 + > 3 files changed, 18 insertions(+), 12 deletions(-) > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index d42ed7070d..6632746250 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -121,9 +121,6 @@ struct XenBlkDev { > unsigned intpersistent_gnt_count; > unsigned intmax_grants; > > -/* Grant copy */ > -gbooleanfeature_grant_copy; > - > /* qemu block driver */ > DriveInfo *dinfo; > BlockBackend*blk; > @@ -616,7 +613,7 @@ static void qemu_aio_complete(void *opaque, int ret) > return; > } > > -if (ioreq->blkdev->feature_grant_copy) { > +if (xen_feature_grant_copy) { > switch (ioreq->req.operation) { > case BLKIF_OP_READ: > /* in case of failure ioreq->aio_errors is increased */ > @@ -638,7 +635,7 @@ static void qemu_aio_complete(void *opaque, int ret) > } > > ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY; > -if (!ioreq->blkdev->feature_grant_copy) { > +if (!xen_feature_grant_copy) { > ioreq_unmap(ioreq); > } > ioreq_finish(ioreq); > @@ -698,7 +695,7 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) > { > struct XenBlkDev *blkdev = ioreq->blkdev; > > -if (ioreq->blkdev->feature_grant_copy) { > +if (xen_feature_grant_copy) { > ioreq_init_copy_buffers(ioreq); > if (ioreq->req.nr_segments && (ioreq->req.operation == > BLKIF_OP_WRITE || > ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE) && > @@ -750,7 +747,7 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) > } > default: > /* unknown operation (shouldn't happen -- parse catches this) */ > -if (!ioreq->blkdev->feature_grant_copy) { > +if (!xen_feature_grant_copy) { > ioreq_unmap(ioreq); > } > goto err; > @@ -1010,18 +1007,15 @@ static int blk_init(struct XenDevice *xendev) > > blkdev->file_blk = BLOCK_SIZE; > > -blkdev->feature_grant_copy = > -(xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == > 0); > - > xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n", > - blkdev->feature_grant_copy ? "enabled" : "disabled"); > + xen_feature_grant_copy ? "enabled" : "disabled"); > > /* fill info > * blk_connect supplies sector-size and sectors > */ > xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1); > xenstore_write_be_int(&blkdev->xendev, "feature-persistent", > - !blkdev->feature_grant_copy); > + !xen_feature_grant_copy); > xenstore_write_be_int(&blkdev->xendev, "info", info); > > xenstore_write_be_int(&blkdev->xendev, "max-ring-page-order", > diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c > index c46cbb0759..00210627a9 100644 > --- a/hw/xen/xen_backend.c > +++ b/hw/xen/xen_backend.c > @@ -44,6 +44,7 @@ BusState *xen_sysbus; > /* public */ > struct xs_handle *xenstore = NULL; > const char *xen_protocol; > +gboolean xen_feature_grant_copy; I think it would be better if this was changed to bool instead of a gboolean. Beside that, Acked-by: Anthony PERARD > > /* private */ > static int debug; > @@ -519,6 +520,8 @@ void xenstore_update_fe(char *watch, struct XenDevice > *xendev) > > int xen_be_init(void) > { > +xengnttab_handle *gnttabdev; > + > xenstore = xs_daemon_open(); > if (!xenstore) { > xen_pv_printf(NULL, 0, "can't connect to xenstored\n"); > @@ -532,6 +535,14 @@ int xen_be_init(void) > goto err; > } > > +gnttabdev = xengnttab_open(NULL, 0); > +if (gnttabdev != NULL) { > +if (xengnttab_grant_copy(gnttabdev, 0, NULL) == 0) { > +xen_feature_grant_copy = true; > +} > +xengnttab_close(gnttabdev); > +} > + > xen_sysdev = qdev_create(NULL, TYPE_XENSYSDEV); > qdev_init_nofail(xen_sysdev); > xen_sysbus = qbus_create(TYPE_XENSYSBUS, DEVICE(xen_sysdev), > "xen-sysbus"); > diff --git a/include/hw/xen/xen_backend.h b/include/hw/xen/xen_backend.h > index 8
Re: [Xen-devel] [PATCH 2/2] xen: dont try setting max grants multiple times
On Tue, Sep 19, 2017 at 01:50:55PM +0200, Juergen Gross wrote: > Trying to call xengnttab_set_max_grants() with the same file handle > might fail on some kernels, as this operation is allowed only once. > > This is a problem for the qdisk backend as blk_connect() can be > called multiple times for a domain, e.g. in case grub-xen is being > used to boot it. > > So instead of letting the generic backend code open the gnttab device > do it in blk_connect() and close it again in blk_disconnect. > > Signed-off-by: Juergen Gross > --- > hw/block/xen_disk.c | 12 +++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 6632746250..7cff8863cb 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -1220,6 +1220,12 @@ static int blk_connect(struct XenDevice *xendev) > /* Add on the number needed for the ring pages */ > max_grants += blkdev->nr_ring_ref; > > +blkdev->xendev.gnttabdev = xengnttab_open(NULL, 0); > +if (blkdev->xendev.gnttabdev == NULL) { > +xen_pv_printf(xendev, 0, "xengnttab_open failed: %s\n", > + strerror(errno)); > +return -1; > +} > if (xengnttab_set_max_grants(blkdev->xendev.gnttabdev, max_grants)) { > xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n", >strerror(errno)); > @@ -1327,6 +1333,11 @@ static void blk_disconnect(struct XenDevice *xendev) > } > blkdev->feature_persistent = false; > } > + > +if (blkdev->xendev.gnttabdev) { > +xengnttab_close(blkdev->xendev.gnttabdev); > +blkdev->xendev.gnttabdev = NULL; > +} I think blk_disconnect needs to be called from blk_free in case where the gnttabdev is not closed (like it is done when blk or the sring are not cleared, in blk_free). > } > static int blk_free(struct XenDevice *xendev) > @@ -1363,7 +1374,6 @@ static void blk_event(struct XenDevice *xendev) > > struct XenDevOps xen_blkdev_ops = { > .size = sizeof(struct XenBlkDev), > -.flags = DEVOPS_FLAG_NEED_GNTDEV, > .alloc = blk_alloc, > .init = blk_init, > .initialise= blk_connect, > -- > 2.12.3 > -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: provide typedefs for device framework functions
On Fri, Sep 15, 2017 at 12:01:56PM +0100, Wei Liu wrote: > Use the new typedefs to avoid copy-n-paste everywhere. > > No functional change. > > Signed-off-by: Wei Liu I'm going to wait until Friday to apply this patch. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH] x86/PV: fix/generalize guest nul selector handling
FS/GS base (and limit) aren't being cleared by the loading of a nul selector into the segment register on AMD CPUs. Therefore, if an outgoing vCPU has a non-null base in one of these registers and the subsequent incoming vCPU has a non-zero but nul selector in the respective register(s), the selector value(s) would be loaded without clearing the segment base(s) in the hidden register portion. Since the ABI states "zero" in its description of the fields, it is worth noting that the chosen approach to fix this alters the written down ABI. I consider this preferrable over enforcing the previously written down behavior, as nul selectors are far more likely to be what was meant from the beginning. The adjustments also eliminate an inconsistency between FS and GS handling: Old code had an extra pointless (gs_base_user was always zero when DIRTY_GS was set) conditional for GS. The old bitkeeper changeset has no explanation for this asymmetry. Inspired by Linux commit e137a4d8f4dd2e277e355495b6b2cb241a8693c3. Signed-off-by: Jan Beulich --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1260,30 +1260,28 @@ static void load_segments(struct vcpu *n if ( unlikely((dirty_segment_mask & DIRTY_ES) | uregs->es) ) all_segs_okay &= loadsegment(es, uregs->es); -/* - * Either selector != 0 ==> reload. - * Also reload to reset FS_BASE if it was non-zero. - */ -if ( unlikely((dirty_segment_mask & (DIRTY_FS | DIRTY_FS_BASE)) | - uregs->fs) ) +/* Either selector != 0 ==> reload. */ +if ( unlikely((dirty_segment_mask & DIRTY_FS) | uregs->fs) ) +{ all_segs_okay &= loadsegment(fs, uregs->fs); +/* non-nul selector updates fs_base */ +if ( uregs->fs & ~3 ) +dirty_segment_mask &= ~DIRTY_FS_BASE; +} -/* - * Either selector != 0 ==> reload. - * Also reload to reset GS_BASE if it was non-zero. - */ -if ( unlikely((dirty_segment_mask & (DIRTY_GS | DIRTY_GS_BASE_USER)) | - uregs->gs) ) -{ -/* Reset GS_BASE with user %gs? */ -if ( (dirty_segment_mask & DIRTY_GS) || !n->arch.pv_vcpu.gs_base_user ) -all_segs_okay &= loadsegment(gs, uregs->gs); +/* Either selector != 0 ==> reload. */ +if ( unlikely((dirty_segment_mask & DIRTY_GS) | uregs->gs) ) +{ +all_segs_okay &= loadsegment(gs, uregs->gs); +/* non-nul selector updates gs_base_user */ +if ( uregs->gs & ~3 ) +dirty_segment_mask &= ~DIRTY_GS_BASE_USER; } if ( !is_pv_32bit_vcpu(n) ) { /* This can only be non-zero if selector is NULL. */ -if ( n->arch.pv_vcpu.fs_base ) +if ( n->arch.pv_vcpu.fs_base | (dirty_segment_mask & DIRTY_FS_BASE) ) wrfsbase(n->arch.pv_vcpu.fs_base); /* Most kernels have non-zero GS base, so don't bother testing. */ @@ -1291,7 +1289,8 @@ static void load_segments(struct vcpu *n wrmsrl(MSR_SHADOW_GS_BASE, n->arch.pv_vcpu.gs_base_kernel); /* This can only be non-zero if selector is NULL. */ -if ( n->arch.pv_vcpu.gs_base_user ) +if ( n->arch.pv_vcpu.gs_base_user | + (dirty_segment_mask & DIRTY_GS_BASE_USER) ) wrgsbase(n->arch.pv_vcpu.gs_base_user); /* If in kernel mode then switch the GS bases around. */ @@ -1426,22 +1425,22 @@ static void save_segments(struct vcpu *v if ( regs->fs || is_pv_32bit_vcpu(v) ) { dirty_segment_mask |= DIRTY_FS; -v->arch.pv_vcpu.fs_base = 0; /* != 0 selector kills fs_base */ +/* non-nul selector kills fs_base */ +if ( regs->fs & ~3 ) +v->arch.pv_vcpu.fs_base = 0; } -else if ( v->arch.pv_vcpu.fs_base ) -{ +if ( v->arch.pv_vcpu.fs_base ) dirty_segment_mask |= DIRTY_FS_BASE; -} if ( regs->gs || is_pv_32bit_vcpu(v) ) { dirty_segment_mask |= DIRTY_GS; -v->arch.pv_vcpu.gs_base_user = 0; /* != 0 selector kills gs_base_user */ +/* non-nul selector kills gs_base_user */ +if ( regs->gs & ~3 ) +v->arch.pv_vcpu.gs_base_user = 0; } -else if ( v->arch.pv_vcpu.gs_base_user ) -{ +if ( v->arch.pv_vcpu.gs_base_user ) dirty_segment_mask |= DIRTY_GS_BASE_USER; -} this_cpu(dirty_segment_mask) = dirty_segment_mask; } --- a/xen/include/public/arch-x86/xen-x86_64.h +++ b/xen/include/public/arch-x86/xen-x86_64.h @@ -203,8 +203,8 @@ struct cpu_user_regs { uint16_t ss, _pad2[3]; uint16_t es, _pad3[3]; uint16_t ds, _pad4[3]; -uint16_t fs, _pad5[3]; /* Non-zero => takes precedence over fs_base. */ -uint16_t gs, _pad6[3]; /* Non-zero => takes precedence over gs_base_usr. */ +uint16_t fs, _pad5[3]; /* Non-nul => takes precedence over fs_base. */ +uint16_t gs, _pad6[3]; /* Non-nul => takes precedence over gs_base_user. */ }; typedef struct cpu_user_regs cpu_user_regs_t; DEFINE_XEN_GUEST_HAN
Re: [Xen-devel] [PATCH] build: propagate runstatedir
On Wed, Sep 13, 2017 at 10:11:25AM +0100, Wei Liu wrote: > On Wed, Sep 13, 2017 at 11:07:15AM +0200, Juergen Gross wrote: > > On 13/09/17 10:52, Wei Liu wrote: > > > Debian Stretch's autoconf has a new variable called runstatedir. Add > > > that to Paths.mk.in. > > > > > > Signed-off-by: Wei Liu > > > > You did think of > > > > https://lists.xen.org/archives/html/xen-devel/2017-02/msg01883.html > > > > I suppose? > > > > Yes. That was added at a time when autoconf didn't have runstatedir > iirc. Now autoconf has that. There is no reason not to propagate it. Thinking about this a bit more, exposing runstatedir like this doesn't have much value. I want to make it work with the existing option that we already have. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.
On Tue, Sep 19, 2017 at 09:04:44AM -0600, Jan Beulich wrote: > >>> On 18.09.17 at 21:37, wrote: > > On Tue, Sep 12, 2017 at 02:57:04AM -0600, Jan Beulich wrote: > >> >>> On 12.09.17 at 02:22, wrote: > >> > If I compile the test-case under ARM32 it works OK (as the > >> > .livepatch.depends ends up being aligned to four bytes). > >> > >> So why is that? What entity is creating this section (or the > >> directive(s) to create it)? > > > > gcc > > > > Looking at the xen_bye_world.o produced by cross-compiler: > > > > xen_bye_world.o: file format elf32-littlearm > > > > Contents of section .rodata: > > 78656e5f 65787472 615f7665 7273696f xen_extra_versio > > 0010 6e00 n. > > > > And native: > > > > xen_bye_world.o: file format elf32-littlearm > > > > Contents of section .rodata: > > 78656e5f 65787472 615f7665 7273696f xen_extra_versio > > 0010 6e00 n... > > This may rather be a gas than a gcc behavioral difference. What's > the alignment of .rodata in both cases? Cross: * on the livepatch: ..snip.. [ 4] .rodata PROGBITS 74 12 00 A 0 0 4 [ 5] .rodata.str1.4PROGBITS 88 0b 01 AMS 0 0 4 [ 6] .livepatch.depend PROGBITS 93 24 00 A 0 0 1 * on the .o file: Section Headers: [Nr] Name TypeAddr OffSize ES Flg Lk Inf Al .. snip.. [ 1] .text PROGBITS 34 00 00 AX 0 0 1 [ 2] .data PROGBITS 34 00 00 WA 0 0 1 [ 3] .bss NOBITS 34 00 00 WA 0 0 1 [ 4] .rodata PROGBITS 34 14 00 A 0 0 4 [ 5] .livepatch.funcs PROGBITS 48 34 00 WA 0 0 4 Native: * on the livepatch: ..snip.. [ 4] .rodata PROGBITS 74 14 00 A 0 0 4 [ 5] .rodata.str1.4PROGBITS 88 0c 01 AMS 0 0 4 [ 6] .livepatch.depend PROGBITS 94 24 00 A 0 0 1 * on the .o file: ..snip.. [ 1] .text PROGBITS 34 00 00 AX 0 0 1 [ 2] .data PROGBITS 34 00 00 WA 0 0 1 [ 3] .bss NOBITS 34 00 00 WA 0 0 1 [ 4] .rodata PROGBITS 34 12 00 A 0 0 4 [ 5] .livepatch.funcs PROGBITS 48 34 00 WA 0 0 4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH V3 1/3] Xen: Increase hap/shadow page pool size to support more vcpus support
On Tue, Sep 19, 2017 at 11:06:26AM +0800, Lan Tianyu wrote: > Hi Wei: > > On 2017年09月18日 21:06, Wei Liu wrote: > > On Wed, Sep 13, 2017 at 12:52:47AM -0400, Lan Tianyu wrote: > >> This patch is to increase page pool size when max vcpu number is larger > >> than 128. > >> > >> Signed-off-by: Lan Tianyu > >> --- > >> xen/arch/arm/domain.c| 5 + > >> xen/arch/x86/domain.c| 25 + > >> xen/common/domctl.c | 3 +++ > >> xen/include/xen/domain.h | 2 ++ > >> 4 files changed, 35 insertions(+) > >> > >> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > >> index 6512f01..94cf70b 100644 > >> --- a/xen/arch/arm/domain.c > >> +++ b/xen/arch/arm/domain.c > >> @@ -824,6 +824,11 @@ int arch_vcpu_reset(struct vcpu *v) > >> return 0; > >> } > >> > >> +int arch_domain_set_max_vcpus(struct domain *d) > >> +{ > >> +return 0; > >> +} > >> + > >> static int relinquish_memory(struct domain *d, struct page_list_head > >> *list) > >> { > >> struct page_info *page, *tmp; > >> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > >> index dbddc53..0e230f9 100644 > >> --- a/xen/arch/x86/domain.c > >> +++ b/xen/arch/x86/domain.c > >> @@ -1161,6 +1161,31 @@ int arch_vcpu_reset(struct vcpu *v) > >> return 0; > >> } > >> > >> +int arch_domain_set_max_vcpus(struct domain *d) > > > > The name doesn't match what the function does. > > > > I originally hoped to introduce a hook for each arch when set max vcpus. > Each arch function can do customized thing and so named > "arch_domain_set_max_vcpus". > > How about "arch_domain_setup_vcpus_resource"? Before you go away and do a lot of work, please let us think about if this is the right approach first. We are close to freeze, with the amount of patches we receive everyday RFC patch like this one is low on my (can't speak for others) priority list. I am not sure when I will be able to get back to this, but do ping us if you want to know where things stand. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [ovmf test] 113621: all pass - PUSHED
flight 113621 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/113621/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8 baseline version: ovmf 424a5ec33b3d5a842bff3f4695d0bd709c91a163 Last test of basis 113608 2017-09-19 16:50:44 Z0 days Testing same since 113621 2017-09-20 06:49:55 Z0 days1 attempts People who touched revisions under test: Jiewen Yao jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=ovmf + revision=70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:. PERLLIB=.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8 + branch=ovmf + revision=70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:.:. PERLLIB=.:.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig +++ export PERLLIB=.:.:.:. +++ PERLLIB=.:.:.:. ++ umask 002 + select_xenbranch + case "$branch" in + tree=ovmf + xenbranch=xen-unstable + '[' xovmf = xlinux ']' + linuxbranch= + '[' x = x ']' + qemuubranch=qemu-upstream-unstable + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable + prevxenbranch=xen-4.9-testing + '[' x70dc3ec5a72e0e3fc3ea8f63baecdeafd1110db8 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git ++
Re: [Xen-devel] Booting signed xen.efi through shim
On Wed, Sep 20, 2017 at 12:30 AM, Jan Beulich wrote: On 20.09.17 at 00:23, wrote: >> On Mon, Sep 18, 2017 at 2:58 AM, Jan Beulich wrote: >> On 14.09.17 at 18:20, wrote: Of course, you can grab them from here: https://drive.google.com/drive/folders/0B5duyI9SzNtWaXE0cjM1QzZJbVk?usp=shar ing >>> >>> So the dumps of the two (using my own tool) are identical except for >>> the expected difference due to the certificate. In particular neither >>> image has any strange relocation types afaics, and both have the >>> sort of unexpected, but also supposedly benign >>> IMAGE_SCN_LNK_NRELOC_OVFL flag set for .bss. Hence I'm afraid ... >>> I've verified that xen-signed.efi boots with Secureboot enabled when booted directly but doesn't boot through the shim. >>> >>> ... you'll need to do some debugging in order to figure out what's >>> going on here. With the above the prime suspect is the shim though, >>> fiddling with the image after loading it into memory. So perhaps >>> dumping the .reloc section contents in order to compare it with >>> what's in the image may be a suitable approach. >> >> Yeap, the shim pretty simply removed the .reloc section as it was >> marked discardable and did the relocations for Xen. So with that >> removed from the shim I no longer get the error and I see that the >> dom0 kernel gets verified using the shim lock protocol. > > So did you instead try whether simply clearing the discardable > flag from the section also helps? The flag ought to matter in > paged environments only (which EFI isn't despite paging being > enabled), but as we see some people think otherwise. Yes, that would work. Even if the shim does its relocations everything "just works" as long as Xen can also find the .reloc section. For now it was just simpler for me to patch the shim to copy the reloc section but it would be ideal if the xen.efi that's produced during compilation would not have that flag set to begin with. I did search around briefly to see where that flag is coming from but the only reference to it within Xen I found was arch/x86/efi/mkreloc.c. So sans writing a separate tool that binary patches xen.efi after compilation to remove that flag I'm not sure how to get that done. > >> I still didn't >> get dom0 to boot for some reason but that might be an unrelated issue >> (and I have no serial console right now). Nevertheless, progress! > > And it doesn't get far enough for you to see any output at all? > Did you try "earlyprintk=xen" on the kernel command line and/or > "vga=keep" on the Xen one? I tried with both just now, no output at all from Xen on the screen after it exits EFI boot. I also couldn't get any output from it on my other laptop with Intel AMT. I did manage to get another Linux kernel booting but my goal was to get a shim verified dom0 kernel booting without an initrd image as right now the ramdisk is not verified by the shim (also not sure how that's supposed to work as sbsign/pesign can only deal with PE/COFF binaries which the ramdisk isn't). Tamas ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 10/22] xentoolcore_restrict_all: Implement for libxenforeignmemory
On Tue, Sep 19, 2017 at 12:08:27PM +0100, Ian Jackson wrote: > Wei Liu writes ("Re: [PATCH 10/22] xentoolcore_restrict_all: Implement for > libxenforeignmemory"): > > Sure that's fine. > > I have this now: > > * If called again with the same domid, it may succeed, or it may > * fail (even though such a call is potentially meaningful). > * (If called again with a different domid, it will necessariloy > * fail.) Typo "necessariloy". ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 03/22] xentoolcore, _restrict_all: Introduce new library and implementation
On Tue, Sep 19, 2017 at 12:04:42PM +0100, Ian Jackson wrote: > Wei Liu writes ("Re: [PATCH 03/22] xentoolcore, _restrict_all: Introduce new > library and implementation"): > > The impression I get from the name and parameter from this API is that > > the following use case is allowed: a device model serving multiple > > domains. > > Such a device model is obviously, by necessity, unrestricted. > > > The device model will open two sets of handlers of various > > libraries. The device model will call restrict_all(domid) to restrict > > its own privileges on certain domid when it sees fit. > > After it has restricted its privileges to domid A, it is no longer > permitted to do things to domid B. Attempting to call restrict_all(B) > will fail. > > > Without filtering, the callbacks are called for all the domains at the > > same time. The code as-is, when resctrict_all(dom1) is called, makes > > privileges on dom2 are also dropped sometimes -- imagine a xenstore > > callback registered for dom2 is called, which makes the connection > > unusable for dom2. > > > > If the aforementioned use case is not anticipated, I think we shouldn't > > accept domid parameter for the resctrict_all function. > > But the domid is precisely the scope of the intended restriction. > After making the call, the malign influence of the calling process is > limited to the specified domid (at least, insofar as the malign > influence is exercised via already-open Xen library handles). > Ah, I know where I got myself confused. You can have my Ack on this patch. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 22/22] RFC: tools: xentoolcore_restrict_all: use domid_t
On Tue, Sep 19, 2017 at 12:01:15PM +0100, Ian Jackson wrote: > Wei Liu writes ("Re: [PATCH 22/22] RFC: tools: xentoolcore_restrict_all: use > domid_t"): > > On Fri, Sep 15, 2017 at 07:48:59PM +0100, Ian Jackson wrote: > > > This is an RFC because it does not currently compile, because not all > > > the places that use xentoolcore have a definition of domid in scope! > ... > > If the places you mentioned are in xen.git we should probably fix them. > > If they are external users like QEMU then we probably can't do much. > > They are in xen.git. The difficulty is that they don't currently > include the right headers. Also, in general, much of the code in > xen.git doesn't always use domid_t. > > (Partly because domid_t is a weird short type.) I would very much like to have domid_t in the toolcore API, just to be consistent with xendevicemodel APIs. I think we should fix xen.git -- including the right header in the places which use toolcore APIs doesn't sound too hard for me. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel