[Xen-devel] [ovmf test] 135451: regressions - FAIL
flight 135451 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/135451/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemuu-ovmf-amd64 21 leak-check/check fail REGR. vs. 135318 version targeted for testing: ovmf 727d7ebaa9f3dab8822d264fbc8104aee8f08867 baseline version: ovmf 20029ca22baaeb9418c1fd9df88d12d32d585cb6 Last test of basis 135318 2019-04-26 10:41:23 Z5 days Failing since135371 2019-04-28 00:41:24 Z4 days3 attempts Testing same since 135451 2019-05-01 06:46:37 Z1 days1 attempts People who touched revisions under test: "Tien Hock, Loh" Anthony PERARD Ard Biesheuvel Bob Feng Bret Barkelew Dandan Bi Fan, ZhijuX Feng, Bob C Hao Wu Igor Druzhinin Laszlo Ersek Marcin Wojtas Michael D Kinney Shenglei Zhang Tien Hock, Loh Wang Fan Wang, Fan Xue ShengfengX Xue, ShengfengX Zhichao Gao Zhiju.Fan jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 fail test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 1773 lines long.) ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v3 2/4] x86/mem_sharing: introduce and use page_lock_memshr instead of page_lock
>>> On 30.04.19 at 18:03, wrote: > On 4/30/19 4:06 PM, Jan Beulich wrote: > On 30.04.19 at 16:43, wrote: >>> On 4/30/19 9:44 AM, Jan Beulich wrote: >>> On 30.04.19 at 10:28, wrote: > On Tue, Apr 30, 2019 at 1:15 AM Jan Beulich wrote: >> I've outlined a solution already: Make a mem-sharing private variant >> of page_{,un}lock(), derived from the PV ones (but with pieces >> dropped you don't want/need). > > Well, that's what I already did here in this patch. No? No - you've retained a shared _page_{,un}lock(), whereas my suggestion was to have a completely independent pair of functions in mem_sharing.c. The only thing needed by both PV and HVM would then be the PGT_locked flag. >>> >>> But it wasn't obvious to me how the implementations of the actual lock >>> function would be be different. And there's no point in having two >>> identical implementations; in fact, it would be harmful. >> >> The main difference would be the one that Tamas is after - not >> doing the checking that we do for PV. Whether other bits could >> be dropped for a mem-sharing special variant I don't know (yet). > > The "checking" being that the type count doesn't go to 0? > > It's not just page_lock() that does that checking; it's also > _put_page_type(). We can't really change one but leave the other alone. No, I mean the extra debug checking (current_locked_page_*()). See his patch as to what he keeps for mem-sharing, and what he drops. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [freebsd-master test] 135317: regressions - FAIL
On Wed, May 01, 2019 at 11:50:44AM +0100, Ian Jackson wrote: > osstest service owner writes ("[freebsd-master test] 135317: regressions - > FAIL"): > > Tests which did not succeed and are blocking, > > including tests which could not be run: > > build-amd64-xen-freebsd 5 host-install(5) fail REGR. vs. > > 135233 > > I guess this must be a host-specific FreeBSD kernel bug ? Roger, are > you investigating ? Hm, I'm not sure I follow why this is host-specific. It has happened on both fiano1 and godello1. AFAICT this is a regression in the FreeBSD kernel. Do you know if osstest has started a bisection of this? I'm not seeing anything on the summary page. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: skip first page when RAM starts at 0x0
>>> On 02.05.19 at 00:44, wrote: > Hi Jan, I have a question on the PDX code. > > The PDX initialization is a bit different between x86 and ARM, but it > follows roughly the same pattern, look at > xen/arch/x86/srat.c:srat_parse_regions (I take that is where things > happen on x86) and xen/arch/arm/setup.c:init_pdx. > > Mask is initialized calling pdx_init_mask on a start address, then a > loop fills in the rest of the mask calling pdx_region_mask, based on the > memory regions present. > > I wrote a small unit test of the ARM PDX initialization and while I was > playing with addresses and values I noticed that actually if I simply > skip pdx_init_mask and just initialize the mask to 0 (mask = 0) in > init_pdx, the rest of the function always calculates the right mask. > > In fact, there are cases where initializing the mask to a value causes > the rest of the code to miss some potential compressions. While > initializing the mask to 0 leads to more optimizations. I can provide > specific examples if you are curious. > > Before I make any changes to that code, I would like to understand a bit > better why things are done that way today. Do you know why the mask is > initialized to pdx_init_mask(start-of-ram)? I'm confused, and hence I'm perhaps misunderstanding your question. To me it looks like you're re-asking a question already answered. On x86 we don't want to squash out any of the low 32 bits, because of the special address ranges that live below 4Gb. Hence we invoke pdx_init_mask(first-block-at-or-above-4Gb). Note it's not start-of-ram, as you've said. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 1/4] xen: add hypercall for getting parameters at runtime
>>> On 30.04.19 at 21:05, wrote: > On Fri, 2019-04-05 at 17:01 +0200, Jan Beulich wrote:On 22.03.19 at > 20:28, wrote: >> > Limitations: >> > - Custom runtime parameters (OPT_CUSTOM) are not supported yet. >> > - For integer parameters (OPT_UINT), only unsigned parameters are >> > printed >> > correctly. >> >> For this latter case I wonder whether it wouldn't be better to >> return back the raw binary value, allowing the caller to format >> it in suitable ways (e.g. both signed and unsigned, or dec and >> hex). > > Currently the caller is oblivious to the parameters and their types, > and all retrieved values are printed together in a single string (see > patch 4/4). I'd like to keep it like that for simplicity. Otherwise I > think the caller has to find the types of requested parameters to > produce the right formatting. Actually, since OPT_UINT is used for both > signed and unsigned integer parameters, case-by-case parameter > formatting would be required to do this, and the libx* callers do not > have that knowledge. A "get_" per-parameter function pointer, which > would handle formatting for each parameter, be it int, uint, string or > custom, seems cleaner to me than leaving it to the caller. I disagree. The caller requires no knowledge of the actual format. It could easily format the string twice (once assuming it might be a signed quantity, and another time assuming it might be an unsigned one). All it would need to know is the width of the binary representation. > By the way, the current implementation searches in [__param_start > __param_end), which are only the runtime parameters, not all > parameters, correct? So the series should be renamed to "Support for > reading runtime-only hypervisor parameters". The command is useful for > checking parameters that can be changed at runtime after all. Yes, such renaming would seem desirable. > I believe there are currently no signed integer runtime parameters. So > alternatively we could add a warning to the commit message and/or to > the code stating that signed integer runtime parameters, if added, > would not be printed correctly at the moment. This would gloss over > rather than solve the issue. That's an option. As for other aspects, I don't heavily mind cases not getting dealt with right away which have no practical use right now, as long as the restrictions are clearly spelled out, such that no-one will be surprised when trying to add a runtime option beyond what the code is able to deal with. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [xen-4.9-testing bisection] complete build-i386
branch xen-4.9-testing xenbranch xen-4.9-testing job build-i386 testid xen-build Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Bug introduced: 20029ca22baaeb9418c1fd9df88d12d32d585cb6 Bug not present: 5920a9d16b1ab887c2858224316a98e961d71b05 Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/135531/ (Revision log too long, omitted.) For bisection revision-tuple graph see: http://logs.test-lab.xenproject.org/osstest/results/bisect/xen-4.9-testing/build-i386.xen-build.html Revision IDs in each graph node refer, respectively, to the Trees above. Running cs-bisection-step --graph-out=/home/logs/results/bisect/xen-4.9-testing/build-i386.xen-build --summary-out=tmp/135531.bisection-summary --basis-template=132889 --blessings=real,real-bisect xen-4.9-testing build-i386 xen-build Searching for failure / basis pass: 135421 fail [host=albana0] / 135185 [host=baroque0] 135037 [host=albana1] 134971 [host=albana1] 134855 ok. Failure / basis pass flights: 135421 / 134855 (tree with no url: minios) (tree with no url: seabios) Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git Latest 20029ca22baaeb9418c1fd9df88d12d32d585cb6 8051789e982499050680a26febeada7467e18a8d aad23066e4b27296d219b9123393fbe2a5a885bb f72414a56fecd8db2963a2dfe4409e27a479992e Basis pass 5920a9d16b1ab887c2858224316a98e961d71b05 8051789e982499050680a26febeada7467e18a8d aad23066e4b27296d219b9123393fbe2a5a885bb f72414a56fecd8db2963a2dfe4409e27a479992e Generating revisions with ./adhoc-revtuple-generator git://xenbits.xen.org/osstest/ovmf.git#5920a9d16b1ab887c2858224316a98e961d71b05-20029ca22baaeb9418c1fd9df88d12d32d585cb6 git://xenbits.xen.org/qemu-xen-traditional.git#8051789e982499050680a26febeada7467e18a8d-8051789e982499050680a26febeada7467e18a8d git://xenbits.xen.org/qemu-xen.git#aad23066e4b27296d219b9123393fbe2a5a885bb-aad23066e4b27296d219b9123393fbe2a5a885bb git://xenbits.xen.org/xen.git#f72414a56fecd8db2963a2dfe4409e27a479992e-f72414a5\ 6fecd8db2963a2dfe4409e27a479992e adhoc-revtuple-generator: tree discontiguous: ovmf Loaded 2 nodes in revision graph Searching for test results: 117931 [host=baroque1] 118167 [host=huxelrebe0] 118222 [host=huxelrebe1] 118347 [host=huxelrebe1] 118387 [host=huxelrebe0] 118417 [host=italia0] 118416 [host=italia0] 118438 [host=huxelrebe0] 118524 [host=huxelrebe0] 118452 [host=fiano0] 118487 [host=huxelrebe1] 118683 [host=pinot0] 118658 [host=huxelrebe1] 118784 [host=italia0] 119776 [host=italia0] 119839 [host=huxelrebe1] 119954 [host=rimava0] 11 [host=rimava0] 120018 [host=huxelrebe1] 12 [host=huxelrebe1] 120063 [host=huxelrebe1] 120105 [host=huxelrebe1] 120144 [host=fiano0] 120221 [host=rimava0] 120320 [host=italia0] 120239 [host=pinot0] 120336 [host=chardonnay1] 120385 [host=italia0] 120538 [host=italia0] 120803 [host=pinot0] 120877 [host=elbling1] 121015 [host=italia0] 120957 [host=elbling1] 121331 [host=fiano0] 121356 [host=italia1] 121460 [host=italia0] 121426 [host=italia0] 121704 [host=huxelrebe1] 121728 [host=rimava0] 121761 [host=huxelrebe1] 122355 [host=huxelrebe1] 122387 [host=huxelrebe0] 122417 [host=fiano0] 122472 [host=huxelrebe0] 122512 [host=italia1] 122659 [host=chardonnay1] 122770 [host=chardonnay0] 122775 [host=elbling1] 122816 [host=baroque1] 122785 [host=baroque1] 122876 [host=baroque1] 122960 [host=chardonnay0] 123009 [host=elbling1] 123123 [host=baroque1] 123122 [host=debina1] 123343 [host=debina1] 123473 [host=elbling1] 123543 [host=pinot1] 123567 [host=elbling1] 123546 [host=huxelrebe0] 123572 [host=debina0] 123586 [host=debina0] 123561 [host=baroque0] 123563 [host=elbling1] 123590 [host=italia1] 123676 [host=italia1] 123801 [host=fiano1] 123835 [host=italia0] 123939 [host=elbling0] 124009 [host=huxelrebe1] 124043 [host=huxelrebe0] 124180 [host=italia0] 124206 pass irrelevant 124248 [host=debina1] 124375 pass irrelevant 124329 [host=debina0] 124328 [host=debina0] 124950 [host=huxelrebe0] 124902 [host=debina1] 125077 [host=joubertin1] 125044 [host=albana1] 125173 pass irrelevant 125171 [host=joubertin1] 125144 [host=debina1] 125253 pass irrelevant 125292 [host=albana1] 125416 [host=albana1] 125529 [host=fiano0] 125605 pass irrelevant 125570 [host=debina1] 125642 [host=debina1] 125650 [host=joubertin0] 125663 [host=debina1] 125686 [host=albana1] 125901 pass irrelevant 125922 [host=albana1] 126077 pass irrelevant 126201 [host=baroque1] 126160 pass irrelevant 126075 [host=baroque1] 126296 [host=huxelrebe1
Re: [Xen-devel] [RFC PATCH 2/7] x86/boot: Remove gratuitous call back into low-memory code
>>> On 01.05.19 at 13:17, wrote: > We appear to have implemented a memcpy() in the low-memory trampoline > which we then call into from __start_xen(), for no adequately defined > reason. May I suggest that in cases like this you look at the commit introducing the function? It supplies a reason: "Add a new raw e820 table for common purpose and copy the BIOS buffer to it. Doing the copying in assembly avoids the need to export the symbols for the BIOS E820 buffer and number of entries." I have to admit that I see value in this, as it avoids introduction of wrong accesses to these variables. Therefore I'd like to ask for better justification of the change. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [OSSTEST PATCH] Drop Xen 4.5 and earlier
>>> On 01.05.19 at 12:46, wrote: > These releases are out of security support. They are known not to > build on Debian stretch, which is what we are using, and we do not > intend to ever update them to fix that. > > Xen 4.6 is also out of security support but we want osstest to be able > to continue to build it so that we can test 4.6->4.7 migration, for > the purposes of testing Xen 4.7, which is still supported right now. > > So we have recently applied some build fixes to the 4.6 tree, and for > now we retain 4.6 in osstest so that build fixes applied to > staging-4.6 can propagate to stable-4.6. > > CC: committ...@xenproject.org > Signed-off-by: Ian Jackson In case it matters Acked-by: Jan Beulich However, ... > --- a/cr-for-branches > +++ b/cr-for-branches > @@ -31,7 +31,7 @@ scriptoptions="$1"; shift > LOGFILE=tmp/cr-for-branches.log > export LOGFILE > > -: ${BRANCHES:=osstest xen-4.0-testing xen-4.1-testing xen-4.2-testing > xen-4.3-testing xen-4.4-testing xen-4.5-testing xen-4.6-testing > xen-4.7-testing > xen-4.8-testing xen-4.9-testing xen-4.10-testing xen-4.11-testing > xen-4.12-testing > xen-unstable qemu-mainline qemu-upstream-unstable qemu-upstream-4.2-testing > qemu-upstream-4.3-testing qemu-upstream-4.4-testing qemu-upstream-4.5-testing > qemu-upstream-4.6-testing qemu-upstream-4.7-testing qemu-upstream-4.8-testing > qemu-upstream-4.9-testing qemu-upstream-4.10-testing > qemu-upstream-4.11-testing > qemu-upstream-4.12-testing linux-linus linux-4.19 linux-4.14 linux-4.9 > linux-4.4 > linux-4.1 linux-3.18 linux-3.16 linux-3.14 linux-3.10 linux-3.4 linux-arm-xen > seabios > ovmf xtf ${EXTRA_BRANCHES}} > +: ${BRANCHES:=osstest xen-4.6-testing xen-4.7-testing xen-4.8-testing > xen-4.9-testing xen-4.10-testing xen-4.11-testing xen-4.12-testing > xen-unstable > qemu-mainline qemu-upstream-unstable qemu-upstream-4.6-testing > qemu-upstream-4.7-testing qemu-upstream-4.8-testing qemu-upstream-4.9-testing > qemu-upstream-4.10-testing qemu-upstream-4.11-testing > qemu-upstream-4.12-testing > linux-linus linux-4.19 linux-4.14 linux-4.9 linux-4.4 linux-4.1 linux-3.18 > linux-3.16 linux-3.14 linux-3.10 linux-3.4 linux-arm-xen seabios ovmf xtf > ${EXTRA_BRANCHES}} ... aren't the older Linux branches then in similar need of pruning? Seeing that for 32-bit Arm you can't even build mainline Linux with the newer compiler, I doubt at least the pretty old ones here have any chance of building cleanly. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/pt: skip setup of posted format IRTE when gvec is 0
On Wed, May 01, 2019 at 12:41:13AM +0800, Chao Gao wrote: > On Tue, Apr 30, 2019 at 11:30:33AM +0200, Roger Pau Monné wrote: > >On Tue, Apr 30, 2019 at 05:01:21PM +0800, Chao Gao wrote: > >> On Tue, Apr 30, 2019 at 01:56:31AM -0600, Jan Beulich wrote: > >> On 30.04.19 at 07:19, wrote: > >> >> When testing with an UP guest with a pass-thru device with vt-d pi > >> >> enabled in host, we observed that guest couldn't receive interrupts > >> >> from that pass-thru device. Dumping IRTE, we found the corresponding > >> >> IRTE is set to posted format with "vector" field as 0. > >> >> > >> >> We would fall into this issue when guest used the pirq format of MSI > >> >> (see the comment xen_msi_compose_msg() in linux kernel). As 'dest_id' > >> >> is repurposed, skip migration which is based on 'dest_id'. > >> > > >> >I've gone through all uses of gvec, and I couldn't find any existing > >> >special casing of it being zero. I assume this is actually communication > >> >between the kernel and qemu, > >> > >> Yes. > >> > >> >in which case I'd like to see an > >> >explanation of why the issue needs to be addressed in Xen rather > >> >than qemu. > >> > >> To call pirq_guest_bind() to configure irq_desc properly. > >> Especially, we append a pointer of struct domain to 'action->guest' in > >> pirq_guest_bind(). Then __do_IRQ_guest() knows domains that are interested > >> in this interrupt and injects an interrupt to those domains. > >> > >> >Otherwise, if I've overlooked something, would you > >> >mind pointing out where such special casing lives in Xen? > >> > > >> >In any event it doesn't look correct to skip migration altogether in > >> >that case. I'd rather expect it to require getting done differently. > >> >After all there still is a (CPU, vector) tuple associated with that > >> >{,p}IRQ if it's not posted, and hvm_migrate_pirq() is a no-op if it is > >> >posted. > >> > >> Here, we try to set irq's target cpu to the cpu which the vmsi's target > >> vcpu > >> is running on to reduce IPI. But the 'dest_id' field which used to > >> indicate the vmsi's target vcpu is missing, we don't know which cpu we > >> should > >> migrate the irq to. One possible choice is the 'chn->notify_vcpu_id' > >> used in send_guest_pirq(). Do you think this choice is fine? > > > >I think that by the time the device model calls into pirq_guest_bind > >the PIRQ won't be bound to any event channel, so pirq->evtchn would be > >0. > > Then skip pirq migration is the only choice here? And we can migrate > pirq when it is bound with an event channel. > > > > >Note that the binding of the PIRQ with the event channel is done > >afterwards in xen_hvm_setup_msi_irqs by the Linux kernel. > > > >It seems like the device model should be using a different set of > >hypercalls to setup a PIRQ that is routed over an event channel, ie: > >PHYSDEVOP_map_pirq and friends. > > Now qemu is using PHYSDEVOP_map_pirq. Right? Oh yes, QEMU already uses PHYSDEVOP_map_pirq to setup the interrupt. Then I'm not sure I see why QEMU calls XEN_DOMCTL_bind_pt_irq for interrupts that are routed over event channels. That hypercall is used to bind a pirq to a native guest interrupt injection mechanism, which shouldn't be used if the interrupt is going to be delivered over an event channel. Can you see about avoiding the XEN_DOMCTL_bind_pt_irq call in QEMU if the interrupt is going to be routed over an event channel? That would avoid having to add more quirks to the hypercall implementation. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
On 5/2/19 2:57 AM, Tamas K Lengyel wrote: Receiving this register is useful for introspecting 32-bit Windows when the event being trapped happened while in ring3. Signed-off-by: Tamas K Lengyel Cc: Razvan Cojocaru Cc: Tamas K Lengyel Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu Cc: Roger Pau Monne --- xen/arch/x86/vm_event.c | 5 + xen/include/public/vm_event.h | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c index 51c3493b1d..873788e32f 100644 --- a/xen/arch/x86/vm_event.c +++ b/xen/arch/x86/vm_event.c @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum x86_segment segment, reg->es_sel = seg.sel; break; +case x86_seg_gdtr: +reg->gdtr_base = seg.base; +break; Please also add limit, ar, sel, like the others do. In addition to making this modification look less strange (since, in contrast to the function name, nothing is packed for gdtr_base), it will also save future work for applications wanting to use gdtr which also require backwards compatibility with previous vm_event versions. As you know, for each such modification we need to have a separate vm_event_vX header in our applications so that we're ready to create a ring buffer using requests and replies of the right size, and also to be able to properly interpret the ring buffer data, so the least frequent changes to the vm_event struct, the better. Petre is currently working on the vm_event changes that will hopefully enable us to just cache everything that the getcontext_partial hypercall retrieves. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/pt: skip setup of posted format IRTE when gvec is 0
On Tue, Apr 30, 2019 at 05:48:14PM +0100, Andrew Cooper wrote: > On 30/04/2019 17:24, Chao Gao wrote: > > On Tue, Apr 30, 2019 at 11:08:54AM +0200, Roger Pau Monné wrote: > >> On Tue, Apr 30, 2019 at 01:19:19PM +0800, Chao Gao wrote: > >>> When testing with an UP guest with a pass-thru device with vt-d pi > >>> enabled in host, we observed that guest couldn't receive interrupts > >>> from that pass-thru device. Dumping IRTE, we found the corresponding > >>> IRTE is set to posted format with "vector" field as 0. > >>> > >>> We would fall into this issue when guest used the pirq format of MSI > >> I think the above sentence is a bit confusing. I would rather say that > >> the guest is routing interrupts from passthrough devices over event > >> channels using pirqs. > > Yes. It is better than it was. > > > >>> (see the comment xen_msi_compose_msg() in linux kernel). As 'dest_id' > >>> is repurposed, skip migration which is based on 'dest_id'. > >> Like Jan, I wonder why the device model (I assume QEMU) issues a > >> XEN_DOMCTL_bind_pt_irq hypercall when the guest is attempting to route > >> this interrupts over an event channel, attempting to bind it to a > >> vector seems like a bug in the device model. > >> > >> I would also consider whether it would make sense to not expose the > >> XENFEAT_hvm_pirqs feature when doing passthrough if posted interrupts > >> are supported, performance wise it's better to use posted interrupts > >> rather than routing them over event channels (which requires Xen > >> interaction). > > It is reasonable. But I think currently guest can add "xen_nopv" to its > > kernel boot options to avoid using evtchn. There might be some cases > > even with pass-thru devices (such as, guest uses polling mode > > driver for pass-thru devices), we care more about the efficiency of > > delivering virtual interrupts. So a separate option for evtchn in guest > > kernel seems like a better choice. > > This is a festering swamp, and is going to need some careful > architectural work to untangle. > > At the moment, guests which are xen-aware will try to use event channels > for everything. I'm pretty sure we've got some ABI incompatibilities > here with hardware APIC acceleration, and I still have yet(!) to > diagnose the underlying problem which is preventing XenServer from > enabling hardware APIC acceleration. (VMs still intermittently wedge on > migrate with what appears to be lost interrupt.) My opinion is that XENFEAT_hvm_pirqs should be removed: - It abuses of a hardware interface to pass Xen specific information. When routing pirqs over event channels for HVM the pirq is passed on the MSI address field, hijacking the native dest_id field. - Cannot be used with posted interrupts or APICV. Iff routing physical interrupts over event channels is still a useful thing for HVM, it should be done using hypercalls, like it's done on a PV dom0. > > The legacy HVM callback via single vector is incompatible with hardware > APIC acceleration, because it exists specifically to skip going through > the IRR. I can't think a proper solution to this, other than crashing > the guest when it tries to set such a situation up. > > From an "architecturally ideal" point of view, we obviously want to be > using APICV and/or Posted interrupt support whenever possible, and be > using these methods in preference to event channels. > > Injecting interrupts (including signalling an event upcall) using the > PIR is more efficient than other means, so long as we can ensure that > guest will handle the vector like any other edge triggered interrupt, > and ack it at the LAPIC. (This is where the legacy callback via method > breaks down, as it was intentionally designed to be used without an ack > at the LAPIC.) IIRC there's already a correct way to setup an event channel upcall against a vector using HVMOP_set_evtchn_upcall_vector which requires an APIC ack. Now the issue is switching existing users to this new callback setup mechanism. > As soon as we get to device interrupts, posted is definitely the way to > go, especially as the guest should not be aware of the routing behind > the scenes in the first place. > > To be clear, Chao - I'm not asking you to fix this. This is a decade of > technical debt which has been steadily growing, and it needs to start > with a coherent plan what the current state of play is, and how we'd > like to proceed. I agree. I've suggested a possible fix for the problem at hand, but as stated above I think XENFEAT_hvm_pirqs should be removed. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] x86/vmx: correctly gather gs_shadow value for current vCPU
>>> On 02.05.19 at 08:20, wrote: > On 5/2/19 2:52 AM, Tamas K Lengyel wrote: >> --- a/xen/arch/x86/hvm/vmx/vmx.c >> +++ b/xen/arch/x86/hvm/vmx/vmx.c >> @@ -779,12 +779,17 @@ static void vmx_load_cpu_state(struct vcpu *v, struct >> hvm_hw_cpu *data) >> >> static void vmx_save_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) >> { >> +if ( v == current ) >> +vmx_save_guest_msrs(v); > > vmx_save_guest_msrs() is simple enough that the if is probably not > necessary here (we can just call the function directly, regardless of > what v holds). Avoiding an MSR access is perhaps worth the conditional. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] x86/vmx: correctly gather gs_shadow value for current vCPU
On 5/2/19 11:36 AM, Jan Beulich wrote: On 02.05.19 at 08:20, wrote: On 5/2/19 2:52 AM, Tamas K Lengyel wrote: --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -779,12 +779,17 @@ static void vmx_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data) static void vmx_save_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) { +if ( v == current ) +vmx_save_guest_msrs(v); vmx_save_guest_msrs() is simple enough that the if is probably not necessary here (we can just call the function directly, regardless of what v holds). Avoiding an MSR access is perhaps worth the conditional. Fair enough. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] x86/vmx: correctly gather gs_shadow value for current vCPU
On 02/05/2019 07:20, Razvan Cojocaru wrote: > On 5/2/19 2:52 AM, Tamas K Lengyel wrote: >> Currently the gs_shadow value is only cached when the vCPU is being scheduled >> out by Xen. Reporting this (usually) stale value through vm_event is >> incorrect, >> since it doesn't represent the actual state of the vCPU at the time the event >> was recorded. This prevents vm_event subscribers from correctly finding >> kernel >> structures in the guest when it is trapped while in ring3. >> >> Refresh shadow_gs value when the context being saved is for the current vCPU. >> >> Signed-off-by: Tamas K Lengyel >> Cc: Razvan Cojocaru >> Cc: Jun Nakajima >> Cc: Kevin Tian >> Cc: Jan Beulich >> Cc: Andrew Cooper >> Cc: Wei Liu >> Cc: Roger Pau Monne >> --- >> v2: move fix to hvm so vm_event doesn't have to know specifics >> --- >> xen/arch/x86/hvm/vmx/vmx.c | 5 + >> 1 file changed, 5 insertions(+) >> >> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c >> index 283eb7b34d..5154ecc2a8 100644 >> --- a/xen/arch/x86/hvm/vmx/vmx.c >> +++ b/xen/arch/x86/hvm/vmx/vmx.c >> @@ -779,12 +779,17 @@ static void vmx_load_cpu_state(struct vcpu *v, struct >> hvm_hw_cpu *data) >> >> static void vmx_save_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) >> { >> +if ( v == current ) >> +vmx_save_guest_msrs(v); > vmx_save_guest_msrs() is simple enough that the if is probably not > necessary here (we can just call the function directly, regardless of > what v holds). Why? Doing that would fully corrupt v's state when called in remote context, as you'd clobber v's gs_shadow which whatever the value was from the current vcpu. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] x86/vmx: correctly gather gs_shadow value for current vCPU
On 5/2/19 11:45 AM, Andrew Cooper wrote: On 02/05/2019 07:20, Razvan Cojocaru wrote: On 5/2/19 2:52 AM, Tamas K Lengyel wrote: Currently the gs_shadow value is only cached when the vCPU is being scheduled out by Xen. Reporting this (usually) stale value through vm_event is incorrect, since it doesn't represent the actual state of the vCPU at the time the event was recorded. This prevents vm_event subscribers from correctly finding kernel structures in the guest when it is trapped while in ring3. Refresh shadow_gs value when the context being saved is for the current vCPU. Signed-off-by: Tamas K Lengyel Cc: Razvan Cojocaru Cc: Jun Nakajima Cc: Kevin Tian Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu Cc: Roger Pau Monne --- v2: move fix to hvm so vm_event doesn't have to know specifics --- xen/arch/x86/hvm/vmx/vmx.c | 5 + 1 file changed, 5 insertions(+) diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 283eb7b34d..5154ecc2a8 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -779,12 +779,17 @@ static void vmx_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data) static void vmx_save_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) { +if ( v == current ) +vmx_save_guest_msrs(v); vmx_save_guest_msrs() is simple enough that the if is probably not necessary here (we can just call the function directly, regardless of what v holds). Why? Doing that would fully corrupt v's state when called in remote context, as you'd clobber v's gs_shadow which whatever the value was from the current vcpu. Good point, I've missed that. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
On 02/05/2019 09:25, Razvan Cojocaru wrote: > On 5/2/19 2:57 AM, Tamas K Lengyel wrote: >> Receiving this register is useful for introspecting 32-bit Windows >> when the >> event being trapped happened while in ring3. >> >> Signed-off-by: Tamas K Lengyel >> Cc: Razvan Cojocaru >> Cc: Tamas K Lengyel >> Cc: Jan Beulich >> Cc: Andrew Cooper >> Cc: Wei Liu >> Cc: Roger Pau Monne >> --- >> xen/arch/x86/vm_event.c | 5 + >> xen/include/public/vm_event.h | 3 ++- >> 2 files changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c >> index 51c3493b1d..873788e32f 100644 >> --- a/xen/arch/x86/vm_event.c >> +++ b/xen/arch/x86/vm_event.c >> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum >> x86_segment segment, >> reg->es_sel = seg.sel; >> break; >> + case x86_seg_gdtr: >> + reg->gdtr_base = seg.base; >> + break; > > Please also add limit, ar, sel, like the others do. In Xen, we model GDTR/IDTR just like all other segments in the segment cache for convenience/consistency, including faking up of some default attributes. However, there is no such thing as a selector or access rights for them, and the VMCS lacks the fields, while the VMCB marks the fields as reserved. It is almost certainly not worth wasting the space in the ring. ~Andrew P.S. If you consult the state-after-reset table in both SDMs, it does state that Access Rights exist, and default to Present and R/W, but this isn't a field which can be changed at any point. I suspect real pipelines model GDTR/LDTR just like the other segments for simplification of the segmentation logic, which is why I chose to do the same in Xen. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
>>> On 02.05.19 at 10:25, wrote: > On 5/2/19 2:57 AM, Tamas K Lengyel wrote: >> Receiving this register is useful for introspecting 32-bit Windows when the >> event being trapped happened while in ring3. >> >> Signed-off-by: Tamas K Lengyel >> Cc: Razvan Cojocaru >> Cc: Tamas K Lengyel >> Cc: Jan Beulich >> Cc: Andrew Cooper >> Cc: Wei Liu >> Cc: Roger Pau Monne >> --- >> xen/arch/x86/vm_event.c | 5 + >> xen/include/public/vm_event.h | 3 ++- >> 2 files changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c >> index 51c3493b1d..873788e32f 100644 >> --- a/xen/arch/x86/vm_event.c >> +++ b/xen/arch/x86/vm_event.c >> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum > x86_segment segment, >> reg->es_sel = seg.sel; >> break; >> >> +case x86_seg_gdtr: >> +reg->gdtr_base = seg.base; >> +break; > > Please also add limit, ar, sel, like the others do. There's no ar or sel for GDT (and IDT). Instead, because of ... > In addition to > making this modification look less strange (since, in contrast to the > function name, nothing is packed for gdtr_base), it will also save > future work for applications wanting to use gdtr which also require > backwards compatibility with previous vm_event versions. > > As you know, for each such modification we need to have a separate > vm_event_vX header in our applications so that we're ready to create a > ring buffer using requests and replies of the right size, and also to be > able to properly interpret the ring buffer data, so the least frequent > changes to the vm_event struct, the better. ... this I wonder whether the IDT details shouldn't get added at the same time. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: skip first page when RAM starts at 0x0
Hi Jan, On 5/2/19 8:30 AM, Jan Beulich wrote: On 02.05.19 at 00:44, wrote: Hi Jan, I have a question on the PDX code. The PDX initialization is a bit different between x86 and ARM, but it follows roughly the same pattern, look at xen/arch/x86/srat.c:srat_parse_regions (I take that is where things happen on x86) and xen/arch/arm/setup.c:init_pdx. Mask is initialized calling pdx_init_mask on a start address, then a loop fills in the rest of the mask calling pdx_region_mask, based on the memory regions present. I wrote a small unit test of the ARM PDX initialization and while I was playing with addresses and values I noticed that actually if I simply skip pdx_init_mask and just initialize the mask to 0 (mask = 0) in init_pdx, the rest of the function always calculates the right mask. In fact, there are cases where initializing the mask to a value causes the rest of the code to miss some potential compressions. While initializing the mask to 0 leads to more optimizations. I can provide specific examples if you are curious. Before I make any changes to that code, I would like to understand a bit better why things are done that way today. Do you know why the mask is initialized to pdx_init_mask(start-of-ram)? Well, it is not the start-of-ram on Arm. It is whatever is the start of bank 0. This is because the firmware table (such as DT) may not require ordering and we don't order banks in Xen. So it may be possible the PDX will not compress if the banks are not ordered in the firmware tables. I'm confused, and hence I'm perhaps misunderstanding your question. To me it looks like you're re-asking a question already answered. On x86 we don't want to squash out any of the low 32 bits, because of the special address ranges that live below 4Gb. Hence we invoke pdx_init_mask(first-block-at-or-above-4Gb). Note it's not start-of-ram, as you've said. I think what Stefano is asking is why pdx_init_mask(...) is invoked with the first block address rather than 4GB (or even 0 thought I don't think this is right). By using the first block address, the PDX will not be able to compress any bits between 0 and the MSB 1' in the first block address. In other word, for a base address 0x2 (8GB), the initial mask will be 0x1. Stefano and I were wondering whether it would instead be possible to create the initial mask with pdx_init_mask(4GB) or pdx_init_mask(1GB) (I.e the maximum contiguous range the buddy allocator can allocate). Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
On 5/2/19 11:52 AM, Andrew Cooper wrote: On 02/05/2019 09:25, Razvan Cojocaru wrote: On 5/2/19 2:57 AM, Tamas K Lengyel wrote: Receiving this register is useful for introspecting 32-bit Windows when the event being trapped happened while in ring3. Signed-off-by: Tamas K Lengyel Cc: Razvan Cojocaru Cc: Tamas K Lengyel Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu Cc: Roger Pau Monne --- xen/arch/x86/vm_event.c | 5 + xen/include/public/vm_event.h | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c index 51c3493b1d..873788e32f 100644 --- a/xen/arch/x86/vm_event.c +++ b/xen/arch/x86/vm_event.c @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum x86_segment segment, reg->es_sel = seg.sel; break; + case x86_seg_gdtr: + reg->gdtr_base = seg.base; + break; Please also add limit, ar, sel, like the others do. In Xen, we model GDTR/IDTR just like all other segments in the segment cache for convenience/consistency, including faking up of some default attributes. However, there is no such thing as a selector or access rights for them, and the VMCS lacks the fields, while the VMCB marks the fields as reserved. It is almost certainly not worth wasting the space in the ring. Right, I got carried away there: I saw gdtr_limit and didn't check that if the rest is available. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: skip first page when RAM starts at 0x0
>>> On 02.05.19 at 11:02, wrote: > On 5/2/19 8:30 AM, Jan Beulich wrote: > On 02.05.19 at 00:44, wrote: >>> Hi Jan, I have a question on the PDX code. >>> >>> The PDX initialization is a bit different between x86 and ARM, but it >>> follows roughly the same pattern, look at >>> xen/arch/x86/srat.c:srat_parse_regions (I take that is where things >>> happen on x86) and xen/arch/arm/setup.c:init_pdx. >>> >>> Mask is initialized calling pdx_init_mask on a start address, then a >>> loop fills in the rest of the mask calling pdx_region_mask, based on the >>> memory regions present. >>> >>> I wrote a small unit test of the ARM PDX initialization and while I was >>> playing with addresses and values I noticed that actually if I simply >>> skip pdx_init_mask and just initialize the mask to 0 (mask = 0) in >>> init_pdx, the rest of the function always calculates the right mask. >>> >>> In fact, there are cases where initializing the mask to a value causes >>> the rest of the code to miss some potential compressions. While >>> initializing the mask to 0 leads to more optimizations. I can provide >>> specific examples if you are curious. >>> >>> Before I make any changes to that code, I would like to understand a bit >>> better why things are done that way today. Do you know why the mask is >>> initialized to pdx_init_mask(start-of-ram)? > > Well, it is not the start-of-ram on Arm. It is whatever is the start of > bank 0. This is because the firmware table (such as DT) may not require > ordering and we don't order banks in Xen. > > So it may be possible the PDX will not compress if the banks are not > ordered in the firmware tables. Even more so a reason not to use bank 0's start address. >> I'm confused, and hence I'm perhaps misunderstanding your >> question. To me it looks like you're re-asking a question already >> answered. On x86 we don't want to squash out any of the low >> 32 bits, because of the special address ranges that live below >> 4Gb. Hence we invoke pdx_init_mask(first-block-at-or-above-4Gb). >> Note it's not start-of-ram, as you've said. > > I think what Stefano is asking is why pdx_init_mask(...) is invoked with > the first block address rather than 4GB (or even 0 thought I don't think > this is right). > > By using the first block address, the PDX will not be able to compress > any bits between 0 and the MSB 1' in the first block address. In other > word, for a base address 0x2 (8GB), the initial mask will be > 0x1. > > Stefano and I were wondering whether it would instead be possible to > create the initial mask with pdx_init_mask(4GB) or pdx_init_mask(1GB) > (I.e the maximum contiguous range the buddy allocator can allocate). That's indeed an option - it's just that I've yet to see an x86 system where there's a hole starting at 4Gb. What's better in that case can probably be judged only once run into such a case. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH 2/7] x86/boot: Remove gratuitous call back into low-memory code
On 02/05/2019 09:14, Jan Beulich wrote: On 01.05.19 at 13:17, wrote: >> We appear to have implemented a memcpy() in the low-memory trampoline >> which we then call into from __start_xen(), for no adequately defined >> reason. > May I suggest that in cases like this you look at the commit > introducing the function? It supplies a reason: > > "Add a new raw e820 table for common purpose and copy the BIOS buffer > to it. Doing the copying in assembly avoids the need to export the > symbols for the BIOS E820 buffer and number of entries." I completely agree with David here. That description is completely insufficient for justifying why the logic was done that way in the end, and I would not have accepted the patch in that form at all. To be clear. I understand (and agree) why having our main copy of the e820 table in the trampoline is a bad thing, and moving the main copy out of the trampoline is fine. What isn't fine is why, in the adjusted world, we call back into what appears to be the trampoline, but isn't, to get access to data which the calling code can already access. In particular... > I have to admit that I see value in this, as it avoids introduction > of wrong accesses to these variables. ...this reasoning is bogus. We're either accessing the data itself, or the memcpy function, but there is no possible way to programatically avoid "wrong" access into the trampoline, because we're still accessing it. Therefore, I don't understand what possible benefit not exporting the data has in the first place, and why complicating it with a call to a function (which isn't actually executing where it appears to live), is considered a benefit. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-4.11-testing test] 135436: regressions - FAIL
osstest service owner writes ("[xen-4.11-testing test] 135436: regressions - FAIL"): > flight 135436 xen-4.11-testing real [real] > http://logs.test-lab.xenproject.org/osstest/logs/135436/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-amd64-amd64-xl-qcow216 guest-saverestore.2 fail REGR. vs. > 134998 This is the known qcow problem. I have force pushed this. Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] linux 4.19 xenstore memory allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL
On Wed, May 01, 2019 at 10:47:49AM +0100, Ian Jackson wrote: > Roger Pau Monne writes ("Re: [Xen-devel] linux 4.19 xenstore memory > allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL"): > > On Tue, Apr 30, 2019 at 03:33:00PM +0100, Ian Jackson wrote: > > > I will leave answering this to the blkfront/linux folks... > > > > I think those allocations used to be small enough that kcalloc was > > likely fine. Now with multiple rings, and multiple pages per ring > > those have grown to a point where kcalloc is not fine anymore. I will > > prepare a patch to switch to kvcalloc. > > Thanks. > > FYI this same issue was reported by osstest in > Subject: [linux-linus test] 135426: regressions - FAIL > ie on linux master. > > ISTM that this patch you propose will have to go to stable branches > too ? I agree. > > > I would have hoped that it would result in something other than a > > > hang. At worst, blkfront ought to go into a state where it *knows* > > > that it is utterly broken and reports this properly. > > > > I haven't yet checked all the possible error paths, but the ones I've > > looked at use xenbus_dev_fatal which switches the device state to > > closing and writes the error message into xenstore. > > What if you can't write to xenstore ? Can we at least have a copy in > the kernel log ? There might be other errors besides this memory > exhaustion, surely. There's a call to dev_err also, which should print the same error that's written to xenstore on the console. That however requires the memory allocation of page in order to format the string to be printed (see xenbus_va_dev_error). Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH 2/7] x86/boot: Remove gratuitous call back into low-memory code
> On 02/05/2019 09:14, Jan Beulich wrote: > On 01.05.19 at 13:17, wrote: >>> We appear to have implemented a memcpy() in the low-memory trampoline >>> which we then call into from __start_xen(), for no adequately defined >>> reason. >> May I suggest that in cases like this you look at the commit >> introducing the function? It supplies a reason: >> >> "Add a new raw e820 table for common purpose and copy the BIOS buffer >> to it. Doing the copying in assembly avoids the need to export the >> symbols for the BIOS E820 buffer and number of entries." > > I completely agree with David here. That description is completely > insufficient for justifying why the logic was done that way in the end, > and I would not have accepted the patch in that form at all. > > To be clear. I understand (and agree) why having our main copy of the > e820 table in the trampoline is a bad thing, and moving the main copy > out of the trampoline is fine. > > What isn't fine is why, in the adjusted world, we call back into what > appears to be the trampoline, but isn't, to get access to data which the > calling code can already access. In particular... > >> I have to admit that I see value in this, as it avoids introduction >> of wrong accesses to these variables. > > ...this reasoning is bogus. We're either accessing the data itself, or > the memcpy function, but there is no possible way to programatically > avoid "wrong" access into the trampoline, because we're still accessing > it. > > Therefore, I don't understand what possible benefit not exporting the > data has in the first place, and why complicating it with a call to a > function (which isn't actually executing where it appears to live), is > considered a benefit. Note that after bringing these two gratuitously differently handled symbols in line with everything else in the boot trampoline, a later patch in the series puts all this stuff into its own data section which is copied back up to the Xen image at the end of the real mode phase of boot, and all the pointer gymnastics for all of them go away completely -- dwmw2 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH 2/7] x86/boot: Remove gratuitous call back into low-memory code
>>> On 02.05.19 at 11:23, wrote: > On 02/05/2019 09:14, Jan Beulich wrote: > On 01.05.19 at 13:17, wrote: >>> We appear to have implemented a memcpy() in the low-memory trampoline >>> which we then call into from __start_xen(), for no adequately defined >>> reason. >> May I suggest that in cases like this you look at the commit >> introducing the function? It supplies a reason: >> >> "Add a new raw e820 table for common purpose and copy the BIOS buffer >> to it. Doing the copying in assembly avoids the need to export the >> symbols for the BIOS E820 buffer and number of entries." > > I completely agree with David here. That description is completely > insufficient for justifying why the logic was done that way in the end, > and I would not have accepted the patch in that form at all. > > To be clear. I understand (and agree) why having our main copy of the > e820 table in the trampoline is a bad thing, and moving the main copy > out of the trampoline is fine. > > What isn't fine is why, in the adjusted world, we call back into what > appears to be the trampoline, but isn't, to get access to data which the > calling code can already access. In particular... > >> I have to admit that I see value in this, as it avoids introduction >> of wrong accesses to these variables. > > ...this reasoning is bogus. We're either accessing the data itself, or > the memcpy function, but there is no possible way to programatically > avoid "wrong" access into the trampoline, because we're still accessing it. > > Therefore, I don't understand what possible benefit not exporting the > data has in the first place, and why complicating it with a call to a > function (which isn't actually executing where it appears to live), is > considered a benefit. Without having gone back to the old thread, I think part of the motivation to avoid such direct data accesses was that an initial version of the patch actually broke things by introducing such accesses. Apart from that, despite not being a heavy C++ user, I very much appreciate the language's fundamental desire to avoid direct accesses to data elements, providing functions (methods) to do so instead. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-3.18 test] 135441: regressions - FAIL
flight 135441 linux-3.18 real [real] http://logs.test-lab.xenproject.org/osstest/logs/135441/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-pvopsbroken in 135415 build-arm64-pvops 4 host-install(4) broken in 135415 REGR. vs. 128858 test-amd64-i386-libvirt-xsm 7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-xl-credit1 7 xen-boot fail REGR. vs. 128858 test-amd64-i386-xl-xsm7 xen-boot fail REGR. vs. 128858 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 128858 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-boot fail REGR. vs. 128858 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 128858 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 128858 test-amd64-amd64-xl-qemuu-ovmf-amd64 7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-xl-qemuu-debianhvm-amd64 7 xen-bootfail REGR. vs. 128858 test-amd64-amd64-xl-multivcpu 7 xen-bootfail REGR. vs. 128858 test-amd64-i386-examine 8 reboot fail REGR. vs. 128858 test-amd64-amd64-xl-shadow7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 128858 test-amd64-amd64-i386-pvgrub 7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 128858 test-amd64-amd64-xl-xsm 7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-libvirt 7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-amd64-pvgrub 7 xen-bootfail REGR. vs. 128858 test-amd64-amd64-xl 7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-qemuu-nested-intel 7 xen-boot fail REGR. vs. 128858 test-amd64-i386-libvirt 7 xen-boot fail REGR. vs. 128858 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 128858 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 128858 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-xl-pvshim7 xen-boot fail REGR. vs. 128858 test-amd64-amd64-libvirt-xsm 7 xen-boot fail REGR. vs. 128858 test-amd64-i386-freebsd10-i386 7 xen-boot fail REGR. vs. 128858 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128858 test-armhf-armhf-libvirt16 guest-start/debian.repeat fail REGR. vs. 128858 test-amd64-amd64-xl-qcow2 17 guest-localmigrate/x10 fail in 135415 REGR. vs. 128858 Tests which are failing intermittently (not blocking): test-armhf-armhf-xl 7 xen-boot fail in 135415 pass in 135441 test-armhf-armhf-libvirt 12 guest-start fail in 135415 pass in 135441 test-amd64-amd64-xl-qcow216 guest-saverestore.2fail pass in 135415 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-rtds 7 xen-boot fail REGR. vs. 128858 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-arm64-arm64-examine 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit1 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 10 debian-di-install fail in 135415 like 128841 test-armhf-armhf-xl-vhd 12 migrate-support-check fail in 135415 never pass test-armhf-armhf-xl-vhd 13 saverestore-support-check fail in 135415 never pass test-armhf-armhf-xl-vhd 10 debian-di-installfail like 128841 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 128858 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 128858 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 128858 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 128858 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 128858 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 128858 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass test-amd64-amd64-xl-pvhv2-amd 12 guest-start fail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-ch
[Xen-devel] [linux-linus test] 135443: regressions - FAIL
flight 135443 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/135443/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-multivcpu 18 guest-localmigrate/x10 fail REGR. vs. 133580 test-arm64-arm64-examine11 examine-serial/bootloader fail REGR. vs. 133580 test-amd64-amd64-xl-qcow216 guest-saverestore.2 fail REGR. vs. 133580 build-armhf-pvops 6 kernel-build fail REGR. vs. 133580 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-examine 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit1 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 133580 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 133580 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 133580 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 133580 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 133580 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 133580 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass version targeted for testing: linuxf2bc9c908dfe3f56fe4ca4d92e5c5be80963b973 baseline version: linux736706bee3298208343a76096370e4f6a5c55915 Last test of basis 133580 2019-03-04 19:53:09 Z 58 days Failing since133605 2019-03-05 20:03:14 Z 57 days 30 attempts Testing same since 135443 2019-04-30 22:53:38 Z1 days1 attempts 2357 people touched revisions under test, not listing them all jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build
Re: [Xen-devel] linux 4.19 xenstore memory allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL
Roger Pau Monne writes ("Re: [Xen-devel] linux 4.19 xenstore memory allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL"): > On Wed, May 01, 2019 at 10:47:49AM +0100, Ian Jackson wrote: > > What if you can't write to xenstore ? Can we at least have a copy in > > the kernel log ? There might be other errors besides this memory > > exhaustion, surely. > > There's a call to dev_err also, which should print the same error > that's written to xenstore on the console. That however requires the > memory allocation of page in order to format the string to be printed > (see xenbus_va_dev_error). Can we assume that memory exhaustion will always result in some message from the memory allocator ? If so then this new log message would be a useful addition for cases *other* than a complete lack of any available free page. Eg, foolishly trying a large kcalloc allocation, or some error not related to lack of memory at all. Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] x86/vmx: correctly gather gs_shadow value for current vCPU
On 02/05/2019 00:52, Tamas K Lengyel wrote: > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c > index 283eb7b34d..5154ecc2a8 100644 > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -779,12 +779,17 @@ static void vmx_load_cpu_state(struct vcpu *v, struct > hvm_hw_cpu *data) > > static void vmx_save_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) > { > +if ( v == current ) > +vmx_save_guest_msrs(v); > + > vmx_save_cpu_state(v, ctxt); > vmx_vmcs_save(v, ctxt); > } > > static int vmx_load_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) > { > +ASSERT(v != current); I'd leave a comment along the lines of /* Not currently safe to use in current context. */ Can be fixed up on commit. This version is much cleaner, architecturally speaking, so Reviewed-by: Andrew Cooper I'll drop the previous version out of x86-next. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] build-amd64-xen-freebsd (Re: [freebsd-master test] 135317: regressions - FAIL)
Roger Pau Monn� writes ("Re: [Xen-devel] [freebsd-master test] 135317: regressions - FAIL"): > On Wed, May 01, 2019 at 11:50:44AM +0100, Ian Jackson wrote: > > I guess this must be a host-specific FreeBSD kernel bug ? Roger, are > > you investigating ? > > Hm, I'm not sure I follow why this is host-specific. It has happened > on both fiano1 and godello1. AFAICT this is a regression in the > FreeBSD kernel. I thought it must be host-specific because I thought build-amd64-xen-freebsd would be done with the previous, anointed, version of freebsd. But in fact it is done with the freshly built freebsd version. So this reasoning was wrong. > Do you know if osstest has started a bisection of this? I'm not seeing > anything on the summary page. It has completed it. See attached. Ian. --- Begin Message --- branch xen-unstable xenbranch xen-unstable job build-amd64-xen-freebsd testid host-install(5) Tree: freebsd git://github.com/freebsd/freebsd.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: freebsd git://github.com/freebsd/freebsd.git Bug introduced: d61e108233bfdb3dfc507938f2a839b9884f053d Bug not present: 070cf1ede1850d8c1824181e258b6ec1ac293255 Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/135357/ commit d61e108233bfdb3dfc507938f2a839b9884f053d Author: kevans Date: Thu Apr 25 12:44:08 2019 + tun/tap: close race between destroy/ioctl handler It seems that there should be a better way to handle this, but this seems to be the more common approach and it should likely get replaced in all of the places it happens... Basically, thread 1 is in the process of destroying the tun/tap while thread 2 is executing one of the ioctls that requires the tun/tap mutex and the mutex is destroyed before the ioctl handler can acquire it. This is only one of the races described/found in PR 233955. PR: 233955 Reviewed by:ae MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D20027 For bisection revision-tuple graph see: http://logs.test-lab.xenproject.org/osstest/results/bisect/freebsd-master/build-amd64-xen-freebsd.host-install(5).html Revision IDs in each graph node refer, respectively, to the Trees above. Running cs-bisection-step '--graph-out=/home/logs/results/bisect/freebsd-master/build-amd64-xen-freebsd.host-install(5)' --summary-out=tmp/135357.bisection-summary --basis-template=135233 --blessings=real,real-bisect freebsd-master build-amd64-xen-freebsd 'host-install(5)' Searching for failure / basis pass: 135317 fail [host=godello1] / 135233 ok. Failure / basis pass flights: 135317 / 135233 (tree in basispass but not in latest: seabios) Tree: freebsd git://github.com/freebsd/freebsd.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git Latest 4284b348ee30b1fa32b59632063e08537834f86b de5b678ca4dcdfa83e322491d478d66df56c1986 cb70a26f78848fe45f593f7ebc9cfaac760a791b Basis pass b58321507702a1125aed58ddc320b560b1bffc71 de5b678ca4dcdfa83e322491d478d66df56c1986 cb70a26f78848fe45f593f7ebc9cfaac760a791b Generating revisions with ./adhoc-revtuple-generator git://github.com/freebsd/freebsd.git#b58321507702a1125aed58ddc320b560b1bffc71-4284b348ee30b1fa32b59632063e08537834f86b git://xenbits.xen.org/qemu-xen.git#de5b678ca4dcdfa83e322491d478d66df56c1986-de5b678ca4dcdfa83e322491d478d66df56c1986 git://xenbits.xen.org/xen.git#cb70a26f78848fe45f593f7ebc9cfaac760a791b-cb70a26f78848fe45f593f7ebc9cfaac760a791b Auto packing the repository in background for optimum performance. See "git help gc" for manual housekeeping. error: The last gc run reported the following. Please correct the root cause and remove gc.log. Automatic cleanup will not be performed until the file is removed. warning: There are too many unreachable loose objects; run 'git prune' to remove them. Auto packing the repository in background for optimum performance. See "git help gc" for manual housekeeping. error: The last gc run reported the following. Please correct the root cause and remove gc.log. Automatic cleanup will not be performed until the file is removed. warning: There are too many unreachable loose objects; run 'git prune' to remove them. Use of uninitialized value $parents in array dereference at ./adhoc-revtuple-generator line 465. Use of uninitialized value in concatenation (.) or string at ./adhoc-revtuple-generator line 465. Use of uninitialized value $parents in array dereference at ./adhoc-revtuple-generator line 465. Use of uninitialized value in concatenation (.) or string at ./adhoc-revtuple-generator line 465. Loaded 1012 nodes in revision graph Searching for test results: 135233 pass b58321507702a1125aed58ddc320b560b1bffc71 de5b678ca4dcdfa83e322491d47
Re: [Xen-devel] linux 4.19 xenstore memory allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL
On Thu, May 02, 2019 at 11:42:04AM +0100, Ian Jackson wrote: > Roger Pau Monne writes ("Re: [Xen-devel] linux 4.19 xenstore memory > allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL"): > > On Wed, May 01, 2019 at 10:47:49AM +0100, Ian Jackson wrote: > > > What if you can't write to xenstore ? Can we at least have a copy in > > > the kernel log ? There might be other errors besides this memory > > > exhaustion, surely. > > > > There's a call to dev_err also, which should print the same error > > that's written to xenstore on the console. That however requires the > > memory allocation of page in order to format the string to be printed > > (see xenbus_va_dev_error). > > Can we assume that memory exhaustion will always result in some > message from the memory allocator ? If so then this new log message I'm not sure I understand to what new log message you are referring to. The dev_err call is already present in xenbus_va_dev_error, so everything that's attempted to write to xenstore should also be printed on the console. > would be a useful addition for cases *other* than a complete lack of > any available free page. Eg, foolishly trying a large kcalloc > allocation, or some error not related to lack of memory at all. If there's no real memory shortage a failure to attach a frontend should result in a message being written to both xenstore and the console with the current Linux code AFAICT provided xenbus_dev_{fatal/error} or xenbus_switch_fatal is used. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] build-amd64-xen-freebsd (Re: [freebsd-master test] 135317: regressions - FAIL)
On Thu, May 02, 2019 at 11:50:59AM +0100, Ian Jackson wrote: > Roger Pau Monn� writes ("Re: [Xen-devel] [freebsd-master test] 135317: > regressions - FAIL"): > > On Wed, May 01, 2019 at 11:50:44AM +0100, Ian Jackson wrote: > > > I guess this must be a host-specific FreeBSD kernel bug ? Roger, are > > > you investigating ? > > > > Hm, I'm not sure I follow why this is host-specific. It has happened > > on both fiano1 and godello1. AFAICT this is a regression in the > > FreeBSD kernel. > > I thought it must be host-specific because I thought > build-amd64-xen-freebsd would be done with the previous, anointed, > version of freebsd. But in fact it is done with the freshly built > freebsd version. So this reasoning was wrong. > > > Do you know if osstest has started a bisection of this? I'm not seeing > > anything on the summary page. > > It has completed it. See attached. Oh thanks! I've already reported this upstream. Is there anyway I could get Cc'ed on bisections of freebsd-* branches? I already get CC'ed on normal flight reports. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 0/9] XSA-292 follow-up
Various CR3 and PCID related adjustments, first and foremost an almost full re-write of switch_cr3_cr4() (in patch 2). 1: x86: adjust cr3_pcid() return type 2: x86: limit the amount of TLB flushing in switch_cr3_cr4() 3: x86/mm: honor opt_pcid also for 32-bit PV domains 4: x86/HVM: move NOFLUSH handling out of hvm_set_cr3() 5: x86/HVM: refuse CR3 loads with reserved (upper) bits set 6: x86/HVM: relax shadow mode check in hvm_set_cr3() 7: x86/HVM: cosmetics to hvm_set_cr3() 8: x86/CPUID: drop INVPCID dependency on PCID 9: x86: PCID is unused when !PV Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH 2/7] x86/boot: Remove gratuitous call back into low-memory code
On Thu, 2019-05-02 at 10:23 +0100, Andrew Cooper wrote: > ...this reasoning is bogus. We're either accessing the data itself, or > the memcpy function, but there is no possible way to programatically > avoid "wrong" access into the trampoline, because we're still accessing it. Just to be clear, now I'm back at a real computer: yes there is a way to avoid "wrong" access into the trampoline from the 64-bit C code. It's the 6th patch in my series, for which this one is a necessary preliminary cleanup. http://git.infradead.org/users/dwmw2/xen.git/commitdiff/f54044fb3ad5d It rips out all those horrid bootsym() pointer gymnastics in the 64-bit code, except for the limited cases where we really do need to put things in place for the permanently AP/wakeup trampolines. It puts all these boot-discovered variables into their own section, then copies them back into the Xen image when the 16-bit boot code has finished running, to be accessed from there ever after. Which also paves the way for the no-real-mode boot not actually putting *anything* into low memory during boot at all. This allows us to clean up the EFI code to stop having to do that, too. smime.p7s Description: S/MIME cryptographic signature ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] x86/boot: Fix latent memory corruption with early_boot_opts_t
c/s ebb26b509f "xen/x86: make VGA support selectable" added an #ifdef CONFIG_VIDEO into the middle the backing space for early_boot_opts_t, but didn't adjust the structure definition in cmdline.c This only functions correctly because the affected fields are at the end of the structure, and cmdline.c doesn't write to them in this case. To retain the slimming effect of compiling out CONFIG_VIDEO, adjust cmdline.c with enough #ifdef-ary to make C's idea of the structure match the declaration in asm. This requires adding __maybe_unused annotations to two helper functions. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Wei Liu CC: Roger Pau Monné CC: David Woodhouse This needs backporting to Xen 4.11 Discovered while reviewing David's "Clean up x86_64 boot code" series. --- xen/arch/x86/boot/cmdline.c | 14 ++ xen/arch/x86/boot/defs.h| 1 + 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/boot/cmdline.c b/xen/arch/x86/boot/cmdline.c index 51b0659..fc11c6d 100644 --- a/xen/arch/x86/boot/cmdline.c +++ b/xen/arch/x86/boot/cmdline.c @@ -40,10 +40,12 @@ typedef struct __packed { u8 opt_edd; u8 opt_edid; u8 padding; +#ifdef CONFIG_VIDEO u16 boot_vid_mode; u16 vesa_width; u16 vesa_height; u16 vesa_depth; +#endif } early_boot_opts_t; /* @@ -127,7 +129,8 @@ static size_t strcspn(const char *s, const char *reject) return count; } -static unsigned int strtoui(const char *s, const char *stop, const char **next) +static unsigned int __maybe_unused strtoui( +const char *s, const char *stop, const char **next) { char base = 10, l; unsigned long long res = 0; @@ -176,7 +179,7 @@ static int strmaxcmp(const char *cs, const char *ct, const char *_delim_chars) return strncmp(cs, ct, max(strcspn(cs, _delim_chars), strlen(ct))); } -static int strsubcmp(const char *cs, const char *ct) +static int __maybe_unused strsubcmp(const char *cs, const char *ct) { return strncmp(cs, ct, strlen(ct)); } @@ -241,6 +244,7 @@ static u8 edid_parse(const char *cmdline) return !strmaxcmp(c, "no", delim_chars); } +#ifdef CONFIG_VIDEO static u16 rows2vmode(unsigned int rows) { switch ( rows ) @@ -328,6 +332,7 @@ static void vga_parse(const char *cmdline, early_boot_opts_t *ebo) ebo->boot_vid_mode = tmp; } } +#endif void __stdcall cmdline_parse_early(const char *cmdline, early_boot_opts_t *ebo) { @@ -338,6 +343,7 @@ void __stdcall cmdline_parse_early(const char *cmdline, early_boot_opts_t *ebo) ebo->opt_edd = edd_parse(cmdline); ebo->opt_edid = edid_parse(cmdline); -if ( IS_ENABLED(CONFIG_VIDEO) ) -vga_parse(cmdline, ebo); +#ifdef CONFIG_VIDEO +vga_parse(cmdline, ebo); +#endif } diff --git a/xen/arch/x86/boot/defs.h b/xen/arch/x86/boot/defs.h index 05921a6..f57ad53 100644 --- a/xen/arch/x86/boot/defs.h +++ b/xen/arch/x86/boot/defs.h @@ -23,6 +23,7 @@ #include "../../../include/xen/stdbool.h" #define __packed __attribute__((__packed__)) +#define __maybe_unused __attribute__((__unused__)) #define __stdcall __attribute__((__stdcall__)) #define NULL ((void *)0) -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 1/9] x86: adjust cr3_pcid() return type
There's no need for it to be 64 bits wide - only the low twelve bits of CR3 hold the PCID. Signed-off-by: Jan Beulich --- a/xen/arch/x86/flushtlb.c +++ b/xen/arch/x86/flushtlb.c @@ -103,7 +103,8 @@ static void do_tlb_flush(void) void switch_cr3_cr4(unsigned long cr3, unsigned long cr4) { -unsigned long flags, old_cr4, old_pcid; +unsigned long flags, old_cr4; +unsigned int old_pcid; u32 t; /* This non-reentrant function is sometimes called in interrupt context. */ --- a/xen/include/asm-x86/processor.h +++ b/xen/include/asm-x86/processor.h @@ -292,7 +292,7 @@ static inline unsigned long cr3_pa(unsig return cr3 & X86_CR3_ADDR_MASK; } -static inline unsigned long cr3_pcid(unsigned long cr3) +static inline unsigned int cr3_pcid(unsigned long cr3) { return cr3 & X86_CR3_PCID_MASK; } ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 2/9] x86: limit the amount of TLB flushing in switch_cr3_cr4()
We really need to flush the TLB just once, if we do so with or after the CR3 write. The only case where two flushes are unavoidable is when we mean to turn off CR4.PGE (perhaps just temporarily; see the code comment). Signed-off-by: Jan Beulich --- a/xen/arch/x86/flushtlb.c +++ b/xen/arch/x86/flushtlb.c @@ -104,82 +104,65 @@ static void do_tlb_flush(void) void switch_cr3_cr4(unsigned long cr3, unsigned long cr4) { unsigned long flags, old_cr4; -unsigned int old_pcid; u32 t; +/* Throughout this function we make this assumption: */ +ASSERT(!(cr4 & X86_CR4_PCIDE) || !(cr4 & X86_CR4_PGE)); + /* This non-reentrant function is sometimes called in interrupt context. */ local_irq_save(flags); t = pre_flush(); old_cr4 = read_cr4(); -if ( old_cr4 & X86_CR4_PGE ) +ASSERT(!(old_cr4 & X86_CR4_PCIDE) || !(old_cr4 & X86_CR4_PGE)); + +/* + * We need to write CR4 before CR3 if we're about to enable PCIDE, at the + * very least when the new PCID is non-zero. + * + * As we also need to do two CR4 writes in total when PGE is enabled and + * is to remain enabled, do the one temporarily turning off the bit right + * here as well. + * + * The only TLB flushing effect we depend on here is in case we move from + * PGE set to PCIDE set, where we want global page entries gone (and none + * to re-appear) after this write. + */ +if ( !(old_cr4 & X86_CR4_PCIDE) && + ((cr4 & X86_CR4_PCIDE) || (cr4 & old_cr4 & X86_CR4_PGE)) ) { -/* - * X86_CR4_PGE set means PCID is inactive. - * We have to purge the TLB via flipping cr4.pge. - */ old_cr4 = cr4 & ~X86_CR4_PGE; write_cr4(old_cr4); } -else if ( use_invpcid ) -{ -/* - * Flushing the TLB via INVPCID is necessary only in case PCIDs are - * in use, which is true only with INVPCID being available. - * Without PCID usage the following write_cr3() will purge the TLB - * (we are in the cr4.pge off path) of all entries. - * Using invpcid_flush_all_nonglobals() seems to be faster than - * invpcid_flush_all(), so use that. - */ -invpcid_flush_all_nonglobals(); - -/* - * CR4.PCIDE needs to be set before the CR3 write below. Otherwise - * - the CR3 write will fault when CR3.NOFLUSH is set (which is the - * case normally), - * - the subsequent CR4 write will fault if CR3.PCID != 0. - */ -if ( (old_cr4 & X86_CR4_PCIDE) < (cr4 & X86_CR4_PCIDE) ) -{ -write_cr4(cr4); -old_cr4 = cr4; -} -} /* - * If we don't change PCIDs, the CR3 write below needs to flush this very - * PCID, even when a full flush was performed above, as we are currently - * accumulating TLB entries again from the old address space. - * NB: Clearing the bit when we don't use PCID is benign (as it is clear - * already in that case), but allows the if() to be more simple. + * If the CR4 write is to turn off PCIDE, we don't need the CR3 write to + * flush anything, as that transition is a full flush itself. */ -old_pcid = cr3_pcid(read_cr3()); -if ( old_pcid == cr3_pcid(cr3) ) -cr3 &= ~X86_CR3_NOFLUSH; - +if ( (old_cr4 & X86_CR4_PCIDE) > (cr4 & X86_CR4_PCIDE) ) +cr3 |= X86_CR3_NOFLUSH; write_cr3(cr3); if ( old_cr4 != cr4 ) write_cr4(cr4); /* - * Make sure no TLB entries related to the old PCID created between - * flushing the TLB and writing the new %cr3 value remain in the TLB. - * - * The write to CR4 just above has performed a wider flush in certain - * cases, which therefore get excluded here. Since that write is - * conditional, note in particular that it won't be skipped if PCIDE - * transitions from 1 to 0. This is because the CR4 write further up will - * have been skipped in this case, as PCIDE and PGE won't both be set at - * the same time. - * - * Note also that PGE is always clear in old_cr4. + * PGE | PCIDE | flush at + * --+---+ + * 0->0 | 0->0 | CR3 write + * 0->0 | 0->1 | n/a (see 1st CR4 write) + * 0->x | 1->0 | CR4 write + * x->1 | x->1 | n/a + * 0->0 | 1->1 | INVPCID + * 0->1 | 0->0 | CR3 and CR4 writes + * 1->0 | 0->0 | CR4 write + * 1->0 | 0->1 | n/a (see 1st CR4 write) + * 1->1 | 0->0 | n/a (see 1st CR4 write) + * 1->x | 1->x | n/a */ -if ( old_pcid != cr3_pcid(cr3) && - !(cr4 & X86_CR4_PGE) && - (old_cr4 & X86_CR4_PCIDE) <= (cr4 & X86_CR4_PCIDE) ) -invpcid_flush_single_context(old_pcid); +if ( cr4 & X86_CR4_PCIDE ) +invpcid_flush_all_nonglobals(); post_flush(t); ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https:/
[Xen-devel] [PATCH 3/9] x86/mm: honor opt_pcid also for 32-bit PV domains
I can't see any technical or performance reason why we should treat 32-bit PV different from 64-bit PV in this regard. Signed-off-by: Jan Beulich --- a/xen/arch/x86/pv/domain.c +++ b/xen/arch/x86/pv/domain.c @@ -180,7 +180,24 @@ int switch_compat(struct domain *d) d->arch.x87_fip_width = 4; d->arch.pv.xpti = false; -d->arch.pv.pcid = false; + +if ( use_invpcid && cpu_has_pcid ) +switch ( ACCESS_ONCE(opt_pcid) ) +{ +case PCID_OFF: +case PCID_XPTI: +d->arch.pv.pcid = false; +break; + +case PCID_ALL: +case PCID_NOXPTI: +d->arch.pv.pcid = true; +break; + +default: +ASSERT_UNREACHABLE(); +break; +} return 0; @@ -312,7 +329,7 @@ int pv_domain_initialise(struct domain * d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom : opt_xpti_domu; -if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid ) +if ( use_invpcid && cpu_has_pcid ) switch ( ACCESS_ONCE(opt_pcid) ) { case PCID_OFF: ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 4/9] x86/HVM: move NOFLUSH handling out of hvm_set_cr3()
The bit is meaningful only for MOV-to-CR3 insns, not anywhere else, in particular not when loading nested guest state. Signed-off-by: Jan Beulich --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -2072,6 +2072,8 @@ static int hvmemul_write_cr( HVMTRACE_LONG_2D(CR_WRITE, reg, TRC_PAR_LONG(val)); switch ( reg ) { +bool noflush; + case 0: rc = hvm_set_cr0(val, true); break; @@ -2082,7 +2084,10 @@ static int hvmemul_write_cr( break; case 3: -rc = hvm_set_cr3(val, true); +noflush = hvm_pcid_enabled(current) && (val & X86_CR3_NOFLUSH); +if ( noflush ) +val &= ~X86_CR3_NOFLUSH; +rc = hvm_set_cr3(val, noflush, true); break; case 4: --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2053,12 +2053,17 @@ int hvm_mov_to_cr(unsigned int cr, unsig switch ( cr ) { +bool noflush; + case 0: rc = hvm_set_cr0(val, true); break; case 3: -rc = hvm_set_cr3(val, true); +noflush = hvm_pcid_enabled(curr) && (val & X86_CR3_NOFLUSH); +if ( noflush ) +val &= ~X86_CR3_NOFLUSH; +rc = hvm_set_cr3(val, noflush, true); break; case 4: @@ -2276,12 +2281,11 @@ int hvm_set_cr0(unsigned long value, boo return X86EMUL_OKAY; } -int hvm_set_cr3(unsigned long value, bool may_defer) +int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer) { struct vcpu *v = current; struct page_info *page; unsigned long old = v->arch.hvm.guest_cr[3]; -bool noflush = false; if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled & monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) ) @@ -2293,17 +2297,12 @@ int hvm_set_cr3(unsigned long value, boo /* The actual write will occur in hvm_do_resume(), if permitted. */ v->arch.vm_event->write_data.do_write.cr3 = 1; v->arch.vm_event->write_data.cr3 = value; +v->arch.vm_event->write_data.cr3_noflush = noflush; return X86EMUL_OKAY; } } -if ( hvm_pcid_enabled(v) ) /* Clear the noflush bit. */ -{ -noflush = value & X86_CR3_NOFLUSH; -value &= ~X86_CR3_NOFLUSH; -} - if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) && (value != v->arch.hvm.guest_cr[3]) ) { @@ -2998,7 +2997,7 @@ void hvm_task_switch( if ( task_switch_load_seg(x86_seg_ldtr, tss.ldt, new_cpl, 0) ) goto out; -rc = hvm_set_cr3(tss.cr3, true); +rc = hvm_set_cr3(tss.cr3, false, true); if ( rc == X86EMUL_EXCEPTION ) hvm_inject_hw_exception(TRAP_gp_fault, 0); if ( rc != X86EMUL_OKAY ) --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -324,7 +324,7 @@ static int nsvm_vcpu_hostrestore(struct v->arch.guest_table = pagetable_null(); /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */ } -rc = hvm_set_cr3(n1vmcb->_cr3, true); +rc = hvm_set_cr3(n1vmcb->_cr3, false, true); if ( rc == X86EMUL_EXCEPTION ) hvm_inject_hw_exception(TRAP_gp_fault, 0); if (rc != X86EMUL_OKAY) @@ -584,7 +584,7 @@ static int nsvm_vmcb_prepare4vmrun(struc nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb); /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */ -rc = hvm_set_cr3(ns_vmcb->_cr3, true); +rc = hvm_set_cr3(ns_vmcb->_cr3, false, true); if ( rc == X86EMUL_EXCEPTION ) hvm_inject_hw_exception(TRAP_gp_fault, 0); if (rc != X86EMUL_OKAY) @@ -598,7 +598,7 @@ static int nsvm_vmcb_prepare4vmrun(struc * we assume it intercepts page faults. */ /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */ -rc = hvm_set_cr3(ns_vmcb->_cr3, true); +rc = hvm_set_cr3(ns_vmcb->_cr3, false, true); if ( rc == X86EMUL_EXCEPTION ) hvm_inject_hw_exception(TRAP_gp_fault, 0); if (rc != X86EMUL_OKAY) --- a/xen/arch/x86/hvm/vm_event.c +++ b/xen/arch/x86/hvm/vm_event.c @@ -110,7 +110,7 @@ void hvm_vm_event_do_resume(struct vcpu if ( unlikely(w->do_write.cr3) ) { -if ( hvm_set_cr3(w->cr3, false) == X86EMUL_EXCEPTION ) +if ( hvm_set_cr3(w->cr3, w->cr3_noflush, false) == X86EMUL_EXCEPTION ) hvm_inject_hw_exception(TRAP_gp_fault, 0); w->do_write.cr3 = 0; --- a/xen/arch/x86/hvm/vmx/vvmx.c +++ b/xen/arch/x86/hvm/vmx/vvmx.c @@ -1028,7 +1028,7 @@ static void load_shadow_guest_state(stru if ( rc == X86EMUL_EXCEPTION ) hvm_inject_hw_exception(TRAP_gp_fault, 0); -rc = hvm_set_cr3(get_vvmcs(v, GUEST_CR3), true); +rc = hvm_set_cr3(get_vvmcs(v, GUEST_CR3), false, true); if ( rc == X86EMUL_EXCEPTION ) hvm_inject_hw_exception(TRAP_gp_fault, 0); @@ -1242,7 +1242,7 @@ static void load_vvmcs
[Xen-devel] [PATCH 5/9] x86/HVM: refuse CR3 loads with reserved (upper) bits set
While bits 11 and below are, it not used for other purposes, reserved but ignored, bits beyond physical address width are supposed to raise exceptions (at least in the non-nested case; I'm not convinced the current nested SVM/VMX behavior of raising #GP(0) here is correct, but that's not the subject of this change). Introduce currd as a local variable, and replace other v->domain instances at the same time. Signed-off-by: Jan Beulich --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1003,6 +1003,13 @@ static int hvm_load_cpu_ctxt(struct doma return -EINVAL; } +if ( ctxt.cr3 & ~((1UL << d->arch.cpuid->extd.maxphysaddr) - 1) ) +{ +printk(XENLOG_G_ERR "HVM%d restore: bad CR3 %#" PRIx64 "\n", + d->domain_id, ctxt.cr3); +return X86EMUL_EXCEPTION; +} + if ( (ctxt.flags & ~XEN_X86_FPU_INITIALISED) != 0 ) { gprintk(XENLOG_ERR, "bad flags value in CPU context: %#x\n", @@ -2284,10 +2291,19 @@ int hvm_set_cr0(unsigned long value, boo int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer) { struct vcpu *v = current; +struct domain *currd = v->domain; struct page_info *page; unsigned long old = v->arch.hvm.guest_cr[3]; -if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled & +if ( value & ~((1UL << currd->arch.cpuid->extd.maxphysaddr) - 1) ) +{ +HVM_DBG_LOG(DBG_LEVEL_1, +"Attempt to set reserved CR3 bit(s): %lx", +value); +return X86EMUL_EXCEPTION; +} + +if ( may_defer && unlikely(currd->arch.monitor.write_ctrlreg_enabled & monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) ) { ASSERT(v->arch.vm_event); @@ -2303,13 +2319,12 @@ int hvm_set_cr3(unsigned long value, boo } } -if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) && +if ( hvm_paging_enabled(v) && !paging_mode_hap(currd) && (value != v->arch.hvm.guest_cr[3]) ) { /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value); -page = get_page_from_gfn(v->domain, value >> PAGE_SHIFT, - NULL, P2M_ALLOC); +page = get_page_from_gfn(currd, value >> PAGE_SHIFT, NULL, P2M_ALLOC); if ( !page ) goto bad_cr3; @@ -2325,7 +2340,7 @@ int hvm_set_cr3(unsigned long value, boo bad_cr3: gdprintk(XENLOG_ERR, "Invalid CR3\n"); -domain_crash(v->domain); +domain_crash(currd); return X86EMUL_UNHANDLEABLE; } ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 6/9] x86/HVM: relax shadow mode check in hvm_set_cr3()
There's no need to re-obtain a page reference if only bits not affecting the address change. Signed-off-by: Jan Beulich --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2320,7 +2320,7 @@ int hvm_set_cr3(unsigned long value, boo } if ( hvm_paging_enabled(v) && !paging_mode_hap(currd) && - (value != v->arch.hvm.guest_cr[3]) ) + ((value ^ v->arch.hvm.guest_cr[3]) >> PAGE_SHIFT) ) { /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value); ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 7/9] x86/HVM: cosmetics to hvm_set_cr3()
Eliminate the not really useful local variable "old". Reduce the scope of "page". Rename the latched "current". Signed-off-by: Jan Beulich --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2290,10 +2290,8 @@ int hvm_set_cr0(unsigned long value, boo int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer) { -struct vcpu *v = current; -struct domain *currd = v->domain; -struct page_info *page; -unsigned long old = v->arch.hvm.guest_cr[3]; +struct vcpu *curr = current; +struct domain *currd = curr->domain; if ( value & ~((1UL << currd->arch.cpuid->extd.maxphysaddr) - 1) ) { @@ -2306,36 +2304,38 @@ int hvm_set_cr3(unsigned long value, boo if ( may_defer && unlikely(currd->arch.monitor.write_ctrlreg_enabled & monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) ) { -ASSERT(v->arch.vm_event); +ASSERT(curr->arch.vm_event); -if ( hvm_monitor_crX(CR3, value, old) ) +if ( hvm_monitor_crX(CR3, value, curr->arch.hvm.guest_cr[3]) ) { /* The actual write will occur in hvm_do_resume(), if permitted. */ -v->arch.vm_event->write_data.do_write.cr3 = 1; -v->arch.vm_event->write_data.cr3 = value; -v->arch.vm_event->write_data.cr3_noflush = noflush; +curr->arch.vm_event->write_data.do_write.cr3 = 1; +curr->arch.vm_event->write_data.cr3 = value; +curr->arch.vm_event->write_data.cr3_noflush = noflush; return X86EMUL_OKAY; } } -if ( hvm_paging_enabled(v) && !paging_mode_hap(currd) && - ((value ^ v->arch.hvm.guest_cr[3]) >> PAGE_SHIFT) ) +if ( hvm_paging_enabled(curr) && !paging_mode_hap(currd) && + ((value ^ curr->arch.hvm.guest_cr[3]) >> PAGE_SHIFT) ) { /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ +struct page_info *page; + HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value); page = get_page_from_gfn(currd, value >> PAGE_SHIFT, NULL, P2M_ALLOC); if ( !page ) goto bad_cr3; -put_page(pagetable_get_page(v->arch.guest_table)); -v->arch.guest_table = pagetable_from_page(page); +put_page(pagetable_get_page(curr->arch.guest_table)); +curr->arch.guest_table = pagetable_from_page(page); HVM_DBG_LOG(DBG_LEVEL_VMMU, "Update CR3 value = %lx", value); } -v->arch.hvm.guest_cr[3] = value; -paging_update_cr3(v, noflush); +curr->arch.hvm.guest_cr[3] = value; +paging_update_cr3(curr, noflush); return X86EMUL_OKAY; bad_cr3: ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 8/9] x86/CPUID: drop INVPCID dependency on PCID
PCID validly depends on LM, as it can be enabled in Long Mode only. INVPCID, otoh, can be used not only without PCID enabled, but also outside of Long Mode altogether. In both cases its functionality is simply restricted to PCID 0, which is sort of expected as no other PCID can be activated there. Signed-off-by: Jan Beulich --- a/xen/tools/gen-cpuid.py +++ b/xen/tools/gen-cpuid.py @@ -217,10 +217,6 @@ def crunch_numbers(state): # # SSE4_2: [POPCNT] -# The INVPCID instruction depends on PCID infrastructure being -# available. -PCID: [INVPCID], - # XSAVE is an extra set of instructions for state management, but # doesn't constitue new state itself. Some of the dependent features # are instructions built on top of base XSAVE, while others are new ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 9/9] x86: PCID is unused when !PV
This allows in particular some streamlining of the TLB flushing code paths. Signed-off-by: Jan Beulich --- a/xen/arch/x86/flushtlb.c +++ b/xen/arch/x86/flushtlb.c @@ -24,6 +24,11 @@ #define WRAP_MASK (0x03FFU) #endif +#ifndef CONFIG_PV +# undef X86_CR4_PCIDE +# define X86_CR4_PCIDE 0 +#endif + u32 tlbflush_clock = 1U; DEFINE_PER_CPU(u32, tlbflush_time); --- a/xen/include/asm-x86/processor.h +++ b/xen/include/asm-x86/processor.h @@ -294,7 +294,11 @@ static inline unsigned long cr3_pa(unsig static inline unsigned int cr3_pcid(unsigned long cr3) { +#ifdef CONFIG_PV return cr3 & X86_CR3_PCID_MASK; +#else +return 0; +#endif } static inline unsigned long read_cr4(void) @@ -306,8 +310,12 @@ static inline void write_cr4(unsigned lo { struct cpu_info *info = get_cpu_info(); +#ifdef CONFIG_PV /* No global pages in case of PCIDs enabled! */ ASSERT(!(val & X86_CR4_PGE) || !(val & X86_CR4_PCIDE)); +#else +ASSERT(!(val & X86_CR4_PCIDE)); +#endif /* * On hardware supporting FSGSBASE, the value in %cr4 is the kernel's --- a/xen/include/asm-x86/pv/domain.h +++ b/xen/include/asm-x86/pv/domain.h @@ -50,8 +50,13 @@ */ static inline unsigned long get_pcid_bits(const struct vcpu *v, bool is_xpti) { +#ifdef CONFIG_PV return X86_CR3_NOFLUSH | (is_xpti ? PCID_PV_XPTI : 0) | ((v->arch.flags & TF_kernel_mode) ? PCID_PV_PRIV : PCID_PV_USER); +#else +ASSERT_UNREACHABLE(); +return 0; +#endif } #ifdef CONFIG_PV ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] linux 4.19 xenstore memory allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL
Roger Pau Monne writes ("Re: [Xen-devel] linux 4.19 xenstore memory allocation failure Re: [linux-4.19 test] 135420: regressions - FAIL"): > On Thu, May 02, 2019 at 11:42:04AM +0100, Ian Jackson wrote: > > Can we assume that memory exhaustion will always result in some > > message from the memory allocator ? If so then this new log message > > I'm not sure I understand to what new log message you are referring > to. The dev_err call is already present in xenbus_va_dev_error, so > everything that's attempted to write to xenstore should also be > printed on the console. Oh, I misunderstood. I thought you were talking about a hypothetical new dev_err call. Does that mean that you think in this case it tried to write a message to the console and that too failed due to lack of memory ? In which case it probably did the best it could. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] VT-d: suppress individual flushes during hwdom setup
There's an invocation of iommu_flush_all() immediately afterwards. Signed-off-by: Jan Beulich --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -1310,8 +1310,11 @@ static void __hwdom_init intel_iommu_hwd setup_hwdom_pci_devices(d, setup_hwdom_device); setup_hwdom_rmrr(d); + /* Make sure workarounds are applied before enabling the IOMMU(s). */ +this_cpu(iommu_dont_flush_iotlb) = true; arch_iommu_hwdom_init(d); +this_cpu(iommu_dont_flush_iotlb) = false; if ( iommu_flush_all() ) printk(XENLOG_WARNING VTDPREFIX ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
The write-discard property of the type can't be represented in IOMMU page table entries. Make sure the respective checks / tracking can't race, by utilizing the domain lock. The other sides of the sharing/ paging/log-dirty exclusion checks should subsequently perhaps also be put under that lock then. Take the opportunity and also convert neighboring bool_t to bool in struct hvm_domain. Signed-off-by: Jan Beulich --- a/xen/arch/x86/hvm/dm.c +++ b/xen/arch/x86/hvm/dm.c @@ -255,16 +255,32 @@ static int set_mem_type(struct domain *d mem_type = array_index_nospec(data->mem_type, ARRAY_SIZE(memtype)); -if ( mem_type == HVMMEM_ioreq_server ) +switch ( mem_type ) { unsigned int flags; +case HVMMEM_ioreq_server: if ( !hap_enabled(d) ) return -EOPNOTSUPP; /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. */ if ( !p2m_get_ioreq_server(d, &flags) ) return -EINVAL; + +break; + +case HVMMEM_ram_ro: +/* p2m_ram_ro can't be represented in IOMMU mappings. */ +domain_lock(d); +if ( has_iommu_pt(d) ) +rc = -EXDEV; +d->arch.hvm.p2m_ram_ro_used = true; +domain_unlock(d); + +if ( rc ) +return rc; + +break; } while ( iter < data->nr ) --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -1450,17 +1450,36 @@ static int assign_device(struct domain * if ( !iommu_enabled || !hd->platform_ops ) return 0; -/* Prevent device assign if mem paging or mem sharing have been - * enabled for this domain */ -if ( unlikely(d->arch.hvm.mem_sharing_enabled || - vm_event_check_ring(d->vm_event_paging) || +domain_lock(d); + +/* + * Prevent device assignment if any of + * - mem paging + * - mem sharing + * - the p2m_ram_ro type + * - global log-dirty mode + * are in use by this domain. + */ +if ( unlikely(vm_event_check_ring(d->vm_event_paging) || +#ifdef CONFIG_HVM + (is_hvm_domain(d) && + (d->arch.hvm.mem_sharing_enabled || +d->arch.hvm.p2m_ram_ro_used)) || +#endif p2m_get_hostp2m(d)->global_logdirty) ) +{ +domain_unlock(d); return -EXDEV; +} if ( !pcidevs_trylock() ) +{ +domain_unlock(d); return -ERESTART; +} rc = iommu_construct(d); +domain_unlock(d); if ( rc ) { pcidevs_unlock(); --- a/xen/include/asm-x86/hvm/domain.h +++ b/xen/include/asm-x86/hvm/domain.h @@ -156,10 +156,11 @@ struct hvm_domain { struct viridian_domain *viridian; -bool_t hap_enabled; -bool_t mem_sharing_enabled; -bool_t qemu_mapcache_invalidate; -bool_t is_s3_suspended; +bool hap_enabled; +bool mem_sharing_enabled; +bool p2m_ram_ro_used; +bool qemu_mapcache_invalidate; +bool is_s3_suspended; /* * TSC value that VCPUs use to calculate their tsc_offset value. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-upstream-4.10-testing bisection] complete build-amd64
branch xen-4.10-testing xenbranch xen-4.10-testing job build-amd64 testid xen-build Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Bug introduced: 20029ca22baaeb9418c1fd9df88d12d32d585cb6 Bug not present: 947f3737abf65fda63f3ffd97fddfa6986986868 Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/135551/ (Revision log too long, omitted.) For bisection revision-tuple graph see: http://logs.test-lab.xenproject.org/osstest/results/bisect/qemu-upstream-4.10-testing/build-amd64.xen-build.html Revision IDs in each graph node refer, respectively, to the Trees above. Running cs-bisection-step --graph-out=/home/logs/results/bisect/qemu-upstream-4.10-testing/build-amd64.xen-build --summary-out=tmp/135551.bisection-summary --basis-template=124921 --blessings=real,real-bisect qemu-upstream-4.10-testing build-amd64 xen-build Searching for failure / basis pass: 135417 fail [host=rimava1] / 135156 [host=albana1] 135014 [host=albana1] 134939 [host=albana1] 134791 [host=italia1] 134678 [host=godello0] 124921 [host=debina1] 118021 [host=godello1] 117963 [host=godello1] 117761 [host=godello0] 117730 [host=godello0] 117345 [host=godello0] 117287 [host=fiano0] 116755 [host=huxelrebe0] template as basis? using template as basis. Failure / basis pass flights: 135417 / 124921 (tree with no url: minios) (tree with no url: seabios) Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git Latest 20029ca22baaeb9418c1fd9df88d12d32d585cb6 c8ea0457495342c417c3dc033bba25148b279f60 04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 b2bbd342576576eb8a165a6abf9559d772ee242b Basis pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 eeb15764adbfe44e9f11a68e2444f4ba12b3cf1d Generating revisions with ./adhoc-revtuple-generator git://xenbits.xen.org/osstest/ovmf.git#947f3737abf65fda63f3ffd97fddfa6986986868-20029ca22baaeb9418c1fd9df88d12d32d585cb6 git://xenbits.xen.org/qemu-xen-traditional.git#c8ea0457495342c417c3dc033bba25148b279f60-c8ea0457495342c417c3dc033bba25148b279f60 git://xenbits.xen.org/qemu-xen.git#6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2-04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 git://xenbits.xen.org/xen.git#eeb15764adbfe44e9f11a68e2444f4ba12b3cf1d-b2bbd342\ 576576eb8a165a6abf9559d772ee242b adhoc-revtuple-generator: tree discontiguous: ovmf Loaded 8898 nodes in revision graph Searching for test results: 124921 [host=debina1] 134678 [host=godello0] 134791 [host=italia1] 135014 [host=albana1] 134939 [host=albana1] 135156 [host=albana1] 135445 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 eeb15764adbfe44e9f11a68e2444f4ba12b3cf1d 135417 fail 20029ca22baaeb9418c1fd9df88d12d32d585cb6 c8ea0457495342c417c3dc033bba25148b279f60 04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 b2bbd342576576eb8a165a6abf9559d772ee242b 135488 fail 20029ca22baaeb9418c1fd9df88d12d32d585cb6 c8ea0457495342c417c3dc033bba25148b279f60 04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 b2bbd342576576eb8a165a6abf9559d772ee242b 135465 fail 20029ca22baaeb9418c1fd9df88d12d32d585cb6 c8ea0457495342c417c3dc033bba25148b279f60 04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 b2bbd342576576eb8a165a6abf9559d772ee242b 135500 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 d616c1b18d27761f572927bf1f79ba27273afe9a 135487 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 eeb15764adbfe44e9f11a68e2444f4ba12b3cf1d 135490 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 6ea4cef2bd717045ac0e84b52a5b1b7716feb0c2 61dc0159b69bd3eec109188386c8b13fbdfed7b2 135515 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 dc33057be1fc39b3fee2f67a7f2ac1379d150dab 7842419a6b85edb4a5b9bee8b1179de4c8b84b60 135522 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 04a43f76e2d73e8387bd3e3bd439ef8c6d69d361 b2bbd342576576eb8a165a6abf9559d772ee242b 135520 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 c84fdba657d43dc39910c8bfc9524a00a3e5c84c 7842419a6b85edb4a5b9bee8b1179de4c8b84b60 135511 pass 947f3737abf65fda63f3ffd97fddfa6986986868 c8ea0457495342c417c3dc033bba25148b279f60 8a0df40718c2a2fce0dd4b3801b4921d7a1333f3 f6f1e948873ed601d3d743a2586619b763acd084 135
Re: [Xen-devel] [PATCH] x86/boot: Fix latent memory corruption with early_boot_opts_t
>>> On 02.05.19 at 13:52, wrote: > c/s ebb26b509f "xen/x86: make VGA support selectable" added an #ifdef > CONFIG_VIDEO into the middle the backing space for early_boot_opts_t, > but didn't adjust the structure definition in cmdline.c > > This only functions correctly because the affected fields are at the end > of the structure, and cmdline.c doesn't write to them in this case. > > To retain the slimming effect of compiling out CONFIG_VIDEO, adjust > cmdline.c with enough #ifdef-ary to make C's idea of the structure match > the declaration in asm. This requires adding __maybe_unused annotations > to two helper functions. > > Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich with a remark and a question: > --- a/xen/arch/x86/boot/cmdline.c > +++ b/xen/arch/x86/boot/cmdline.c > @@ -40,10 +40,12 @@ typedef struct __packed { > u8 opt_edd; > u8 opt_edid; > u8 padding; > +#ifdef CONFIG_VIDEO > u16 boot_vid_mode; > u16 vesa_width; > u16 vesa_height; > u16 vesa_depth; > +#endif Since apparently the "Keep in sync" comment in trampoline.S wasn't sufficient, and since - with what said commit did - the comment now looks unrelated to these data items (for there being a blank line in between now) perhaps this should be accompanied by both a START and END marker? And perhaps the comment next to vesa_size should also get corrected to say "width x height x depth". My R-b stands if you decide to fold these in. > --- a/xen/arch/x86/boot/defs.h > +++ b/xen/arch/x86/boot/defs.h > @@ -23,6 +23,7 @@ > #include "../../../include/xen/stdbool.h" > > #define __packed __attribute__((__packed__)) > +#define __maybe_unused __attribute__((__unused__)) > #define __stdcall__attribute__((__stdcall__)) Purely out of curiosity (I don't really care about the ordering here as long as the set doesn't meaningfully grow): Based on what did you decide this best goes between the two existing ones? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 02 May 2019 13:29 > To: xen-devel > Cc: Andrew Cooper ; Paul Durrant > ; Roger Pau Monne > ; Wei Liu ; George Dunlap > > Subject: [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through > > The write-discard property of the type can't be represented in IOMMU > page table entries. Make sure the respective checks / tracking can't > race, by utilizing the domain lock. The other sides of the sharing/ > paging/log-dirty exclusion checks should subsequently perhaps also be > put under that lock then. > > Take the opportunity and also convert neighboring bool_t to bool in > struct hvm_domain. > > Signed-off-by: Jan Beulich > > --- a/xen/arch/x86/hvm/dm.c > +++ b/xen/arch/x86/hvm/dm.c > @@ -255,16 +255,32 @@ static int set_mem_type(struct domain *d > > mem_type = array_index_nospec(data->mem_type, ARRAY_SIZE(memtype)); > > -if ( mem_type == HVMMEM_ioreq_server ) > +switch ( mem_type ) > { > unsigned int flags; > > +case HVMMEM_ioreq_server: > if ( !hap_enabled(d) ) > return -EOPNOTSUPP; > > /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. */ > if ( !p2m_get_ioreq_server(d, &flags) ) > return -EINVAL; > + > +break; > + > +case HVMMEM_ram_ro: > +/* p2m_ram_ro can't be represented in IOMMU mappings. */ > +domain_lock(d); > +if ( has_iommu_pt(d) ) > +rc = -EXDEV; > +d->arch.hvm.p2m_ram_ro_used = true; Do you really want to set this to true even if the op will fail? Paul > +domain_unlock(d); > + > +if ( rc ) > +return rc; > + > +break; > } > > while ( iter < data->nr ) > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -1450,17 +1450,36 @@ static int assign_device(struct domain * > if ( !iommu_enabled || !hd->platform_ops ) > return 0; > > -/* Prevent device assign if mem paging or mem sharing have been > - * enabled for this domain */ > -if ( unlikely(d->arch.hvm.mem_sharing_enabled || > - vm_event_check_ring(d->vm_event_paging) || > +domain_lock(d); > + > +/* > + * Prevent device assignment if any of > + * - mem paging > + * - mem sharing > + * - the p2m_ram_ro type > + * - global log-dirty mode > + * are in use by this domain. > + */ > +if ( unlikely(vm_event_check_ring(d->vm_event_paging) || > +#ifdef CONFIG_HVM > + (is_hvm_domain(d) && > + (d->arch.hvm.mem_sharing_enabled || > +d->arch.hvm.p2m_ram_ro_used)) || > +#endif >p2m_get_hostp2m(d)->global_logdirty) ) > +{ > +domain_unlock(d); > return -EXDEV; > +} > > if ( !pcidevs_trylock() ) > +{ > +domain_unlock(d); > return -ERESTART; > +} > > rc = iommu_construct(d); > +domain_unlock(d); > if ( rc ) > { > pcidevs_unlock(); > --- a/xen/include/asm-x86/hvm/domain.h > +++ b/xen/include/asm-x86/hvm/domain.h > @@ -156,10 +156,11 @@ struct hvm_domain { > > struct viridian_domain *viridian; > > -bool_t hap_enabled; > -bool_t mem_sharing_enabled; > -bool_t qemu_mapcache_invalidate; > -bool_t is_s3_suspended; > +bool hap_enabled; > +bool mem_sharing_enabled; > +bool p2m_ram_ro_used; > +bool qemu_mapcache_invalidate; > +bool is_s3_suspended; > > /* > * TSC value that VCPUs use to calculate their tsc_offset value. > > > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
>>> On 02.05.19 at 14:47, wrote: >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: 02 May 2019 13:29 >> >> --- a/xen/arch/x86/hvm/dm.c >> +++ b/xen/arch/x86/hvm/dm.c >> @@ -255,16 +255,32 @@ static int set_mem_type(struct domain *d >> >> mem_type = array_index_nospec(data->mem_type, ARRAY_SIZE(memtype)); >> >> -if ( mem_type == HVMMEM_ioreq_server ) >> +switch ( mem_type ) >> { >> unsigned int flags; >> >> +case HVMMEM_ioreq_server: >> if ( !hap_enabled(d) ) >> return -EOPNOTSUPP; >> >> /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. >> */ >> if ( !p2m_get_ioreq_server(d, &flags) ) >> return -EINVAL; >> + >> +break; >> + >> +case HVMMEM_ram_ro: >> +/* p2m_ram_ro can't be represented in IOMMU mappings. */ >> +domain_lock(d); >> +if ( has_iommu_pt(d) ) >> +rc = -EXDEV; >> +d->arch.hvm.p2m_ram_ro_used = true; > > Do you really want to set this to true even if the op will fail? Oh, good point - there should be an "else" there. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
On 02/05/2019 13:29, Jan Beulich wrote: > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -1450,17 +1450,36 @@ static int assign_device(struct domain * > if ( !iommu_enabled || !hd->platform_ops ) > return 0; > > -/* Prevent device assign if mem paging or mem sharing have been > - * enabled for this domain */ > -if ( unlikely(d->arch.hvm.mem_sharing_enabled || > - vm_event_check_ring(d->vm_event_paging) || > +domain_lock(d); > + > +/* > + * Prevent device assignment if any of > + * - mem paging > + * - mem sharing > + * - the p2m_ram_ro type > + * - global log-dirty mode XenServer has working live migration with GPUs, which this change would regress. Behind the scenes, we combine Xen's logdirty bitmap with one provided by the GPU, to ensure that all dirty updates are tracked. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
On Thu, May 2, 2019 at 2:57 AM Jan Beulich wrote: > > >>> On 02.05.19 at 10:25, wrote: > > On 5/2/19 2:57 AM, Tamas K Lengyel wrote: > >> Receiving this register is useful for introspecting 32-bit Windows when the > >> event being trapped happened while in ring3. > >> > >> Signed-off-by: Tamas K Lengyel > >> Cc: Razvan Cojocaru > >> Cc: Tamas K Lengyel > >> Cc: Jan Beulich > >> Cc: Andrew Cooper > >> Cc: Wei Liu > >> Cc: Roger Pau Monne > >> --- > >> xen/arch/x86/vm_event.c | 5 + > >> xen/include/public/vm_event.h | 3 ++- > >> 2 files changed, 7 insertions(+), 1 deletion(-) > >> > >> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c > >> index 51c3493b1d..873788e32f 100644 > >> --- a/xen/arch/x86/vm_event.c > >> +++ b/xen/arch/x86/vm_event.c > >> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum > > x86_segment segment, > >> reg->es_sel = seg.sel; > >> break; > >> > >> +case x86_seg_gdtr: > >> +reg->gdtr_base = seg.base; > >> +break; > > > > Please also add limit, ar, sel, like the others do. > > There's no ar or sel for GDT (and IDT). Instead, because of ... > > > In addition to > > making this modification look less strange (since, in contrast to the > > function name, nothing is packed for gdtr_base), it will also save > > future work for applications wanting to use gdtr which also require > > backwards compatibility with previous vm_event versions. > > > > As you know, for each such modification we need to have a separate > > vm_event_vX header in our applications so that we're ready to create a > > ring buffer using requests and replies of the right size, and also to be > > able to properly interpret the ring buffer data, so the least frequent > > changes to the vm_event struct, the better. > > ... this I wonder whether the IDT details shouldn't get added at > the same time. The churn of the header is not that big of a deal - I do the same in LibVMI and it's a largely copy-paste job that takes perhaps ~5m to add (see https://github.com/tklengyel/libvmi/commit/7509ce56d0408dbec4e374b8780da69a7bd212e8). So that should not be a factor in deciding whether we add a needed extra piece or not. Since the vm_event ABI is changing for Xen 4.13 now the ABI version will only be bumped once for 4.13. So we should be able to add/remove/shuffle whatever we want while 4.13 merge window is open. That said I don't have a use for idt and gdtr_limit that warrants having to receive it via the vm_event structure - those pieces are always just a hypercall away if needed. So in the vm_event structure I tend to just put the registers needed often enough to warrant avoiding that extra hypercall. But if Razvan says he has a use for them, I can add them here. Tamas ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 4/9] x86/HVM: move NOFLUSH handling out of hvm_set_cr3()
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 02 May 2019 13:20 > To: xen-devel > Cc: Andrew Cooper ; Paul Durrant > ; Roger Pau Monne > ; Wei Liu ; George Dunlap > > Subject: [PATCH 4/9] x86/HVM: move NOFLUSH handling out of hvm_set_cr3() > > The bit is meaningful only for MOV-to-CR3 insns, not anywhere else, in > particular not when loading nested guest state. > > Signed-off-by: Jan Beulich > > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -2072,6 +2072,8 @@ static int hvmemul_write_cr( > HVMTRACE_LONG_2D(CR_WRITE, reg, TRC_PAR_LONG(val)); > switch ( reg ) > { > +bool noflush; > + Why introduce 'noflush' with this scope when it could be limited to 'case 3:', although... > case 0: > rc = hvm_set_cr0(val, true); > break; > @@ -2082,7 +2084,10 @@ static int hvmemul_write_cr( > break; > > case 3: > -rc = hvm_set_cr3(val, true); > +noflush = hvm_pcid_enabled(current) && (val & X86_CR3_NOFLUSH); > +if ( noflush ) > +val &= ~X86_CR3_NOFLUSH; ... can't you just code this as: if ( hvm_pcid_enabled(current) ) val &= ~X86_CR3_NOFLUSH; ? Paul > +rc = hvm_set_cr3(val, noflush, true); > break; > > case 4: > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -2053,12 +2053,17 @@ int hvm_mov_to_cr(unsigned int cr, unsig > > switch ( cr ) > { > +bool noflush; > + > case 0: > rc = hvm_set_cr0(val, true); > break; > > case 3: > -rc = hvm_set_cr3(val, true); > +noflush = hvm_pcid_enabled(curr) && (val & X86_CR3_NOFLUSH); > +if ( noflush ) > +val &= ~X86_CR3_NOFLUSH; > +rc = hvm_set_cr3(val, noflush, true); > break; > > case 4: > @@ -2276,12 +2281,11 @@ int hvm_set_cr0(unsigned long value, boo > return X86EMUL_OKAY; > } > > -int hvm_set_cr3(unsigned long value, bool may_defer) > +int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer) > { > struct vcpu *v = current; > struct page_info *page; > unsigned long old = v->arch.hvm.guest_cr[3]; > -bool noflush = false; > > if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled > & > monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) ) > @@ -2293,17 +2297,12 @@ int hvm_set_cr3(unsigned long value, boo > /* The actual write will occur in hvm_do_resume(), if permitted. > */ > v->arch.vm_event->write_data.do_write.cr3 = 1; > v->arch.vm_event->write_data.cr3 = value; > +v->arch.vm_event->write_data.cr3_noflush = noflush; > > return X86EMUL_OKAY; > } > } > > -if ( hvm_pcid_enabled(v) ) /* Clear the noflush bit. */ > -{ > -noflush = value & X86_CR3_NOFLUSH; > -value &= ~X86_CR3_NOFLUSH; > -} > - > if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) && > (value != v->arch.hvm.guest_cr[3]) ) > { > @@ -2998,7 +2997,7 @@ void hvm_task_switch( > if ( task_switch_load_seg(x86_seg_ldtr, tss.ldt, new_cpl, 0) ) > goto out; > > -rc = hvm_set_cr3(tss.cr3, true); > +rc = hvm_set_cr3(tss.cr3, false, true); > if ( rc == X86EMUL_EXCEPTION ) > hvm_inject_hw_exception(TRAP_gp_fault, 0); > if ( rc != X86EMUL_OKAY ) > --- a/xen/arch/x86/hvm/svm/nestedsvm.c > +++ b/xen/arch/x86/hvm/svm/nestedsvm.c > @@ -324,7 +324,7 @@ static int nsvm_vcpu_hostrestore(struct > v->arch.guest_table = pagetable_null(); > /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */ > } > -rc = hvm_set_cr3(n1vmcb->_cr3, true); > +rc = hvm_set_cr3(n1vmcb->_cr3, false, true); > if ( rc == X86EMUL_EXCEPTION ) > hvm_inject_hw_exception(TRAP_gp_fault, 0); > if (rc != X86EMUL_OKAY) > @@ -584,7 +584,7 @@ static int nsvm_vmcb_prepare4vmrun(struc > nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb); > > /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */ > -rc = hvm_set_cr3(ns_vmcb->_cr3, true); > +rc = hvm_set_cr3(ns_vmcb->_cr3, false, true); > if ( rc == X86EMUL_EXCEPTION ) > hvm_inject_hw_exception(TRAP_gp_fault, 0); > if (rc != X86EMUL_OKAY) > @@ -598,7 +598,7 @@ static int nsvm_vmcb_prepare4vmrun(struc > * we assume it intercepts page faults. > */ > /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */ > -rc = hvm_set_cr3(ns_vmcb->_cr3, true); > +rc = hvm_set_cr3(ns_vmcb->_cr3, false, true); > if ( rc == X86EMUL_EXCEPTION ) > hvm_inject_hw_exception(TRAP_gp_fault, 0); > if (rc != X86EMUL_OKAY) > --- a/xen/arch/x86/hvm/vm_event.c > +++ b/xen/arch/x86/hvm/vm_event.c > @@ -110,7 +110,7 @@ void hvm_vm_event_do_resume(struct vcpu > > if ( unlik
Re: [Xen-devel] [PATCH] VT-d: suppress individual flushes during hwdom setup
On Thu, May 02, 2019 at 06:28:06AM -0600, Jan Beulich wrote: > There's an invocation of iommu_flush_all() immediately afterwards. > > Signed-off-by: Jan Beulich > > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -1310,8 +1310,11 @@ static void __hwdom_init intel_iommu_hwd > > setup_hwdom_pci_devices(d, setup_hwdom_device); > setup_hwdom_rmrr(d); > + > /* Make sure workarounds are applied before enabling the IOMMU(s). */ > +this_cpu(iommu_dont_flush_iotlb) = true; > arch_iommu_hwdom_init(d); > +this_cpu(iommu_dont_flush_iotlb) = false; Don't you want to also avoid flushes in setup_hwdom_rmrr and setup_hwdom_pci_devices? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] VT-d: suppress individual flushes during hwdom setup
> -Original Message- > From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf Of > Jan Beulich > Sent: 02 May 2019 13:28 > To: xen-devel > Cc: Kevin Tian > Subject: [Xen-devel] [PATCH] VT-d: suppress individual flushes during hwdom > setup > > There's an invocation of iommu_flush_all() immediately afterwards. > > Signed-off-by: Jan Beulich > > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -1310,8 +1310,11 @@ static void __hwdom_init intel_iommu_hwd > > setup_hwdom_pci_devices(d, setup_hwdom_device); > setup_hwdom_rmrr(d); > + > /* Make sure workarounds are applied before enabling the IOMMU(s). */ > +this_cpu(iommu_dont_flush_iotlb) = true; > arch_iommu_hwdom_init(d); > +this_cpu(iommu_dont_flush_iotlb) = false; There should be no need for this. arch_iommu_hwdom_init() is using iommu_map(), which no longer does implicit flushing. Paul > > if ( iommu_flush_all() ) > printk(XENLOG_WARNING VTDPREFIX > > > > ___ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] x86/vmx: correctly gather gs_shadow value for current vCPU
On Thu, May 2, 2019 at 4:46 AM Andrew Cooper wrote: > > On 02/05/2019 00:52, Tamas K Lengyel wrote: > > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c > > index 283eb7b34d..5154ecc2a8 100644 > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > @@ -779,12 +779,17 @@ static void vmx_load_cpu_state(struct vcpu *v, struct > > hvm_hw_cpu *data) > > > > static void vmx_save_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) > > { > > +if ( v == current ) > > +vmx_save_guest_msrs(v); > > + > > vmx_save_cpu_state(v, ctxt); > > vmx_vmcs_save(v, ctxt); > > } > > > > static int vmx_load_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) > > { > > +ASSERT(v != current); > > I'd leave a comment along the lines of /* Not currently safe to use in > current context. */ > > Can be fixed up on commit. > > This version is much cleaner, architecturally speaking, so Reviewed-by: > Andrew Cooper > > I'll drop the previous version out of x86-next. Thanks, Tamas ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
>>> On 02.05.19 at 14:59, wrote: > On 02/05/2019 13:29, Jan Beulich wrote: >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -1450,17 +1450,36 @@ static int assign_device(struct domain * >> if ( !iommu_enabled || !hd->platform_ops ) >> return 0; >> >> -/* Prevent device assign if mem paging or mem sharing have been >> - * enabled for this domain */ >> -if ( unlikely(d->arch.hvm.mem_sharing_enabled || >> - vm_event_check_ring(d->vm_event_paging) || >> +domain_lock(d); >> + >> +/* >> + * Prevent device assignment if any of >> + * - mem paging >> + * - mem sharing >> + * - the p2m_ram_ro type >> + * - global log-dirty mode > > XenServer has working live migration with GPUs, which this change would > regress. > > Behind the scenes, we combine Xen's logdirty bitmap with one provided by > the GPU, to ensure that all dirty updates are tracked. But this says nothing for the patch here. As long as it doesn't work in the staging tree, it should be prevented. In XenServer you could then patch out that check and comment line together with whatever other changes you have to make for thins to work. Alternatively you could submit a patch (or more) to make it work in staging too, at which point I'd have to re-base the one here. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 4/9] x86/HVM: move NOFLUSH handling out of hvm_set_cr3()
>>> On 02.05.19 at 15:07, wrote: >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: 02 May 2019 13:20 >> >> --- a/xen/arch/x86/hvm/emulate.c >> +++ b/xen/arch/x86/hvm/emulate.c >> @@ -2072,6 +2072,8 @@ static int hvmemul_write_cr( >> HVMTRACE_LONG_2D(CR_WRITE, reg, TRC_PAR_LONG(val)); >> switch ( reg ) >> { >> +bool noflush; >> + > > Why introduce 'noflush' with this scope when it could be limited to 'case > 3:', although... Because this would entail introducing another set of braces, and I pretty much dislike these case-block braces: They either don't properly indent (as we do commonly), or they needlessly increase indentation of the enclosed block. Hence my general preference of switch-scope local variables. >> @@ -2082,7 +2084,10 @@ static int hvmemul_write_cr( >> break; >> >> case 3: >> -rc = hvm_set_cr3(val, true); >> +noflush = hvm_pcid_enabled(current) && (val & X86_CR3_NOFLUSH); >> +if ( noflush ) >> +val &= ~X86_CR3_NOFLUSH; > > ... can't you just code this as: > > if ( hvm_pcid_enabled(current) ) > val &= ~X86_CR3_NOFLUSH; > > ? Because of ... >> +rc = hvm_set_cr3(val, noflush, true); ... this further use of "noflush" (alongside the adjusted "val"). Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
On 5/2/19 4:09 PM, Tamas K Lengyel wrote: On Thu, May 2, 2019 at 2:57 AM Jan Beulich wrote: On 02.05.19 at 10:25, wrote: On 5/2/19 2:57 AM, Tamas K Lengyel wrote: Receiving this register is useful for introspecting 32-bit Windows when the event being trapped happened while in ring3. Signed-off-by: Tamas K Lengyel Cc: Razvan Cojocaru Cc: Tamas K Lengyel Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu Cc: Roger Pau Monne --- xen/arch/x86/vm_event.c | 5 + xen/include/public/vm_event.h | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c index 51c3493b1d..873788e32f 100644 --- a/xen/arch/x86/vm_event.c +++ b/xen/arch/x86/vm_event.c @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum x86_segment segment, reg->es_sel = seg.sel; break; +case x86_seg_gdtr: +reg->gdtr_base = seg.base; +break; Please also add limit, ar, sel, like the others do. There's no ar or sel for GDT (and IDT). Instead, because of ... In addition to making this modification look less strange (since, in contrast to the function name, nothing is packed for gdtr_base), it will also save future work for applications wanting to use gdtr which also require backwards compatibility with previous vm_event versions. As you know, for each such modification we need to have a separate vm_event_vX header in our applications so that we're ready to create a ring buffer using requests and replies of the right size, and also to be able to properly interpret the ring buffer data, so the least frequent changes to the vm_event struct, the better. ... this I wonder whether the IDT details shouldn't get added at the same time. The churn of the header is not that big of a deal - I do the same in LibVMI and it's a largely copy-paste job that takes perhaps ~5m to add (see https://github.com/tklengyel/libvmi/commit/7509ce56d0408dbec4e374b8780da69a7bd212e8). So that should not be a factor in deciding whether we add a needed extra piece or not. Since the vm_event ABI is changing for Xen 4.13 now the ABI version will only be bumped once for 4.13. So we should be able to add/remove/shuffle whatever we want while 4.13 merge window is open. That said I don't have a use for idt and gdtr_limit that warrants having to receive it via the vm_event structure - those pieces are always just a hypercall away if needed. So in the vm_event structure I tend to just put the registers needed often enough to warrant avoiding that extra hypercall. But if Razvan says he has a use for them, I can add them here. We do use them both - idtr and gdtr, both base and limit, but we are indeed getting them via hypercall now. Considering that, since we did add gdtr_base I think it would be great if we could also add gdtr_limit. Adding idtr as well would _theoretically_ be a nice speed optimization (removing the need for a hypercall), but it might also be a speed pessimization generally applicable to increasing the vm_event size (since a VCPU with no more space left in the vm_event ring buffer will be blocked more and go through more locking logic). My point is, at the moment it's fine to skip idtr if it's not required by Jan, but if we do add either then let's please add both _base and _limit. In a hopefully near future we'll be able to use as much memory as necessary for storing vm_event data and just cache everything in the ring buffer, avoing all the "get context" hypercalls. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 4/9] x86/HVM: move NOFLUSH handling out of hvm_set_cr3()
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 02 May 2019 14:23 > To: Paul Durrant > Cc: Andrew Cooper ; George Dunlap > ; Roger Pau > Monne ; Wei Liu ; xen-devel de...@lists.xenproject.org> > Subject: RE: [PATCH 4/9] x86/HVM: move NOFLUSH handling out of hvm_set_cr3() > > >>> On 02.05.19 at 15:07, wrote: > >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> Sent: 02 May 2019 13:20 > >> > >> --- a/xen/arch/x86/hvm/emulate.c > >> +++ b/xen/arch/x86/hvm/emulate.c > >> @@ -2072,6 +2072,8 @@ static int hvmemul_write_cr( > >> HVMTRACE_LONG_2D(CR_WRITE, reg, TRC_PAR_LONG(val)); > >> switch ( reg ) > >> { > >> +bool noflush; > >> + > > > > Why introduce 'noflush' with this scope when it could be limited to 'case > > 3:', although... > > Because this would entail introducing another set of braces, and > I pretty much dislike these case-block braces: They either don't > properly indent (as we do commonly), or they needlessly increase > indentation of the enclosed block. Hence my general preference > of switch-scope local variables. > > >> @@ -2082,7 +2084,10 @@ static int hvmemul_write_cr( > >> break; > >> > >> case 3: > >> -rc = hvm_set_cr3(val, true); > >> +noflush = hvm_pcid_enabled(current) && (val & X86_CR3_NOFLUSH); > >> +if ( noflush ) > >> +val &= ~X86_CR3_NOFLUSH; > > > > ... can't you just code this as: > > > > if ( hvm_pcid_enabled(current) ) > > val &= ~X86_CR3_NOFLUSH; > > > > ? > > Because of ... > > >> +rc = hvm_set_cr3(val, noflush, true); > > ... this further use of "noflush" (alongside the adjusted "val"). > Ah, missed that... I'd still go for the tighter scope though, but then I don't mind the extra braces. Paul > Jan > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] x86/boot: Annotate the Real Mode entry points
... because its already hard enough to follow. Cross reference the locations in C which set the entrypoints up, and state the alignment requirements and entry conditions. Drop a redundant .align 16, and panic() in do_boot_cpu() if the AP trampoline isn't set up properly rather than blindly continuing an letting the APs execute junk, or shifting part of the address into unrelated fields in ICR. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Wei Liu CC: Roger Pau Monné CC: David Woodhouse --- xen/arch/x86/boot/trampoline.S | 6 ++ xen/arch/x86/boot/wakeup.S | 10 +- xen/arch/x86/smpboot.c | 5 - 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/boot/trampoline.S b/xen/arch/x86/boot/trampoline.S index 125bdb5..cac0f3e1 100644 --- a/xen/arch/x86/boot/trampoline.S +++ b/xen/arch/x86/boot/trampoline.S @@ -38,6 +38,12 @@ .code16 +/* + * do_boot_cpu() programs the Startup-IPI to point here. Due to the SIPI + * format, the relocated entrypoint must be 4k aligned. + * + * It is entered in Real Mode, with %cs = wakeup_start >> 4 and %ip = 0. + */ GLOBAL(trampoline_realmode_entry) mov %cs,%ax mov %ax,%ds diff --git a/xen/arch/x86/boot/wakeup.S b/xen/arch/x86/boot/wakeup.S index 89df261..e3cb9e0 100644 --- a/xen/arch/x86/boot/wakeup.S +++ b/xen/arch/x86/boot/wakeup.S @@ -2,7 +2,15 @@ #define wakesym(sym) (sym - wakeup_start) -.align 16 +/* + * acpi_sleep_prepare() programs the S3 wakeup vector to point here. + * + * The ACPI spec says that we shall be entered in Real Mode with: + * %cs = wakeup_start >> 4 + * %ip = wakeup_start & 0xf + * + * As wakeup_start is 16-byte aligned, %ip is 0 in practice. + */ ENTRY(wakeup_start) cli cld diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index b7a0a4a..4f65c8d 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -548,9 +548,12 @@ static int do_boot_cpu(int apicid, int cpu) booting_cpu = cpu; -/* start_eip had better be page-aligned! */ start_eip = setup_trampoline(); +/* start_eip needs be page aligned, and below the 1M boundary. */ +if ( start_eip & ~0xff000 ) +panic("AP trampoline %#lx not suitably positioned\n", start_eip); + /* So we see what's up */ if ( opt_cpu_info ) printk("Booting processor %d/%d eip %lx\n", -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/boot: Fix latent memory corruption with early_boot_opts_t
On 02/05/2019 13:44, Jan Beulich wrote: On 02.05.19 at 13:52, wrote: >> c/s ebb26b509f "xen/x86: make VGA support selectable" added an #ifdef >> CONFIG_VIDEO into the middle the backing space for early_boot_opts_t, >> but didn't adjust the structure definition in cmdline.c >> >> This only functions correctly because the affected fields are at the end >> of the structure, and cmdline.c doesn't write to them in this case. >> >> To retain the slimming effect of compiling out CONFIG_VIDEO, adjust >> cmdline.c with enough #ifdef-ary to make C's idea of the structure match >> the declaration in asm. This requires adding __maybe_unused annotations >> to two helper functions. >> >> Signed-off-by: Andrew Cooper > Reviewed-by: Jan Beulich > with a remark and a question: > >> --- a/xen/arch/x86/boot/cmdline.c >> +++ b/xen/arch/x86/boot/cmdline.c >> @@ -40,10 +40,12 @@ typedef struct __packed { >> u8 opt_edd; >> u8 opt_edid; >> u8 padding; >> +#ifdef CONFIG_VIDEO >> u16 boot_vid_mode; >> u16 vesa_width; >> u16 vesa_height; >> u16 vesa_depth; >> +#endif > Since apparently the "Keep in sync" comment in trampoline.S > wasn't sufficient, and since - with what said commit did - the > comment now looks unrelated to these data items (for there > being a blank line in between now) perhaps this should be > accompanied by both a START and END marker? I've got a followup patch which cleans up the ASM, but I don't want to interfere with the work that David is currently preparing. > And perhaps the comment next to vesa_size should also > get corrected to say "width x height x depth". I had already addressed this, and the fact that boot_vid_mode has never needed to be global (ever since its introduction). > >> --- a/xen/arch/x86/boot/defs.h >> +++ b/xen/arch/x86/boot/defs.h >> @@ -23,6 +23,7 @@ >> #include "../../../include/xen/stdbool.h" >> >> #define __packed__attribute__((__packed__)) >> +#define __maybe_unused __attribute__((__unused__)) >> #define __stdcall __attribute__((__stdcall__)) > Purely out of curiosity (I don't really care about the ordering > here as long as the set doesn't meaningfully grow): Based on > what did you decide this best goes between the two existing > ones? I forgot to sort the lines. Fixed. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] VT-d: suppress individual flushes during hwdom setup
>>> On 02.05.19 at 15:08, wrote: > On Thu, May 02, 2019 at 06:28:06AM -0600, Jan Beulich wrote: >> There's an invocation of iommu_flush_all() immediately afterwards. >> >> Signed-off-by: Jan Beulich >> >> --- a/xen/drivers/passthrough/vtd/iommu.c >> +++ b/xen/drivers/passthrough/vtd/iommu.c >> @@ -1310,8 +1310,11 @@ static void __hwdom_init intel_iommu_hwd >> >> setup_hwdom_pci_devices(d, setup_hwdom_device); >> setup_hwdom_rmrr(d); >> + >> /* Make sure workarounds are applied before enabling the IOMMU(s). */ >> +this_cpu(iommu_dont_flush_iotlb) = true; >> arch_iommu_hwdom_init(d); >> +this_cpu(iommu_dont_flush_iotlb) = false; > > Don't you want to also avoid flushes in setup_hwdom_rmrr and > setup_hwdom_pci_devices? We probably could, but the gain would be much lower because there are far fewer pages involved there. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
>>> On 02.05.19 at 15:23, wrote: > My point is, at the moment it's fine to skip idtr if it's not required > by Jan, but if we do add either then let's please add both _base and _limit. No, it's not a requirement I mean to put up (and I'm not in the position to either, as I'm not the maintainer of the interface). It just seems inconsistent to me to have one but not the other. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
>>> On 02.05.19 at 15:09, wrote: > That said I don't have a use for idt and gdtr_limit that warrants > having to receive it via the vm_event structure So what use if the GDT base without the limit? Are you silently assuming all presently loaded selectors are (still) within limits? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] VT-d: suppress individual flushes during hwdom setup
>>> On 02.05.19 at 15:12, wrote: >> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf Of > Jan Beulich >> Sent: 02 May 2019 13:28 >> >> --- a/xen/drivers/passthrough/vtd/iommu.c >> +++ b/xen/drivers/passthrough/vtd/iommu.c >> @@ -1310,8 +1310,11 @@ static void __hwdom_init intel_iommu_hwd >> >> setup_hwdom_pci_devices(d, setup_hwdom_device); >> setup_hwdom_rmrr(d); >> + >> /* Make sure workarounds are applied before enabling the IOMMU(s). */ >> +this_cpu(iommu_dont_flush_iotlb) = true; >> arch_iommu_hwdom_init(d); >> +this_cpu(iommu_dont_flush_iotlb) = false; > > There should be no need for this. arch_iommu_hwdom_init() is using > iommu_map(), which no longer does implicit flushing. Oh, good point. I should have dropped this patch (dating back to October last year) when your respective change had landed. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
On 02/05/2019 14:16, Jan Beulich wrote: On 02.05.19 at 14:59, wrote: >> On 02/05/2019 13:29, Jan Beulich wrote: >>> --- a/xen/drivers/passthrough/pci.c >>> +++ b/xen/drivers/passthrough/pci.c >>> @@ -1450,17 +1450,36 @@ static int assign_device(struct domain * >>> if ( !iommu_enabled || !hd->platform_ops ) >>> return 0; >>> >>> -/* Prevent device assign if mem paging or mem sharing have been >>> - * enabled for this domain */ >>> -if ( unlikely(d->arch.hvm.mem_sharing_enabled || >>> - vm_event_check_ring(d->vm_event_paging) || >>> +domain_lock(d); >>> + >>> +/* >>> + * Prevent device assignment if any of >>> + * - mem paging >>> + * - mem sharing >>> + * - the p2m_ram_ro type >>> + * - global log-dirty mode >> XenServer has working live migration with GPUs, which this change would >> regress. >> >> Behind the scenes, we combine Xen's logdirty bitmap with one provided by >> the GPU, to ensure that all dirty updates are tracked. > But this says nothing for the patch here. Yes it does. There is nothing inherent about global log-dirty mode which is incompatible with passthrough when the IOMMU pagetables are not shared with EPT. > As long as it doesn't work in the staging tree, it should be prevented. ... but it does work. > In XenServer > you could then patch out that check and comment line together > with whatever other changes you have to make for thins to > work. Everything is upstream in the various projects, other than the vendor's closed source library and kernel driver. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/pgtable: Drop pgtable_t variable from pte_fn_t functions
On Thu, May 02, 2019 at 06:48:46PM +0530, Anshuman Khandual wrote: > Drop the pgtable_t variable from all implementation for pte_fn_t as none of > them use it. apply_to_pte_range() should stop computing it as well. Should > help us save some cycles. You didn't add Martin Schwidefsky for some reason. He introduced it originally for s390 for sub-page page tables back in 2008 (commit 2f569afd9c). I think he should confirm that he no longer needs it. > --- > - Boot tested on arm64 and x86 platforms. > - Build tested on multiple platforms with their defconfig > > arch/arm/kernel/efi.c | 3 +-- > arch/arm/mm/dma-mapping.c | 3 +-- > arch/arm/mm/pageattr.c | 3 +-- > arch/arm64/kernel/efi.c| 3 +-- > arch/arm64/mm/pageattr.c | 3 +-- > arch/x86/xen/mmu_pv.c | 3 +-- > drivers/gpu/drm/i915/i915_mm.c | 3 +-- > drivers/xen/gntdev.c | 6 ++ > drivers/xen/privcmd.c | 6 ++ > drivers/xen/xlate_mmu.c| 3 +-- > include/linux/mm.h | 3 +-- > mm/memory.c| 5 + > mm/vmalloc.c | 2 +- > 13 files changed, 15 insertions(+), 31 deletions(-) > > diff --git a/arch/arm/kernel/efi.c b/arch/arm/kernel/efi.c > index 9f43ba012d10..b1f142a01f2f 100644 > --- a/arch/arm/kernel/efi.c > +++ b/arch/arm/kernel/efi.c > @@ -11,8 +11,7 @@ > #include > #include > > -static int __init set_permissions(pte_t *ptep, pgtable_t token, > - unsigned long addr, void *data) > +static int __init set_permissions(pte_t *ptep, unsigned long addr, void > *data) > { > efi_memory_desc_t *md = data; > pte_t pte = *ptep; > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 43f46aa7ef33..739286511a18 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -496,8 +496,7 @@ void __init dma_contiguous_remap(void) > } > } > > -static int __dma_update_pte(pte_t *pte, pgtable_t token, unsigned long addr, > - void *data) > +static int __dma_update_pte(pte_t *pte, unsigned long addr, void *data) > { > struct page *page = virt_to_page(addr); > pgprot_t prot = *(pgprot_t *)data; > diff --git a/arch/arm/mm/pageattr.c b/arch/arm/mm/pageattr.c > index 1403cb4a0c3d..c8b500940e1f 100644 > --- a/arch/arm/mm/pageattr.c > +++ b/arch/arm/mm/pageattr.c > @@ -22,8 +22,7 @@ struct page_change_data { > pgprot_t clear_mask; > }; > > -static int change_page_range(pte_t *ptep, pgtable_t token, unsigned long > addr, > - void *data) > +static int change_page_range(pte_t *ptep, unsigned long addr, void *data) > { > struct page_change_data *cdata = data; > pte_t pte = *ptep; > diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c > index 4f9acb5fbe97..230cff073a08 100644 > --- a/arch/arm64/kernel/efi.c > +++ b/arch/arm64/kernel/efi.c > @@ -86,8 +86,7 @@ int __init efi_create_mapping(struct mm_struct *mm, > efi_memory_desc_t *md) > return 0; > } > > -static int __init set_permissions(pte_t *ptep, pgtable_t token, > - unsigned long addr, void *data) > +static int __init set_permissions(pte_t *ptep, unsigned long addr, void > *data) > { > efi_memory_desc_t *md = data; > pte_t pte = READ_ONCE(*ptep); > diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c > index 6cd645edcf35..0be077628b21 100644 > --- a/arch/arm64/mm/pageattr.c > +++ b/arch/arm64/mm/pageattr.c > @@ -27,8 +27,7 @@ struct page_change_data { > > bool rodata_full __ro_after_init = > IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED); > > -static int change_page_range(pte_t *ptep, pgtable_t token, unsigned long > addr, > - void *data) > +static int change_page_range(pte_t *ptep, unsigned long addr, void *data) > { > struct page_change_data *cdata = data; > pte_t pte = READ_ONCE(*ptep); > diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c > index a21e1734fc1f..308a6195fd26 100644 > --- a/arch/x86/xen/mmu_pv.c > +++ b/arch/x86/xen/mmu_pv.c > @@ -2702,8 +2702,7 @@ struct remap_data { > struct mmu_update *mmu_update; > }; > > -static int remap_area_pfn_pte_fn(pte_t *ptep, pgtable_t token, > - unsigned long addr, void *data) > +static int remap_area_pfn_pte_fn(pte_t *ptep, unsigned long addr, void *data) > { > struct remap_data *rmd = data; > pte_t pte = pte_mkspecial(mfn_pte(*rmd->pfn, rmd->prot)); > diff --git a/drivers/gpu/drm/i915/i915_mm.c b/drivers/gpu/drm/i915/i915_mm.c > index e4935dd1fd37..c23bb29e6d3e 100644 > --- a/drivers/gpu/drm/i915/i915_mm.c > +++ b/drivers/gpu/drm/i915/i915_mm.c > @@ -35,8 +35,7 @@ struct remap_pfn { > pgprot_t prot; > }; > > -static int remap_pfn(pte_t *pte, pgtable_t token, > - unsigned long addr, void *data) > +static int remap_pfn(pte_t *pte, unsigned long addr, void *data) > { > str
Re: [Xen-devel] [PATCH] x86/boot: Annotate the Real Mode entry points
>>> On 02.05.19 at 15:27, wrote: > --- a/xen/arch/x86/boot/trampoline.S > +++ b/xen/arch/x86/boot/trampoline.S > @@ -38,6 +38,12 @@ > > .code16 > > +/* > + * do_boot_cpu() programs the Startup-IPI to point here. Due to the SIPI > + * format, the relocated entrypoint must be 4k aligned. > + * > + * It is entered in Real Mode, with %cs = wakeup_start >> 4 and %ip = 0. > + */ > GLOBAL(trampoline_realmode_entry) The reference to wakeup_start looks to be a copy-and-paste (or alike) mistake here. > --- a/xen/arch/x86/smpboot.c > +++ b/xen/arch/x86/smpboot.c > @@ -548,9 +548,12 @@ static int do_boot_cpu(int apicid, int cpu) > > booting_cpu = cpu; > > -/* start_eip had better be page-aligned! */ > start_eip = setup_trampoline(); > > +/* start_eip needs be page aligned, and below the 1M boundary. */ > +if ( start_eip & ~0xff000 ) > +panic("AP trampoline %#lx not suitably positioned\n", start_eip); Seeing what setup_trampoline() really does, I'm not convinced a panic() is of much value. The page-alignment should be possible to check at build time, and the below-1M requirement should be guaranteed by us allocating low memory space in the first place. Nevertheless I won't insist, so with the earlier comment corrected Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vm_event: add gdtr_base to the vm_event structure
On Thu, May 2, 2019 at 7:30 AM Jan Beulich wrote: > > >>> On 02.05.19 at 15:09, wrote: > > That said I don't have a use for idt and gdtr_limit that warrants > > having to receive it via the vm_event structure > > So what use if the GDT base without the limit? Are you silently > assuming all presently loaded selectors are (still) within limits? On 32-bit Windows the KPCR's address is cached at gdtr_base + 0x30 while in ring3. In ring0 we can just use fs_base for that. At the moment I still just cache the KPCR location on every MOV-TO-CR3 but that became an issue with recent versions of Windows10 implementing Meltdown mitigations because it leads to extreme performance degradation in the guest (opening an app takes ~20s). So now I just try to find the KPCR based on the registers reported in each vm_event. We use the KPCR to quickly find thread/process base addresses to gather info relevant to introspection. Tamas Tamas ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
>>> On 02.05.19 at 15:42, wrote: > On 02/05/2019 14:16, Jan Beulich wrote: > On 02.05.19 at 14:59, wrote: >>> On 02/05/2019 13:29, Jan Beulich wrote: --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -1450,17 +1450,36 @@ static int assign_device(struct domain * if ( !iommu_enabled || !hd->platform_ops ) return 0; -/* Prevent device assign if mem paging or mem sharing have been - * enabled for this domain */ -if ( unlikely(d->arch.hvm.mem_sharing_enabled || - vm_event_check_ring(d->vm_event_paging) || +domain_lock(d); + +/* + * Prevent device assignment if any of + * - mem paging + * - mem sharing + * - the p2m_ram_ro type + * - global log-dirty mode >>> XenServer has working live migration with GPUs, which this change would >>> regress. >>> >>> Behind the scenes, we combine Xen's logdirty bitmap with one provided by >>> the GPU, to ensure that all dirty updates are tracked. >> But this says nothing for the patch here. > > Yes it does. Well, okay, then the wording of your reply plus me just adding a comment for a pre-existing check has mislead me. > There is nothing inherent about global log-dirty mode which is > incompatible with passthrough when the IOMMU pagetables are not shared > with EPT. > >> As long as it doesn't work in the staging tree, it should be prevented. > > ... but it does work. Note how (as said above) the patch does _not_ add any ->global_logdirty check, it merely adds a comment enumerating everything that's getting checked. I'm afraid I don't see how adding a comment to state how things are can regress anything. The only check the patch adds is that of the new p2m_ram_ro_used flag. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] mm/pgtable: Drop pgtable_t variable from pte_fn_t functions
Drop the pgtable_t variable from all implementation for pte_fn_t as none of them use it. apply_to_pte_range() should stop computing it as well. Should help us save some cycles. Signed-off-by: Anshuman Khandual Cc: Ard Biesheuvel Cc: Russell King Cc: Catalin Marinas Cc: Will Deacon Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Andrew Morton Cc: Michal Hocko Cc: Matthew Wilcox Cc: Logan Gunthorpe Cc: "Kirill A. Shutemov" Cc: Dan Williams Cc: Cc: Mike Rapoport Cc: x...@kernel.org Cc: linux-...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: xen-devel@lists.xenproject.org Cc: intel-...@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Cc: linux...@kvack.org --- - Boot tested on arm64 and x86 platforms. - Build tested on multiple platforms with their defconfig arch/arm/kernel/efi.c | 3 +-- arch/arm/mm/dma-mapping.c | 3 +-- arch/arm/mm/pageattr.c | 3 +-- arch/arm64/kernel/efi.c| 3 +-- arch/arm64/mm/pageattr.c | 3 +-- arch/x86/xen/mmu_pv.c | 3 +-- drivers/gpu/drm/i915/i915_mm.c | 3 +-- drivers/xen/gntdev.c | 6 ++ drivers/xen/privcmd.c | 6 ++ drivers/xen/xlate_mmu.c| 3 +-- include/linux/mm.h | 3 +-- mm/memory.c| 5 + mm/vmalloc.c | 2 +- 13 files changed, 15 insertions(+), 31 deletions(-) diff --git a/arch/arm/kernel/efi.c b/arch/arm/kernel/efi.c index 9f43ba012d10..b1f142a01f2f 100644 --- a/arch/arm/kernel/efi.c +++ b/arch/arm/kernel/efi.c @@ -11,8 +11,7 @@ #include #include -static int __init set_permissions(pte_t *ptep, pgtable_t token, - unsigned long addr, void *data) +static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data) { efi_memory_desc_t *md = data; pte_t pte = *ptep; diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 43f46aa7ef33..739286511a18 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -496,8 +496,7 @@ void __init dma_contiguous_remap(void) } } -static int __dma_update_pte(pte_t *pte, pgtable_t token, unsigned long addr, - void *data) +static int __dma_update_pte(pte_t *pte, unsigned long addr, void *data) { struct page *page = virt_to_page(addr); pgprot_t prot = *(pgprot_t *)data; diff --git a/arch/arm/mm/pageattr.c b/arch/arm/mm/pageattr.c index 1403cb4a0c3d..c8b500940e1f 100644 --- a/arch/arm/mm/pageattr.c +++ b/arch/arm/mm/pageattr.c @@ -22,8 +22,7 @@ struct page_change_data { pgprot_t clear_mask; }; -static int change_page_range(pte_t *ptep, pgtable_t token, unsigned long addr, - void *data) +static int change_page_range(pte_t *ptep, unsigned long addr, void *data) { struct page_change_data *cdata = data; pte_t pte = *ptep; diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 4f9acb5fbe97..230cff073a08 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -86,8 +86,7 @@ int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md) return 0; } -static int __init set_permissions(pte_t *ptep, pgtable_t token, - unsigned long addr, void *data) +static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data) { efi_memory_desc_t *md = data; pte_t pte = READ_ONCE(*ptep); diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c index 6cd645edcf35..0be077628b21 100644 --- a/arch/arm64/mm/pageattr.c +++ b/arch/arm64/mm/pageattr.c @@ -27,8 +27,7 @@ struct page_change_data { bool rodata_full __ro_after_init = IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED); -static int change_page_range(pte_t *ptep, pgtable_t token, unsigned long addr, - void *data) +static int change_page_range(pte_t *ptep, unsigned long addr, void *data) { struct page_change_data *cdata = data; pte_t pte = READ_ONCE(*ptep); diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index a21e1734fc1f..308a6195fd26 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2702,8 +2702,7 @@ struct remap_data { struct mmu_update *mmu_update; }; -static int remap_area_pfn_pte_fn(pte_t *ptep, pgtable_t token, -unsigned long addr, void *data) +static int remap_area_pfn_pte_fn(pte_t *ptep, unsigned long addr, void *data) { struct remap_data *rmd = data; pte_t pte = pte_mkspecial(mfn_pte(*rmd->pfn, rmd->prot)); diff --git a/drivers/gpu/drm/i915/i915_mm.c b/drivers/gpu/drm/i915/i915_mm.c index e4935dd1fd37..c23bb29e6d3e 100644 --- a/drivers/gpu/drm/i915/i915_mm.c +++ b/drivers/gpu/drm/i915/i915_mm.c @@ -35,8 +35,7 @@ struct remap_pfn { pgprot_t prot; }; -static int remap_pfn(pte_t *pte, pgtable_t token, -unsigned long addr, void *data) +static
Re: [Xen-devel] [PATCH] mm/pgtable: Drop pgtable_t variable from pte_fn_t functions
On 05/02/2019 07:16 PM, Matthew Wilcox wrote: > On Thu, May 02, 2019 at 06:48:46PM +0530, Anshuman Khandual wrote: >> Drop the pgtable_t variable from all implementation for pte_fn_t as none of >> them use it. apply_to_pte_range() should stop computing it as well. Should >> help us save some cycles. > You didn't add Martin Schwidefsky for some reason. He introduced scripts/get_maintainer.pl did not list the email but anyways I should have added it from git blame. Thanks for adding his email to the thread. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] Xen GCC coverage ARM64 testing - Unexpected Trap: Data Abort
Hi All, Please be aware that we have tried Xen ARM64 build with CONFIG_COVERAGE feature enabled. The build environment is next: Xen Versions tested: xen-4.12-stable, xen-4.13-staging Board: H3ULCB, R-Car H3 Ver.2.0 Poky: Yocto Project Reference Distro 2.4.2 Compiler: aarch64-poky-linux-gcc (Linaro GCC 7.2-2017.11) 7.2.1 Both Xen versions (4.12 and staging) return "Unexpected Trap: Data Abort" issue in case of 'xencov reset' or 'xencov read' calls: root@h3ulcb:~# xencov reset (XEN) Data Abort Trap. Syndrome=0x7 (XEN) Walking Hypervisor VA 0x361700 on CPU3 via TTBR 0x78266000 (XEN) 0TH[0x0] = 0x78265f7f (XEN) 1ST[0x0] = 0x78262f7f (XEN) 2ND[0x1] = 0x00407825ff7f (XEN) 3RD[0x161] = 0x0060781e1f7e (XEN) CPU3: Unexpected Trap: Data Abort Attaching the next log files (zipped in xen_with_config_coverage_logs.zip) with the details: - all the run-time exception details (rcarh3_config_coverage_trap.log); - xen package build log file with compilation options (compilation.log); - xen hypervisor .config file used for the build (xen_dot_config.log); Please share any comments or ideas about the issue. Thanks, Viktor Mitin <> ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/boot: Annotate the Real Mode entry points
On 02/05/2019 14:50, Jan Beulich wrote: On 02.05.19 at 15:27, wrote: >> --- a/xen/arch/x86/boot/trampoline.S >> +++ b/xen/arch/x86/boot/trampoline.S >> @@ -38,6 +38,12 @@ >> >> .code16 >> >> +/* >> + * do_boot_cpu() programs the Startup-IPI to point here. Due to the SIPI >> + * format, the relocated entrypoint must be 4k aligned. >> + * >> + * It is entered in Real Mode, with %cs = wakeup_start >> 4 and %ip = 0. >> + */ >> GLOBAL(trampoline_realmode_entry) > The reference to wakeup_start looks to be a copy-and-paste > (or alike) mistake here. Oops, indeed. Fixed. > >> --- a/xen/arch/x86/smpboot.c >> +++ b/xen/arch/x86/smpboot.c >> @@ -548,9 +548,12 @@ static int do_boot_cpu(int apicid, int cpu) >> >> booting_cpu = cpu; >> >> -/* start_eip had better be page-aligned! */ >> start_eip = setup_trampoline(); >> >> +/* start_eip needs be page aligned, and below the 1M boundary. */ >> +if ( start_eip & ~0xff000 ) >> +panic("AP trampoline %#lx not suitably positioned\n", start_eip); > Seeing what setup_trampoline() really does, I'm not > convinced a panic() is of much value. The page-alignment > should be possible to check at build time, and the below-1M > requirement should be guaranteed by us allocating low > memory space in the first place. Sadly it cant. We have a number of alignment issues (well - confusions at least), most obviously that trampoline_{start,end} in the linked Xen image has no particular alignment, but the relocated trampoline_start has a 4k requirement as a consequence of its alias with trampoline_realmode_entry. All it takes is one error in the 32bit asm which relocates the trampoline and this ends up exploding in a way which can only be detected at runtime. > Nevertheless I won't insist, > so with the earlier comment corrected > Reviewed-by: Jan Beulich Thanks, ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [RFC PATCH 2/2] xen/device-tree: Add ability to handle nodes with interrupts-extended prop
From: Oleksandr Tyshchenko Xen expects to see "interrupts" property when parsing host device-tree. But, there are cases when some device nodes contain "interrupts-extended" property instead. The good example here is arch timer node for R-Car Gen3/Gen2 family, which is mandatory device for Xen usage on ARM. And without ability to handle such nodes, Xen fails to operate: (XEN) (XEN) Panic on CPU 0: (XEN) Timer: Unable to retrieve IRQ 0 from the device tree (XEN) Signed-off-by: Oleksandr Tyshchenko --- xen/common/device_tree.c | 32 +--- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c index 65862b5..00ada6e 100644 --- a/xen/common/device_tree.c +++ b/xen/common/device_tree.c @@ -987,9 +987,19 @@ unsigned int dt_number_of_irq(const struct dt_device_node *device) const struct dt_device_node *p; const __be32 *intspec, *tmp; u32 intsize, intlen; +int intnum; dt_dprintk("dt_irq_number: dev=%s\n", device->full_name); +/* Try the new-style interrupts-extended first */ +intnum = dt_count_phandle_with_args(device, "interrupts-extended", +"#interrupt-cells"); +if ( intnum > 0 ) +{ +dt_dprintk(" intnum=%d\n", intnum); +return intnum; +} + /* Get the interrupts property */ intspec = dt_get_property(device, "interrupts", &intlen); if ( intspec == NULL ) @@ -1420,10 +1430,29 @@ int dt_device_get_raw_irq(const struct dt_device_node *device, const __be32 *intspec, *tmp, *addr; u32 intsize, intlen; int res = -EINVAL; +struct dt_phandle_args args; +int i; dt_dprintk("dt_device_get_raw_irq: dev=%s, index=%u\n", device->full_name, index); +/* Get the reg property (if any) */ +addr = dt_get_property(device, "reg", NULL); + +/* Try the new-style interrupts-extended first */ +res = dt_parse_phandle_with_args(device, "interrupts-extended", + "#interrupt-cells", index, &args); +if ( !res ) +{ +dt_dprintk(" intspec=%d intsize=%d\n", args.args[0], args.args_count); + +for ( i = 0; i < args.args_count; i++ ) +args.args[i] = cpu_to_be32(args.args[i]); + +return dt_irq_map_raw(args.np, args.args, args.args_count, + addr, out_irq); +} + /* Get the interrupts property */ intspec = dt_get_property(device, "interrupts", &intlen); if ( intspec == NULL ) @@ -1432,9 +1461,6 @@ int dt_device_get_raw_irq(const struct dt_device_node *device, dt_dprintk(" intspec=%d intlen=%d\n", be32_to_cpup(intspec), intlen); -/* Get the reg property (if any) */ -addr = dt_get_property(device, "reg", NULL); - /* Look for the interrupt parent. */ p = dt_irq_find_parent(device); if ( p == NULL ) -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [RFC PATCH 0/2] Add ability to handle nodes with interrupts-extended property
From: Oleksandr Tyshchenko Hello, all. The purpose of this small series is to add minimal required support for Xen to be able to handle device-tree nodes with "interrupts-extended" property [1]. The reason: Xen expects to see "interrupts" property when parsing host device-tree. But, there are cases when some device nodes contain "interrupts-extended" property instead. The good example here is arch timer node for R-Car Gen3/Gen2 family [2], which is mandatory device for Xen usage on ARM. And without ability to handle such nodes, Xen fails to operate: (XEN) (XEN) Panic on CPU 0: (XEN) Timer: Unable to retrieve IRQ 0 from the device tree (XEN) -- Preliminary tested on R-Car Gen3 based board. Log (with debug enabled) shows that Xen recognized arch timer interrupts represented with "interrupts-extended" property: timer { compatible = "arm,armv8-timer"; interrupts-extended = <&gic GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>, <&gic GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>, <&gic GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>, <&gic GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>; }; ... (XEN) dt_device_get_raw_irq: dev=/timer, index=0 (XEN) intspec=1 intsize=3 (XEN) dt_irq_map_raw: par=/soc/interrupt-controller@f101,intspec=[0x0001 0x000d...],ointsize=3 (XEN) dt_irq_map_raw: ipar=/soc/interrupt-controller@f101, size=3 (XEN) -> addrsize=0 (XEN) -> got it ! (XEN) dt_device_get_raw_irq: dev=/timer, index=1 (XEN) intspec=1 intsize=3 (XEN) dt_irq_map_raw: par=/soc/interrupt-controller@f101,intspec=[0x0001 0x000e...],ointsize=3 (XEN) dt_irq_map_raw: ipar=/soc/interrupt-controller@f101, size=3 (XEN) -> addrsize=0 (XEN) -> got it ! (XEN) dt_device_get_raw_irq: dev=/timer, index=2 (XEN) intspec=1 intsize=3 (XEN) dt_irq_map_raw: par=/soc/interrupt-controller@f101,intspec=[0x0001 0x000b...],ointsize=3 (XEN) dt_irq_map_raw: ipar=/soc/interrupt-controller@f101, size=3 (XEN) -> addrsize=0 (XEN) -> got it ! (XEN) dt_device_get_raw_irq: dev=/timer, index=3 (XEN) intspec=1 intsize=3 (XEN) dt_irq_map_raw: par=/soc/interrupt-controller@f101,intspec=[0x0001 0x000a...],ointsize=3 (XEN) dt_irq_map_raw: ipar=/soc/interrupt-controller@f101, size=3 (XEN) -> addrsize=0 (XEN) -> got it ! (XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 8333 KHz ... -- The first patch had Julien's R-B some time ago, but I dropped it. [1] https://elixir.bootlin.com/linux/v5.1-rc7/source/Documentation/devicetree/bindings/interrupt-controller/interrupts.txt [2] https://elixir.bootlin.com/linux/v5.1-rc7/source/arch/arm64/boot/dts/renesas/r8a7795.dtsi#L3185 https://elixir.bootlin.com/linux/v5.1-rc7/source/arch/arm/boot/dts/r8a7790.dtsi#L1856 Oleksandr Tyshchenko (2): xen/device-tree: Add dt_count_phandle_with_args helper xen/device-tree: Add ability to handle nodes with interrupts-extended prop xen/common/device_tree.c | 39 --- xen/include/xen/device_tree.h | 19 +++ 2 files changed, 55 insertions(+), 3 deletions(-) -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [RFC PATCH 1/2] xen/device-tree: Add dt_count_phandle_with_args helper
From: Oleksandr Tyshchenko Port Linux helper of_count_phandle_with_args for counting number of phandles in a property. Signed-off-by: Oleksandr Tyshchenko --- xen/common/device_tree.c | 7 +++ xen/include/xen/device_tree.h | 19 +++ 2 files changed, 26 insertions(+) diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c index 8fc401d..65862b5 100644 --- a/xen/common/device_tree.c +++ b/xen/common/device_tree.c @@ -1663,6 +1663,13 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np, index, out_args); } +int dt_count_phandle_with_args(const struct dt_device_node *np, + const char *list_name, + const char *cells_name) +{ +return __dt_parse_phandle_with_args(np, list_name, cells_name, 0, -1, NULL); +} + /** * unflatten_dt_node - Alloc and populate a device_node from the flat tree * @fdt: The parent device tree blob diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h index 7408a6c..8315629 100644 --- a/xen/include/xen/device_tree.h +++ b/xen/include/xen/device_tree.h @@ -738,6 +738,25 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np, const char *cells_name, int index, struct dt_phandle_args *out_args); +/** + * dt_count_phandle_with_args() - Find the number of phandles references in a property + * @np: pointer to a device tree node containing a list + * @list_name: property name that contains a list + * @cells_name: property name that specifies phandles' arguments count + * + * Returns the number of phandle + argument tuples within a property. It + * is a typical pattern to encode a list of phandle and variable + * arguments into a single property. The number of arguments is encoded + * by a property in the phandle-target node. For example, a gpios + * property would contain a list of GPIO specifies consisting of a + * phandle and 1 or more arguments. The number of arguments are + * determined by the #gpio-cells property in the node pointed to by the + * phandle. + */ +int dt_count_phandle_with_args(const struct dt_device_node *np, + const char *list_name, + const char *cells_name); + #ifdef CONFIG_DEVICE_TREE_DEBUG #define dt_dprintk(fmt, args...) \ printk(XENLOG_DEBUG fmt, ## args) -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm/pgtable: Drop pgtable_t variable from pte_fn_t functions
On Thu, 2 May 2019 06:46:23 -0700 Matthew Wilcox wrote: > On Thu, May 02, 2019 at 06:48:46PM +0530, Anshuman Khandual wrote: > > Drop the pgtable_t variable from all implementation for pte_fn_t as none of > > them use it. apply_to_pte_range() should stop computing it as well. Should > > help us save some cycles. > > You didn't add Martin Schwidefsky for some reason. He introduced > it originally for s390 for sub-page page tables back in 2008 (commit > 2f569afd9c). I think he should confirm that he no longer needs it. With its 2K pte tables s390 can not deal with a (struct page *) as a reference to a page table. But if there are no user of the apply_to_page_range() API left which actually make use of the token argument we can safely drop it. > > --- > > - Boot tested on arm64 and x86 platforms. > > - Build tested on multiple platforms with their defconfig > > > > arch/arm/kernel/efi.c | 3 +-- > > arch/arm/mm/dma-mapping.c | 3 +-- > > arch/arm/mm/pageattr.c | 3 +-- > > arch/arm64/kernel/efi.c| 3 +-- > > arch/arm64/mm/pageattr.c | 3 +-- > > arch/x86/xen/mmu_pv.c | 3 +-- > > drivers/gpu/drm/i915/i915_mm.c | 3 +-- > > drivers/xen/gntdev.c | 6 ++ > > drivers/xen/privcmd.c | 6 ++ > > drivers/xen/xlate_mmu.c| 3 +-- > > include/linux/mm.h | 3 +-- > > mm/memory.c| 5 + > > mm/vmalloc.c | 2 +- > > 13 files changed, 15 insertions(+), 31 deletions(-) > > > > diff --git a/arch/arm/kernel/efi.c b/arch/arm/kernel/efi.c > > index 9f43ba012d10..b1f142a01f2f 100644 > > --- a/arch/arm/kernel/efi.c > > +++ b/arch/arm/kernel/efi.c > > @@ -11,8 +11,7 @@ > > #include > > #include > > > > -static int __init set_permissions(pte_t *ptep, pgtable_t token, > > - unsigned long addr, void *data) > > +static int __init set_permissions(pte_t *ptep, unsigned long addr, void > > *data) > > { > > efi_memory_desc_t *md = data; > > pte_t pte = *ptep; > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > > index 43f46aa7ef33..739286511a18 100644 > > --- a/arch/arm/mm/dma-mapping.c > > +++ b/arch/arm/mm/dma-mapping.c > > @@ -496,8 +496,7 @@ void __init dma_contiguous_remap(void) > > } > > } > > > > -static int __dma_update_pte(pte_t *pte, pgtable_t token, unsigned long > > addr, > > - void *data) > > +static int __dma_update_pte(pte_t *pte, unsigned long addr, void *data) > > { > > struct page *page = virt_to_page(addr); > > pgprot_t prot = *(pgprot_t *)data; > > diff --git a/arch/arm/mm/pageattr.c b/arch/arm/mm/pageattr.c > > index 1403cb4a0c3d..c8b500940e1f 100644 > > --- a/arch/arm/mm/pageattr.c > > +++ b/arch/arm/mm/pageattr.c > > @@ -22,8 +22,7 @@ struct page_change_data { > > pgprot_t clear_mask; > > }; > > > > -static int change_page_range(pte_t *ptep, pgtable_t token, unsigned long > > addr, > > - void *data) > > +static int change_page_range(pte_t *ptep, unsigned long addr, void *data) > > { > > struct page_change_data *cdata = data; > > pte_t pte = *ptep; > > diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c > > index 4f9acb5fbe97..230cff073a08 100644 > > --- a/arch/arm64/kernel/efi.c > > +++ b/arch/arm64/kernel/efi.c > > @@ -86,8 +86,7 @@ int __init efi_create_mapping(struct mm_struct *mm, > > efi_memory_desc_t *md) > > return 0; > > } > > > > -static int __init set_permissions(pte_t *ptep, pgtable_t token, > > - unsigned long addr, void *data) > > +static int __init set_permissions(pte_t *ptep, unsigned long addr, void > > *data) > > { > > efi_memory_desc_t *md = data; > > pte_t pte = READ_ONCE(*ptep); > > diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c > > index 6cd645edcf35..0be077628b21 100644 > > --- a/arch/arm64/mm/pageattr.c > > +++ b/arch/arm64/mm/pageattr.c > > @@ -27,8 +27,7 @@ struct page_change_data { > > > > bool rodata_full __ro_after_init = > > IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED); > > > > -static int change_page_range(pte_t *ptep, pgtable_t token, unsigned long > > addr, > > - void *data) > > +static int change_page_range(pte_t *ptep, unsigned long addr, void *data) > > { > > struct page_change_data *cdata = data; > > pte_t pte = READ_ONCE(*ptep); > > diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c > > index a21e1734fc1f..308a6195fd26 100644 > > --- a/arch/x86/xen/mmu_pv.c > > +++ b/arch/x86/xen/mmu_pv.c > > @@ -2702,8 +2702,7 @@ struct remap_data { > > struct mmu_update *mmu_update; > > }; > > > > -static int remap_area_pfn_pte_fn(pte_t *ptep, pgtable_t token, > > -unsigned long addr, void *data) > > +static int remap_area_pfn_pte_fn(pte_t *ptep, unsigned long addr, void > > *data) > > { > > struct remap_data *rmd = data; > > pte_t pte = pte_mkspecial(mfn_
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 02 May 2019 14:57 > To: Andrew Cooper > Cc: Paul Durrant ; Roger Pau Monne > ; Wei Liu > ; George Dunlap ; xen-devel > de...@lists.xenproject.org> > Subject: Re: [PATCH] x86/HVM: p2m_ram_ro is incompatible with device > pass-through > > >>> On 02.05.19 at 15:42, wrote: > > On 02/05/2019 14:16, Jan Beulich wrote: > > On 02.05.19 at 14:59, wrote: > >>> On 02/05/2019 13:29, Jan Beulich wrote: > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -1450,17 +1450,36 @@ static int assign_device(struct domain * > if ( !iommu_enabled || !hd->platform_ops ) > return 0; > > -/* Prevent device assign if mem paging or mem sharing have been > - * enabled for this domain */ > -if ( unlikely(d->arch.hvm.mem_sharing_enabled || > - vm_event_check_ring(d->vm_event_paging) || > +domain_lock(d); > + > +/* > + * Prevent device assignment if any of > + * - mem paging > + * - mem sharing > + * - the p2m_ram_ro type > + * - global log-dirty mode > >>> XenServer has working live migration with GPUs, which this change would > >>> regress. > >>> > >>> Behind the scenes, we combine Xen's logdirty bitmap with one provided by > >>> the GPU, to ensure that all dirty updates are tracked. > >> But this says nothing for the patch here. > > > > Yes it does. > > Well, okay, then the wording of your reply plus me just adding > a comment for a pre-existing check has mislead me. > > > There is nothing inherent about global log-dirty mode which is > > incompatible with passthrough when the IOMMU pagetables are not shared > > with EPT. > > > >> As long as it doesn't work in the staging tree, it should be prevented. > > > > ... but it does work. > > Note how (as said above) the patch does _not_ add any > ->global_logdirty check, it merely adds a comment enumerating > everything that's getting checked. I'm afraid I don't see how > adding a comment to state how things are can regress anything. > > The only check the patch adds is that of the new > p2m_ram_ro_used flag. > Actually, since global_logdirty is somewhat transient should we not also have a check to prevent p2m_ram_logdirty from being set for a domain with assigned devices and shared EPT? Paul > Jan > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
>>> On 02.05.19 at 16:12, wrote: > Actually, since global_logdirty is somewhat transient should we not also > have a check to prevent p2m_ram_logdirty from being set for a domain with > assigned devices and shared EPT? Probably (if we indeed don't), but imo not in this patch. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: p2m_ram_ro is incompatible with device pass-through
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 02 May 2019 15:25 > To: Paul Durrant > Cc: Andrew Cooper ; George Dunlap > ; Roger Pau > Monne ; Wei Liu ; xen-devel de...@lists.xenproject.org> > Subject: RE: [PATCH] x86/HVM: p2m_ram_ro is incompatible with device > pass-through > > >>> On 02.05.19 at 16:12, wrote: > > Actually, since global_logdirty is somewhat transient should we not also > > have a check to prevent p2m_ram_logdirty from being set for a domain with > > assigned devices and shared EPT? > > Probably (if we indeed don't), but imo not in this patch. > Yes, fair enough. Paul > Jan > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2] x86: suppress XPTI-related TLB flushes when possible
When there's no XPTI-enabled PV domain at all, there's no need to issue respective TLB flushes. Hardwire opt_xpti_* to false when !PV, and record the creation of PV domains by bumping opt_xpti_* accordingly. As to the sticky opt_xpti_domu vs increment/decrement of opt_xpti_hwdom, this is done this way to avoid (a) widening the former variable, (b) any risk of a missed flush, which would result in an XSA if a DomU was able to exercise it, and (c) any races updating the variable. Fundamentally the TLB flush done when context switching out the domain's vCPU-s the last time before destroying the domain ought to be sufficient, so in principle DomU handling could be made match hwdom's. Signed-off-by: Jan Beulich --- v2: Add comment to spec_ctrl.h. Explain difference in accounting of DomU and hwdom. --- TBD: The hardwiring to false could be extended to opt_pv_l1tf_* and (for !HVM) opt_l1d_flush as well. --- a/xen/arch/x86/flushtlb.c +++ b/xen/arch/x86/flushtlb.c @@ -218,7 +218,7 @@ unsigned int flush_area_local(const void */ invpcid_flush_one(PCID_PV_PRIV, addr); invpcid_flush_one(PCID_PV_USER, addr); -if ( opt_xpti_hwdom || opt_xpti_domu ) +if ( opt_xpti_hwdom > 1 || opt_xpti_domu > 1 ) { invpcid_flush_one(PCID_PV_PRIV | PCID_PV_XPTI, addr); invpcid_flush_one(PCID_PV_USER | PCID_PV_XPTI, addr); --- a/xen/arch/x86/pv/domain.c +++ b/xen/arch/x86/pv/domain.c @@ -272,6 +272,9 @@ void pv_domain_destroy(struct domain *d) destroy_perdomain_mapping(d, GDT_LDT_VIRT_START, GDT_LDT_MBYTES << (20 - PAGE_SHIFT)); +opt_xpti_hwdom -= IS_ENABLED(CONFIG_LATE_HWDOM) && + !d->domain_id && opt_xpti_hwdom; + XFREE(d->arch.pv.cpuidmasks); FREE_XENHEAP_PAGE(d->arch.pv.gdt_ldt_l1tab); @@ -310,7 +313,16 @@ int pv_domain_initialise(struct domain * /* 64-bit PV guest by default. */ d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; -d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom : opt_xpti_domu; +if ( is_hardware_domain(d) && opt_xpti_hwdom ) +{ +d->arch.pv.xpti = true; +++opt_xpti_hwdom; +} +if ( !is_hardware_domain(d) && opt_xpti_domu ) +{ +d->arch.pv.xpti = true; +opt_xpti_domu = 2; +} if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid ) switch ( ACCESS_ONCE(opt_pcid) ) --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -80,10 +80,12 @@ static int __init parse_spec_ctrl(const opt_eager_fpu = 0; +#ifdef CONFIG_PV if ( opt_xpti_hwdom < 0 ) opt_xpti_hwdom = 0; if ( opt_xpti_domu < 0 ) opt_xpti_domu = 0; +#endif if ( opt_smt < 0 ) opt_smt = 1; @@ -634,6 +636,7 @@ static __init void l1tf_calculations(uin : (3ul << (paddr_bits - 2; } +#ifdef CONFIG_PV int8_t __read_mostly opt_xpti_hwdom = -1; int8_t __read_mostly opt_xpti_domu = -1; @@ -700,6 +703,9 @@ static __init int parse_xpti(const char return rc; } custom_param("xpti", parse_xpti); +#else /* !CONFIG_PV */ +# define xpti_init_default(caps) ((void)(caps)) +#endif /* CONFIG_PV */ void __init init_speculation_mitigations(void) { --- a/xen/include/asm-x86/spec_ctrl.h +++ b/xen/include/asm-x86/spec_ctrl.h @@ -43,7 +43,18 @@ extern bool bsp_delay_spec_ctrl; extern uint8_t default_xen_spec_ctrl; extern uint8_t default_spec_ctrl_flags; +#ifdef CONFIG_PV +/* + * Values -1, 0, and 1 have the usual meaning of "not established yet", + * "disabled", and "enabled". Values larger than 1 indicate there's actually + * at least one such domain (or there has been). This way XPTI-specific TLB + * flushes can be avoided when no XPTI-enabled domain is/was active. + */ extern int8_t opt_xpti_hwdom, opt_xpti_domu; +#else +# define opt_xpti_hwdom false +# define opt_xpti_domu false +#endif extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [VOTE] tagging for operational messages sent to xen-devel@ (was Re: Xen 4.13 Development Update)
On 01/05/2019, 12:56, "Rich Persaud" wrote: > On May 1, 2019, at 14:37, Lars Kurth wrote: > > Rich, > as nobody replied to the mail, I am inclined to dismiss the proposal of ANN for now > Lars What do you think about the suggestion to apply a tag ("ANNOUNCE"?) for emails that are mirrored to xen-devel from the -announce mailing list? Jan commented on that suggestion in a separate thread. I think that's fine, although we may have to change how the announce list works, otherwise on that list we get "[xen-announce] [ANNOUNCE] ..." if messages are cross posted, which would be odd and will surely annoy some people Lars ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v3 1/3] x86/mm: Introduce altp2m_get_gfn_type_access
>>> On 09.04.19 at 14:03, wrote: > --- a/xen/arch/x86/mm/mem_access.c > +++ b/xen/arch/x86/mm/mem_access.c > @@ -265,31 +265,27 @@ int p2m_set_altp2m_mem_access(struct domain *d, struct > p2m_domain *hp2m, > unsigned int page_order; > unsigned long gfn_l = gfn_x(gfn); > int rc; > +bool copied_from_hostp2m; > > -mfn = ap2m->get_entry(ap2m, gfn, &t, &old_a, 0, NULL, NULL); > +mfn = altp2m_get_gfn_type_access(ap2m, gfn, &t, &old_a, &page_order, > &copied_from_hostp2m); Long line (also elsewhere). Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v3 2/3] x86/mm: Introduce altp2m_set_entry_by_page_order
>>> On 09.04.19 at 14:03, wrote: > --- a/xen/arch/x86/mm/mem_access.c > +++ b/xen/arch/x86/mm/mem_access.c > @@ -279,7 +279,7 @@ int p2m_set_altp2m_mem_access(struct domain *d, struct > p2m_domain *hp2m, > gfn_t gfn2 = _gfn(gfn_l & mask); > mfn_t mfn2 = _mfn(mfn_x(mfn) & mask); > > -/* Note: currently it is not safe to remap to a shared entry */ > + /* Note: currently it is not safe to remap to a shared entry */ Stray and bad (hard tab) change. But you've been meaning to re-send this anyway. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen GCC coverage ARM64 testing - Unexpected Trap: Data Abort
Adding Xen maintainers to this email CC. Thanks On Thu, May 2, 2019 at 5:08 PM Viktor Mitin wrote: > > Hi All, > > Please be aware that we have tried Xen ARM64 build with > CONFIG_COVERAGE feature enabled. The build environment is next: > Xen Versions tested: xen-4.12-stable, xen-4.13-staging > Board: H3ULCB, R-Car H3 Ver.2.0 > Poky: Yocto Project Reference Distro 2.4.2 > Compiler: aarch64-poky-linux-gcc (Linaro GCC 7.2-2017.11) 7.2.1 > > Both Xen versions (4.12 and staging) return "Unexpected Trap: Data > Abort" issue in case of 'xencov reset' or 'xencov read' calls: > > root@h3ulcb:~# xencov reset > (XEN) Data Abort Trap. Syndrome=0x7 > (XEN) Walking Hypervisor VA 0x361700 on CPU3 via TTBR 0x78266000 > (XEN) 0TH[0x0] = 0x78265f7f > (XEN) 1ST[0x0] = 0x78262f7f > (XEN) 2ND[0x1] = 0x00407825ff7f > (XEN) 3RD[0x161] = 0x0060781e1f7e > (XEN) CPU3: Unexpected Trap: Data Abort > > Attaching the next log files (zipped in > xen_with_config_coverage_logs.zip) with the details: > - all the run-time exception details (rcarh3_config_coverage_trap.log); > - xen package build log file with compilation options (compilation.log); > - xen hypervisor .config file used for the build (xen_dot_config.log); > > Please share any comments or ideas about the issue. > > Thanks, > Viktor Mitin ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] Ping: [PATCH] AMD/IOMMU: adjust IOMMU list head initialization
>>> On 10.04.19 at 11:37, wrote: > Do this statically, which will allow accessing the (empty) list even > without having come through acpi_ivrs_init(). > > Signed-off-by: Jan Beulich > > --- a/xen/drivers/passthrough/amd/iommu_init.c > +++ b/xen/drivers/passthrough/amd/iommu_init.c > @@ -36,7 +36,7 @@ static struct tasklet amd_iommu_irq_task > unsigned int __read_mostly ivrs_bdf_entries; > u8 __read_mostly ivhd_type; > static struct radix_tree_root ivrs_maps; > -struct list_head amd_iommu_head; > +LIST_HEAD_READ_MOSTLY(amd_iommu_head); > struct table_struct device_table; > bool_t iommuv2_enabled; > > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -150,8 +150,6 @@ static void amd_iommu_setup_domain_devic > > int __init acpi_ivrs_init(void) > { > -INIT_LIST_HEAD(&amd_iommu_head); > - > if ( !iommu_enable && !iommu_intremap ) > return 0; > > > > > ___ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] Ping: [PATCH] AMD/IOMMU: don't open-code for_each_amd_iommu()
>>> On 05.04.19 at 09:01, wrote: > Signed-off-by: Jan Beulich > > --- a/xen/drivers/passthrough/amd/iommu_intr.c > +++ b/xen/drivers/passthrough/amd/iommu_intr.c > @@ -503,7 +503,7 @@ static struct amd_iommu *_find_iommu_for > { > struct amd_iommu *iommu; > > -list_for_each_entry ( iommu, &amd_iommu_head, list ) > +for_each_amd_iommu ( iommu ) > if ( iommu->seg == seg && iommu->bdf == bdf ) > return NULL; > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 01/10] xen: add a p2mt parameter to map_mmio_regions
>>> On 30.04.19 at 23:02, wrote: > --- a/xen/arch/x86/hvm/dom0_build.c > +++ b/xen/arch/x86/hvm/dom0_build.c > @@ -79,8 +79,11 @@ static int __init modify_identity_mmio(struct domain *d, > unsigned long pfn, > > for ( ; ; ) > { > -rc = map ? map_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn)) > - : unmap_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn)); > +if ( map ) > +rc = map_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn), > + p2m_mmio_direct); > +else > +rc = unmap_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn)); May I ask that you leave alone the use of the conditional operator here, and _just_ add the new argument? > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -2264,12 +2264,16 @@ static unsigned int mmio_order(const struct domain *d, > int map_mmio_regions(struct domain *d, > gfn_t start_gfn, > unsigned long nr, > - mfn_t mfn) > + mfn_t mfn, > + p2m_type_t p2mt) > { > int ret = 0; > unsigned long i; > unsigned int iter, order; > > +if ( p2mt != p2m_mmio_direct ) > +return -EOPNOTSUPP; Considering this and ... > --- a/xen/common/domctl.c > +++ b/xen/common/domctl.c > @@ -927,6 +927,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) > u_domctl) > unsigned long nr_mfns = op->u.memory_mapping.nr_mfns; > unsigned long mfn_end = mfn + nr_mfns - 1; > int add = op->u.memory_mapping.add_mapping; > +p2m_type_t p2mt; > > ret = -EINVAL; > if ( mfn_end < mfn || /* wrap? */ > @@ -939,6 +940,10 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) > u_domctl) > /* Must break hypercall up as this could take a while. */ > if ( nr_mfns > 64 ) > break; > + > +p2mt = p2m_mmio_direct_dev; > +#else > +p2mt = p2m_mmio_direct; > #endif ... this, is there really value in adding the new parameter for x86? A wrapper macro of the same name could be used to strip the new last argument at all call sites (current and future ones). Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 02/10] xen: rename un/map_mmio_regions to un/map_regions
>>> On 30.04.19 at 23:02, wrote: > Now that map_mmio_regions takes a p2mt parameter, there is no need to > keep "mmio" in the name. The p2mt parameter does a better job at > expressing what the mapping is about. Let's save the environment 5 > characters at a time. But as per the cover letter the purpose is to allow mapping iomem (which I take is just an alternative term for MMIO). Even if that's misleading, {,un}map_regions() is a little too unspecific for my taste. At which point at least the environment saving argument goes away ;-) As to the series as a whole, I guess you first want to come to an agreement with Julien. Only then it'll make sense to actually review the changes, I think. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 03/10] xen: extend XEN_DOMCTL_memory_mapping to handle memory policy
>>> On 30.04.19 at 23:02, wrote: > --- a/xen/include/public/domctl.h > +++ b/xen/include/public/domctl.h > @@ -571,12 +571,24 @@ struct xen_domctl_bind_pt_irq { > */ > #define DPCI_ADD_MAPPING 1 > #define DPCI_REMOVE_MAPPING 0 > +/* > + * Default memory policy. Corresponds to: > + * Arm: MEMORY_POLICY_ARM_DEV_nGRE > + * x86: MEMORY_POLICY_X86_UC > + */ > +#define MEMORY_POLICY_DEFAULT0 > +/* x86 only. Memory type UNCACHABLE */ > +#define MEMORY_POLICY_X86_UC 0 I'm afraid this may end up misleading, as on NPT and in shadow mode we use UC- instead of UC afaics. Andrew, do you have an opinion either way what exactly should be stated here? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/boot: Annotate the Real Mode entry points
On Thu, 2019-05-02 at 15:08 +0100, Andrew Cooper wrote: > Sadly it cant. > > We have a number of alignment issues (well - confusions at least), most > obviously that trampoline_{start,end} in the linked Xen image has no > particular alignment, but the relocated trampoline_start has a 4k > requirement as a consequence of its alias with trampoline_realmode_entry. > > All it takes is one error in the 32bit asm which relocates the > trampoline and this ends up exploding in a way which can only be > detected at runtime. I'm relocating the permanent trampoline from 64-bit C code now, not in assembly. In the no-real-mode case (ignoring reloc(), qv) nothing is put into low memory until __start_xen() calls relocate_trampoline(). smime.p7s Description: S/MIME cryptographic signature ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen GCC coverage ARM64 testing - Unexpected Trap: Data Abort
Hi, On 5/2/19 3:50 PM, Viktor Mitin wrote: Adding Xen maintainers to this email CC. Thanks On Thu, May 2, 2019 at 5:08 PM Viktor Mitin wrote: Hi All, Please be aware that we have tried Xen ARM64 build with CONFIG_COVERAGE feature enabled. The build environment is next: Xen Versions tested: xen-4.12-stable, xen-4.13-staging Board: H3ULCB, R-Car H3 Ver.2.0 Poky: Yocto Project Reference Distro 2.4.2 Compiler: aarch64-poky-linux-gcc (Linaro GCC 7.2-2017.11) 7.2.1 Both Xen versions (4.12 and staging) return "Unexpected Trap: Data Abort" issue in case of 'xencov reset' or 'xencov read' calls: root@h3ulcb:~# xencov reset (XEN) Data Abort Trap. Syndrome=0x7 Per the value, the syndrome is invalid. As I will not open a zip (see below why), could you post the full stack trace? (XEN) Walking Hypervisor VA 0x361700 on CPU3 via TTBR 0x78266000 (XEN) 0TH[0x0] = 0x78265f7f (XEN) 1ST[0x0] = 0x78262f7f (XEN) 2ND[0x1] = 0x00407825ff7f (XEN) 3RD[0x161] = 0x0060781e1f7e (XEN) CPU3: Unexpected Trap: Data Abort Attaching the next log files (zipped in xen_with_config_coverage_logs.zip) with the details: Please don't send a 54KB attachment on the mailing list. This is using up space for every one on the ML. Instead you should upload somewhere (e.g pastebin). But I am afraid, I am not going to open any archive sent on the mailing list. Please upload file separately. However - all the run-time exception details (rcarh3_config_coverage_trap.log); - xen package build log file with compilation options (compilation.log); This is not necessary. - xen hypervisor .config file used for the build (xen_dot_config.log); Please share any comments or ideas about the issue. GCOV on Arm has never been tested. So it might be possible there might be some issues with it. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-mainline test] 135448: regressions - FAIL
flight 135448 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/135448/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i386-xsm6 xen-buildfail REGR. vs. 135251 build-amd64-xsm 6 xen-buildfail REGR. vs. 135251 build-arm64-xsm 6 xen-buildfail REGR. vs. 135251 build-amd64 6 xen-buildfail REGR. vs. 135251 build-arm64 6 xen-buildfail REGR. vs. 135251 build-armhf 6 xen-buildfail REGR. vs. 135251 build-i3866 xen-buildfail REGR. vs. 135251 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-amd64-qemuu-nested-intel 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a test-amd64-amd64-qemuu-nested-amd 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-pvshim 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit1 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit1 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a test-amd64-amd64-amd64-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-shadow1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvshim1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-amd64-pygrub 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-amd64-xl-credit1 1 build-check(1) blocked n/a test-amd64-i386-xl-shadow 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win10-i386 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvhv2-intel 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win10-i386 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-p
[Xen-devel] [PATCH] tools/Makefile: Fix build of QEMU, remove --source-path
Following QEMU's commit 79d77bcd36 (configure: Remove --source-path option), Xen's build system fails to build qemu-xen. The --source-path option gives redundant information about the location of the sources so simply remove it. (configure already looks at its $0 to find the source-path.) Signed-off-by: Anthony PERARD --- This patch would unblock the qemu-mainline branch in osstest. --- tools/Makefile | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/Makefile b/tools/Makefile index c903d6a63e..99cbc950dc 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -246,7 +246,6 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find --prefix=$(LIBEXEC) \ --libdir=$(LIBEXEC_LIB) \ --includedir=$(LIBEXEC_INC) \ - --source-path=$$source \ --extra-cflags="-DXC_WANT_COMPAT_EVTCHN_API=1 \ -DXC_WANT_COMPAT_GNTTAB_API=1 \ -DXC_WANT_COMPAT_MAP_FOREIGN_API=1 \ -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-upstream-4.10-testing test] 135444: regressions - FAIL
flight 135444 qemu-upstream-4.10-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/135444/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 6 xen-buildfail REGR. vs. 124921 build-amd64-xsm 6 xen-buildfail REGR. vs. 124921 build-i3866 xen-buildfail REGR. vs. 124921 build-i386-xsm6 xen-buildfail REGR. vs. 124921 Tests which did not succeed, but are not blocking: test-amd64-amd64-qemuu-nested-intel 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a test-amd64-amd64-amd64-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a test-amd64-amd64-pygrub 1 build-check(1) blocked n/a test-amd64-amd64-xl-rtds 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-raw1 build-check(1) blocked n/a test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit1 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win10-i386 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a test-amd64-amd64-i386-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-shadow1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvhv2-intel 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win10-i386 1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-amd64-xl-pvhv2-amd 1 build-check(1) blocked n/a test-amd64-i386-xl-shadow 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-qemuu-nested-amd 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a build-amd64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-check
Re: [Xen-devel] Xen GCC coverage ARM64 testing - Unexpected Trap: Data Abort
Hi Julien, Please find trace log below: root@h3ulcb:~# xencov reset (XEN) Data Abort Trap. Syndrome=0x7 (XEN) Walking Hypervisor VA 0x361700 on CPU3 via TTBR 0x78266000 (XEN) 0TH[0x0] = 0x78265f7f (XEN) 1ST[0x0] = 0x78262f7f (XEN) 2ND[0x1] = 0x00407825ff7f (XEN) 3RD[0x161] = 0x0060781e1f7e (XEN) CPU3: Unexpected Trap: Data Abort (XEN) [ Xen-4.13-unstable arm64 debug=y Not tainted ] (XEN) CPU:3 (XEN) PC: 0027bd10 gcov_info_reset+0/0x100 (XEN) LR: 0027bbd8 (XEN) SP: 80037ff07c80 (XEN) CPSR: 6249 MODE:64-bit EL2h (Hypervisor, handler) (XEN)X0: 00361698 X1: 003b0d80 X2: 0047 (XEN)X3: 0020 X4: X5: 0040 (XEN)X6: 003f X7: X8: 003b2e40 (XEN)X9: X10: X11: (XEN) X12: X13: X14: (XEN) X15: a12a5d00 X16: 0023 X17: a12670a0 (XEN) X18: 0530 X19: 003b0ba8 X20: 00361698 (XEN) X21: 003b0c18 X22: 003fb820 X23: 0031e698 (XEN) X24: c598aac0 X25: 0124 X26: 001d (XEN) X27: 08b11000 X28: 8006e98e8000 FP: 0e8abd50 (XEN) (XEN) VTCR_EL2: 80043594 (XEN) VTTBR_EL2: 000100073ffc5000 (XEN) (XEN) SCTLR_EL2: 30cd183d (XEN) HCR_EL2: 8078663f (XEN) TTBR0_EL2: 78266000 (XEN) (XEN) ESR_EL2: 9607 (XEN) HPFAR_EL2: (XEN) FAR_EL2: 00361700 (XEN) (XEN) Xen stack trace from sp=80037ff07c80: (XEN) 003b0ba8 a1451010 003ad558 0027b674 (XEN) a1451010 0026b1ac a1451010 (XEN) 0026a280 0026a1c4 80037ff07eb0 003f44f0 (XEN) 003f4c80 80037ff07f30 5a000ea1 c598aac0 (XEN) 0124 0028aa5c 80037ffc7000 003b8a78 (XEN) 002a8cd8 00097e6cf99f 0028b12c 0028b090 (XEN) 003b8f38 80037feed000 80037ffc7000 0028baec (XEN) 80037ffc7000 80037feed000 003a6a28 0025d898 (XEN) 003a8700 0003 003acdd0 0003 (XEN) 80037ffd8308 002b0cc0 003f5118 00120014 (XEN) 0002 (XEN) (XEN) (XEN) (XEN) 5a000ea1 80037ff07eb0 08b11000 2145 (XEN) 002abd64 09169000 8006edb15668 8006ec193c00 (XEN) 80037ff07fb8 80037feeddf0 002c342c 09169000 (XEN) 002c3430 939300476249 a1451010 (XEN) (XEN) 0e8abe00 0200 (XEN) 0101010101010101 (XEN) a12b2d98 a12a5d00 0023 a12670a0 (XEN) 0530 8006edb15668 8006ec193c00 8006ec193c00 (XEN) c598aac0 00305000 c598aac0 0124 (XEN) 001d 08b11000 8006e98e8000 0e8abd50 (XEN) 08551a10 080c1eec 5a000ea12145 (XEN) 8000 8006e98e8000 (XEN) 0e8abd50 a13664dc (XEN) Xen call trace: (XEN) [<0027bd10>] gcov_info_reset+0/0x100 (PC) (XEN) [<0027bbd8>] gcov.c#gcov_reset_all_counters+0x3c/0x78 (LR) (XEN) (XEN) (XEN) (XEN) Panic on CPU 3: (XEN) CPU3: Unexpected Trap: Data Abort (XEN) (XEN) (XEN) Reboot in five seconds... (XEN) (XEN) (XEN) Panic on CPU 0: (XEN) PSCI cpu off failed for CPU0 err=-3 (XEN) (XEN) (XEN) Reboot in five seconds... Thanks On Thu, May 2, 2019 at 6:17 PM Julien Grall wrote: > > Hi, > > On 5/2/19 3:50 PM, Viktor Mitin wrote: > > Adding Xen maintainers to this email CC. > > > > Thanks > > > > On Thu, May 2, 2019 at 5:08 PM Viktor Mitin > > wrote: > >> > >> Hi All, > >> > >> Please be aware that we have tried Xen ARM64 build with > >> CONFIG_COVERAGE feature enabled. The build environment is next: > >> Xen Versions tested: xen-4.12-stable, xen-4.13-staging > >> Board: H3ULCB, R-Car H3 Ver.2.0 > >> Poky: Yocto Project Reference Distro 2.4.2 > >> Compiler: aarch64-poky-linux-gcc (Linaro GCC 7.2-2017.11) 7.2.1 > >> > >> Both Xen versions (4.12 and staging) return "Unexpected Trap: Data > >> A
[Xen-devel] [PATCH V5 0/4] Renesas Stout board support (R-Car Gen2)
From: Oleksandr Tyshchenko Hi, all. The purpose of this patch series is to add required support to be able to run Xen on Renesas Stout board [1] which uses SCIFA compatible UART as a console interface. Actually Xen already has support for SCIF compatible UARTs which are used on Renesas Lager (R-Car Gen2), Salvator-X, H3ULCB/M3ULCB (R-Car Gen3) and other development boards. So this patch series extends existing support to be able to handle both interfaces. -- Current patch series is based on the following commit 1c6504163595d45e47a01750318c2b7b50541cbe and tested on Stout (ARM32) and H3ULCB (ARM64) boards. You can find current patch series here: repo: https://github.com/otyshchenko1/xen.git branch: stout_upstream3 You can find previous discussions here: [V1] https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg21058.html [V2] https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg37518.html [V3] https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg42493.html [V4] https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg43332.html -- In order to run Xen on Stout board you need "PSCI-enabled" U-Boot (not upsteamed yet). You can find corresponding patches for U-Boot here: http://u-boot.10912.n7.nabble.com/PATCH-0-3-PSCI-support-for-r8a7790-SoC-Lager-Stout-boards-td357352.html Have a plan to update Xen Wiki regarding this board. -- Please note, that first two patches already have Julien's A-b, and the following patch "[PATCH V4 1/5] xen/arm: Clarify usage of earlyprintk for Lager board" was removed from this series (as handled separately). [1] https://elinux.org/R-Car/Boards/Stout Oleksandr Tyshchenko (4): xen/arm: drivers: scif: Extend driver to handle other interfaces xen/arm: drivers: scif: Add support for SCIFA compatible UARTs xen/arm: Extend SCIF early prink code to handle other interfaces xen/arm: Add early printk support for SCIFA compatible UARTs docs/misc/arm/early-printk.txt| 5 ++ xen/arch/arm/Rules.mk | 7 ++ xen/arch/arm/arm32/debug-scif.inc | 22 +-- xen/drivers/char/scif-uart.c | 131 -- xen/include/asm-arm/scif-uart.h | 44 +++-- 5 files changed, 161 insertions(+), 48 deletions(-) -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel