Re: [Xen-devel] [PATCH v2 2/2] ns16550: enable use of PCI MSI

2018-11-30 Thread Jan Beulich
>>> On 29.11.18 at 18:33,  wrote:
> On Mon, Oct 01, 2018 at 10:26:05AM -0600, Jan Beulich wrote:
>> --- a/xen/arch/x86/msi.c
>> +++ b/xen/arch/x86/msi.c
>> @@ -742,6 +742,16 @@ static int msi_capability_init(struct pc
>>  
>>  *desc = entry;
>>  /* Restore the original MSI enabled bits  */
>> +if ( !hardware_domain )
> 
> Wouldn't it be better to assign the device to dom_xen (pdev->domain =
> dom_xen), and then check if the owner is dom_xen here?

I'm not sure this couldn't be wrong in the general case (and we
sit on a generic code path here): It depends on whether Dom0
can modify the device's config space, and I wouldn't want to
(here) introduce a connection between dom_xen ownership and
whether Dom0 can control INTX. The comment below here is
specifically worded to the effect of why I use hardware_domain
here.

If we ever get into the situation of wanting to enable MSI on an
internally used device _after_ Dom0 has started, this would need
careful re-considering.

> Or at the point where this is called from the serial console driver is
> too early for dom_xen to exist?

ns16550_init_postirq() is where both MSI setup and hiding of the
device happen, so in principle this would seem to be possible for
the specific case of a serial card.

>> @@ -777,8 +777,65 @@ static void __init ns16550_init_postirq(
>>  uart->ps_bdf[0], uart->ps_bdf[1],
>>  uart->ps_bdf[2]);
>>  }
>> +
>> +if ( uart->msi )
>> +{
>> +struct msi_info msi = {
>> +.bus = uart->ps_bdf[0],
>> +.devfn = PCI_DEVFN(uart->ps_bdf[1], uart->ps_bdf[2]),
>> +.irq = rc = uart->irq,
>> +.entry_nr = 1
>> +};
>> +
>> +if ( rc > 0 )
>> +{
>> +struct msi_desc *msi_desc = NULL;
>> +
>> +pcidevs_lock();
>> +
>> +rc = pci_enable_msi(&msi, &msi_desc);
> 
> Before attempting to enable MSI, shouldn't you make sure memory
> decoding is enabled in case the device uses MSIX?
> 
> I think this code assumes the device will always use MSI? (in which
> case my above question is likely moot).

I've yet to see serial cards using MSI-X. If we get to the point where
this is needed, we also may need to first set up the BAR for the MSI-X
table. Furthermore pci_enable_msi() won't even try to enable MSI-X
with msi->table_base being zero.

>> --- a/xen/drivers/pci/pci.c
>> +++ b/xen/drivers/pci/pci.c
>> @@ -115,6 +115,21 @@ int pci_find_next_ext_capability(int seg
>>  return 0;
>>  }
>>  
>> +void pci_intx(const struct pci_dev *pdev, bool_t enable)
> 
> Please use bool.

See how old this patch is. V1 was posted long before bool came into
existence, and I had refrained from posting v2 until I actually had a
device where MSI would indeed function (the first two I tried this
with claimed to be MSI capable, but no interrupts ever surfaced
when MSI was enabled on them, yet I couldn't be sure the code
was doing something wrong). Obviously I then forgot to switch this,
which I've now done.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 4/6] microcode: don't call apply_microcode() in cpu_request_microcode()

2018-11-30 Thread Chao Gao
On Thu, Nov 29, 2018 at 10:46:05AM +0100, Roger Pau Monné wrote:
>On Thu, Nov 29, 2018 at 12:28:46PM +0800, Chao Gao wrote:
>> On Wed, Nov 28, 2018 at 04:02:25PM +0100, Roger Pau Monné wrote:
>> >On Wed, Nov 28, 2018 at 01:34:14PM +0800, Chao Gao wrote:
>> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> >> index 8350d22..cca7b2c 100644
>> >> --- a/xen/arch/x86/microcode.c
>> >> +++ b/xen/arch/x86/microcode.c
>> >> @@ -233,20 +233,12 @@ int microcode_resume_cpu(unsigned int cpu)
>> >>  return err;
>> >>  }
>> >>  
>> >> -static int microcode_update_cpu(const void *buf, size_t size)
>> >> +static int microcode_update_cpu(void)
>> >>  {
>> >>  int err;
>> >> -unsigned int cpu = smp_processor_id();
>> >> -struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>> >>  
>> >>  spin_lock(&microcode_mutex);
>> >> -
>> >> -err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
>> >> -if ( likely(!err) )
>> >> -err = microcode_ops->cpu_request_microcode(cpu, buf, size);
>> >> -else
>> >> -__microcode_fini_cpu(cpu);
>> >> -
>> >> +err = microcode_ops->apply_microcode(smp_processor_id());
>> >>  spin_unlock(&microcode_mutex);
>> >>  
>> >>  return err;
>> >> @@ -259,7 +251,7 @@ static long do_microcode_update(void *_info)
>> >>  
>> >>  BUG_ON(info->cpu != smp_processor_id());
>> >>  
>> >> -error = microcode_update_cpu(info->buffer, info->buffer_size);
>> >> +error = microcode_update_cpu();
>> >
>> >Why don't you just set info->error = microcode_update_cpu()?
>> >
>> >AFAICT this is done to attempt to update the remaining CPUs if one
>> >update failed?
>> 
>> Yes, but this patch doesn't change the logic here. Actually, if HT is
>> enabled, microcode is shared between the logical threads of the same
>> core, so if one thread updates the microcode successfully, its sibling
>> will always fail with the current logic. I am trying to explain why we
>> cannot abort the update even though an error is met in the current
>> logic. It definitely can be solved by tweaking the logic slightly.
>> 
>> >
>> >Is there anyway to rollback to the previous state so all CPUs have the
>> >same microcode?
>> 
>> It seems it is not allowed to load a microcode with a numerically
>> smaller revision, according to 9.11.7.2.
>> 
>> With patch 6, a panic() would be triggered if some CPUs failed to do
>> the update. I didn't try to change the logic here.
>> 
>> >I assume nothing good will come out of running a
>> >system with CPUs using different microcode versions, but maybe that's
>> >not so bad?
>> 
>> It is better that all CPUs have the same microcode revision. 
>> 
>> The Linux kernel rejects a late microcode update if it finds some CPUs
>> offlined. I may port this behavior to Xen too in a separate patch.
>
>What happens with hotplug CPUs?
>
>Even if you disable late loading when there are offlined CPUs you
>still need to handle hotplug CPUs, which IMO should share the same
>path with offlined CPUs: the microcode update should be done ASAP
>after bringing the CPU up.
>

In Linux, a CPU being offline is just a logical offline. It may still
participate in actions like MCE handling. That would lead to a situation
where some CPUs are using the old microcode while some are using the new
one, which introduces instability. CPU hotplug doesn't have such an issue.

>> >
>> >>  if ( error )
>> >>  info->error = error;
>> >>  
>> >> @@ -276,6 +268,8 @@ int 
>> >> microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long 
>> >> len)
>> >>  {
>> >>  int ret;
>> >>  struct microcode_info *info;
>> >> +unsigned int cpu = smp_processor_id();
>> >> +struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>> >>  
>> >>  if ( len != (uint32_t)len )
>> >>  return -E2BIG;
>> >> @@ -294,10 +288,6 @@ int 
>> >> microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long 
>> >> len)
>> >>  return ret;
>> >>  }
>> >>  
>> >> -info->buffer_size = len;
>> >> -info->error = 0;
>> >> -info->cpu = cpumask_first(&cpu_online_map);
>> >> -
>> >>  if ( microcode_ops->start_update )
>> >>  {
>> >>  ret = microcode_ops->start_update();
>> >> @@ -308,6 +298,26 @@ int 
>> >> microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long 
>> >> len)
>> >>  }
>> >>  }
>> >>  
>> >> +spin_lock(&microcode_mutex);
>> >> +
>> >> +ret = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
>> >> +if ( likely(!ret) )
>> >> +ret = microcode_ops->cpu_request_microcode(cpu, info->buffer, 
>> >> len);
>> >> +else
>> >> +__microcode_fini_cpu(cpu);
>> >> +
>> >> +spin_unlock(&microcode_mutex);
>> >
>> >Why do you need to hold the lock here?
>> >
>> >microcode_update is already serialized and should only be executed on
>> >a CPU at a time due to the usage of chained
>> >continue_hypercall_on_cpu.
>> 
>> microcode_resume_cpu() also uses the 'uci' and the global microcode cache.
>> This lock is to preve

Re: [Xen-devel] [PATCH v4 6/6] x86/microcode: Synchronize late microcode loading

2018-11-30 Thread Chao Gao
On Thu, Nov 29, 2018 at 10:56:53AM +0100, Roger Pau Monné wrote:
>On Thu, Nov 29, 2018 at 12:43:25PM +0800, Chao Gao wrote:
>> On Wed, Nov 28, 2018 at 04:22:09PM +0100, Roger Pau Monné wrote:
>> >On Wed, Nov 28, 2018 at 01:34:16PM +0800, Chao Gao wrote:
>> >> This patch ports microcode improvement patches from linux kernel.
>> >> 
>> >> Before you read any further: the early loading method is still the
>> >> preferred one and you should always do that. The following patch is
>> >> improving the late loading mechanism for long running jobs and cloud use
>> >> cases.
>> >> 
>> >> Gather all cores and serialize the microcode update on them by doing it
>> >> one-by-one to make the late update process as reliable as possible and
>> >> avoid potential issues caused by the microcode update.
>> >> 
>> >> Signed-off-by: Chao Gao 
>> >> Tested-by: Chao Gao 
>> >> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
>> >> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
>> >
>> >If this patch is the squash of two Linux commits, please post the
>> >ported versions of the two commits separately.
>> 
>> I don't understand this one.
>
>You reference two Linux commits above, why is this done?
>
>I assume this is because you are porting two Linux commits to Xen, in
>which case I think that should be done in two different patches, or a
>note needs to be added to why you merge two Linux commits into a
>single Xen patch.

The latter fixed a severe bug introduced by the first one. Maybe I need
to add a note to clarify this.

Thanks
Chao


Re: [Xen-devel] [PATCH v5 01/13] x86: reduce general stack alignment to 8

2018-11-30 Thread Jan Beulich
>>> On 29.11.18 at 18:44,  wrote:
> On Thu, Nov 08, 2018 at 09:05:45AM -0700, Jan Beulich wrote:
>> --- a/xen/arch/x86/efi/Makefile
>> +++ b/xen/arch/x86/efi/Makefile
>> @@ -5,7 +5,11 @@ CFLAGS += -fshort-wchar
>>  
>>  boot.init.o: buildid.o
>>  
>> +EFIOBJ := boot.init.o compat.o runtime.o
>> +
>> +$(EFIOBJ): CFLAGS-stack-boundary := -mpreferred-stack-boundary=4
> 
> From gcc's manual on -mincoming-stack-boundary:
> 
> "Thus calling a function compiled with a higher preferred stack boundary
> from a function compiled with a lower preferred stack boundary most
> likely misaligns the stack." 
> 
> I notice runtime.o now has stack alignment of 2^4 while the rest of xen
> has 2^3.
> 
> There is at least one example (efi_get_time) that could misalign the
> stack. Is that okay?

It would not be okay if the runtime call machinery wouldn't force
32-byte alignment of the stack. See the declaration of struct
efi_rs_state, an instance of which gets put on the stack of every
function making runtime calls. Also note how this is no different
from prior to this change, as explained by the comment in that
structure declaration, except that instead of always running on
a reliably mis-aligned stack we will now run on a mixture (hence
the code [and stack] size savings mentioned in the description).

Jan




[Xen-devel] [linux-4.9 test] 130840: regressions - FAIL

2018-11-30 Thread osstest service owner
flight 130840 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130840/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-xsm   7 xen-boot fail REGR. vs. 130142
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 16 guest-localmigrate/x10 fail 
REGR. vs. 130142
 test-armhf-armhf-libvirt-raw 15 guest-start/debian.repeat fail REGR. vs. 130142

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 12 guest-start  fail REGR. vs. 130142

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 130142
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 130142
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 130142
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 130142
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 130142
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux    bb2d990b6fefaf99b2832a7a588234e7986ebe15
baseline version:
 linux    5552683784c9e2641e8c74827565476a45827126

Last test of basis   130142  2018-11-15 19:02:17 Z   14 days
Failing since    130646  2018-11-21 08:42:52 Z    9 days    4 attempts
Testing same since   130840  2018-11-27 21:02:30 Z    2 days    1 attempts


People who touched revisions under test:
  "Eric W. Biederman" 
  "Yan, Zheng" 
  Aaron Brown 
  Al Viro 
  Alan Tull 
  Alexandre Belloni 
  Andrea Arcangeli 
  Andreas Gruenbacher 
  Andrew Morton 
  Andrey Ryabinin 
  Andy Shevchenko 
  Ard Biesheuvel 
  Arnd Bergmann 
  Bartlomiej Zolnierkiewicz 
  Behan Webster 
  Ben Hutchings 
  Boris Brezillon 
  Breno Leitao 
  Catalin Marinas 
  Changwei Ge 
  

[Xen-devel] [seabios test] 130842: regressions - FAIL

2018-11-30 Thread osstest service owner
flight 130842 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130842/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 
130373
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 130373
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 130373

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 130373
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 130373
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 130373
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 130373
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 seabios  42efebdf1d120554e1a30e8debf562527ec6a53d
baseline version:
 seabios  a698c8995ffb2838296ec284fe3c4ad33dfca307

Last test of basis   130373  2018-11-18 03:30:13 Z   12 days
Testing same since   130842  2018-11-28 02:10:59 Z    2 days    1 attempts


People who touched revisions under test:
  Stephen Douthit 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmfail
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-qemuu-nested-amdfail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-ws16-amd64 fail
 test-amd64-i386-xl-qemuu-ws16-amd64  fail
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrictfail
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict fail
 test-amd64-amd64-xl-qemuu-win10-i386 fail
 test-amd64-i386-xl-qemuu-win10-i386  fail
 test-amd64-amd64-qemuu-nested-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 42efebdf1d120554e1a30e8debf562527ec6a53d
Author: Stephen Douthit 
Date:   Wed Mar 7 13:17:36 2018 -0500

tpm: Check for TPM related ACPI tables before attempting hw probe

Signed-off-by: Stephen Douthit 


Re: [Xen-devel] [PATCH 1/2] tools/libxc: Fix issues with libxc and Xen having different featureset lengths

2018-11-30 Thread Jan Beulich
>>> On 29.11.18 at 20:20,  wrote:
> In almost all cases, Xen and libxc will agree on the featureset length,
> because they are built from the same source.
> 
> However, there are circumstances (e.g. security hotfixes) where the featureset
> gets longer and dom0 will, after installing updates, be running with an old
> Xen but new libxc.  Despite writing the code with this scenario in mind, there
> were some bugs.
> 
> First, xen-cpuid's get_featureset() erroneously allocates a buffer based on
> Xen's featureset length, but records libxc's length, which is longer.

"... which may be longer", seeing that nr_features gets initialized from
xc_get_cpu_featureset_size()'s return value, and its subsequent
updating (through xc_get_cpu_featureset()) is only done in certain
cases.

> The hypercall bounce buffer code reads/writes the recorded length, which is
> beyond the end of the allocated object, and a later free() encounters corrupt
> heap metadata.  Fix this by recording the same length that we allocate.
> 
> Secondly, get_cpuid_domain_info() has a related bug when the passed-in
> featureset is a different length to libxc's.
> 
> A large amount of the libxc cpuid functionality depends on info->featureset
> being as long as expected, and it is allocated appropriately.  However, in the
> case that a shorter external featureset is passed in, the logic to check for
> trailing nonzero bits may read off the end of it.  Rework the logic to use the
> correct upper bound.
> 
> In addition, leave a comment next to the fields in struct cpuid_domain_info
> explaining the relationship between the various lengths, and how to cope with
> different lengths.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 




Re: [Xen-devel] [PATCH 2/2] tools/libxc: Fix error handling in get_cpuid_domain_info()

2018-11-30 Thread Jan Beulich
>>> On 29.11.18 at 20:20,  wrote:
> get_cpuid_domain_info() has two conflicting return styles - either -error for
> local failures, or -1/errno for hypercall failures.  Switch to consistently
> use -error.
> 
> While fixing the xc_get_cpu_featureset(), take the opportunity to remove the
> redundancy and move it to be adjacent to the other featureset handling.

Having noticed the redundancy before, I was assuming this was done
to prevent altering host_nr_features early, just in case future code
additions would want to use its original value. The way it is now, the
code folding looks correct.

> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 




Re: [Xen-devel] [PATCH v4 2/6] microcode: save all microcodes which pass sanity check

2018-11-30 Thread Jan Beulich
>>> On 30.11.18 at 08:55,  wrote:
> On Thu, Nov 29, 2018 at 10:22:10AM +0100, Roger Pau Monné wrote:
>>On Thu, Nov 29, 2018 at 10:40:32AM +0800, Chao Gao wrote:
>>> On Wed, Nov 28, 2018 at 01:00:14PM +0100, Roger Pau Monné wrote:
>>> >On Wed, Nov 28, 2018 at 01:34:12PM +0800, Chao Gao wrote:
>>> >> ... and search caches to find a suitable one when loading.
>>> >
>>> >Why do you need to save all of them? You are only going to load a
>>> >single microcode, so I don't understand the need to cache them all.
>>
>>I think the above question needs an answer.
> 
> Out of consideration for a mixed-family system. Anyway, since Jan
> commented that we gave up support of mixed-family systems, we only need
> to save a single microcode for offlined or hot-plugged CPUs.

Well, there might be slightly more needed than just a single instance.
This depends on whether same family/model CPUs with different
stepping and/or different "pf" value can be mixed (and would have
identical feature flags in their CPUID output).

I may have oversimplified the current state of things: Hotplugging
a CPU with more capabilities than the boot CPU is generally fine
afaict. Both you (Intel) and AMD place restrictions on permitted
mixes iirc, so I don't think we need to support anything beyond
such restrictions. Being able to run on permitted mixes which are
not in conflict with our general assumption of there not being any
CPU in the system with less capabilities than the boot CPU would
seem desirable.

Andrew, do you have any view or opinion here?

Jan



Re: [Xen-devel] [PATCH v4 4/6] microcode: don't call apply_microcode() in cpu_request_microcode()

2018-11-30 Thread Jan Beulich
>>> On 30.11.18 at 09:57,  wrote:
> On Thu, Nov 29, 2018 at 10:46:05AM +0100, Roger Pau Monné wrote:
>>On Thu, Nov 29, 2018 at 12:28:46PM +0800, Chao Gao wrote:
>>> It is better that all CPUs have the same microcode revision. 
>>> 
>>> Linux kernel rejects late microcode update if finding some CPUs
>>> offlined. I may port this patch to Xen too in a separate patch.
>>
>>What happens with hotplug CPUs?
>>
>>Even if you disable late loading when there are offlined CPUs you
>>still need to handle hotplug CPUs, which IMO should share the same
>>path with offlined CPUs: the microcode update should be done ASAP
>>after bringing the CPU up.
> 
> In Linux, a CPU being offline is just a logical offline. It may still
> participate in actions like MCE handling. That would lead to a situation
> where some CPUs are using the old microcode while some are using the new
> one, which introduces instability. CPU hotplug doesn't have such an issue.

But please be careful not to disallow ucode loading in cases where
doing so is fine. I'm in particular thinking about "smt=no", where
secondary hyperthreads get parked. Since ucode updates on one
hyperthread of a core affect all hyperthreads, updating is fine in
this particular case.

Furthermore, if you look at my patch series "x86: more power-
efficient CPU parking", it is also within realm of possibility to
briefly wake up parked CPUs, just to get their ucode updated.

Jan



[Xen-devel] [PATCH v3 20/21] xen_backend: remove xen_sysdev_init() function

2018-11-30 Thread Mao Zhongyi
The init function doesn't do anything at all, so we
just omit it.

Cc: sstabell...@kernel.org
Cc: anthony.per...@citrix.com
Cc: xen-devel@lists.xenproject.org
Cc: peter.mayd...@linaro.org

Signed-off-by: Mao Zhongyi 
Signed-off-by: Zhang Shengju 
Acked-by: Anthony PERARD 
---
 hw/xen/xen_backend.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 9a8e8771ec..0bc6b1de60 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -809,11 +809,6 @@ static const TypeInfo xensysbus_info = {
 }
 };
 
-static int xen_sysdev_init(SysBusDevice *dev)
-{
-return 0;
-}
-
 static Property xen_sysdev_properties[] = {
 {/* end of property list */},
 };
@@ -821,9 +816,7 @@ static Property xen_sysdev_properties[] = {
 static void xen_sysdev_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
-SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
 
-k->init = xen_sysdev_init;
 dc->props = xen_sysdev_properties;
 dc->bus_type = TYPE_XENSYSBUS;
 }
-- 
2.17.1





[Xen-devel] [PATCH v3 00/21] QOM'ify SysBusDeviceClass->init

2018-11-30 Thread Mao Zhongyi
v3 -> v2:

- rebase to the HEAD
- use SysBusDevice *sbd variable in patch15

v2 -> v1:

- SYS_BUS_DEVICE(dev) was used in a function several
  times, so use a variable 'sbd' to replace it, like:
  SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
- remove the xen_sysdev_init() function
- drop the patch21 in v1
- fix the broken in sysbus_realize of patch22

Cc: alistair.fran...@wdc.com
Cc: anthony.per...@citrix.com
Cc: arm...@redhat.com
Cc: borntrae...@de.ibm.com
Cc: chout...@adacore.com
Cc: coh...@redhat.com
Cc: da...@gibson.dropbear.id.au
Cc: da...@redhat.com
Cc: edgar.igles...@gmail.com
Cc: ehabk...@redhat.com
Cc: f4...@amsat.org
Cc: g...@mprc.pku.edu.cn
Cc: jan.kis...@web.de
Cc: kra...@redhat.com
Cc: kw...@redhat.com
Cc: marcandre.lur...@redhat.com
Cc: marcel.apfelb...@gmail.com
Cc: mich...@walle.cc
Cc: mre...@redhat.com
Cc: m...@redhat.com
Cc: pbonz...@redhat.com
Cc: peter.mayd...@linaro.org

  
Cc: qemu-...@nongnu.org
Cc: qemu-bl...@nongnu.org
Cc: qemu-...@nongnu.org
Cc: qemu-s3...@nongnu.org
Cc: richard.hender...@linaro.org
Cc: r...@twiddle.net
Cc: sstabell...@kernel.org
Cc: th...@redhat.com
Cc: xen-devel@lists.xenproject.org

Mao Zhongyi (21):
  musicpal: Convert sysbus init function to realize function
  block/noenand: Convert sysbus init function to realize function
  char/grlib_apbuart: Convert sysbus init function to realize function
  core/empty_slot: Convert sysbus init function to realize function
  display/g364fb: Convert sysbus init function to realize function
  dma/puv3_dma: Convert sysbus init function to realize function
  gpio/puv3_gpio: Convert sysbus init function to realize function
  milkymist-softusb: Convert sysbus init function to realize function
  input/pl050: Convert sysbus init function to realize function
  intc/puv3_intc: Convert sysbus init function to realize function
  milkymist-hpdmc: Convert sysbus init function to realize function
  milkymist-pfpu: Convert sysbus init function to realize function
  puv3_pm.c: Convert sysbus init function to realize function
  nvram/ds1225y: Convert sysbus init function to realize function
  pci-bridge/dec: Convert sysbus init function to realize function
  timer/etraxfs_timer: Convert sysbus init function to realize function
  timer/grlib_gptimer: Convert sysbus init function to realize function
  timer/puv3_ost: Convert sysbus init function to realize function
  usb/tusb6010: Convert sysbus init function to realize function
  xen_backend: remove xen_sysdev_init() function
  core/sysbus: remove the SysBusDeviceClass::init path

 hw/arm/musicpal.c|  9 -
 hw/block/onenand.c   | 16 +++-
 hw/char/grlib_apbuart.c  | 12 +---
 hw/core/empty_slot.c |  9 -
 hw/core/sysbus.c | 15 +--
 hw/display/g364fb.c  |  9 +++--
 hw/dma/puv3_dma.c| 10 --
 hw/gpio/puv3_gpio.c  | 29 ++---
 hw/input/milkymist-softusb.c | 16 +++-
 hw/input/pl050.c | 11 +--
 hw/intc/puv3_intc.c  | 11 ---
 hw/misc/milkymist-hpdmc.c|  9 +++--
 hw/misc/milkymist-pfpu.c | 12 +---
 hw/misc/puv3_pm.c| 10 --
 hw/nvram/ds1225y.c   | 12 +---
 hw/pci-bridge/dec.c  | 12 ++--
 hw/timer/etraxfs_timer.c | 14 +++---
 hw/timer/grlib_gptimer.c | 11 +--
 hw/timer/puv3_ost.c  | 13 ++---
 hw/usb/tusb6010.c|  8 +++-
 hw/xen/xen_backend.c |  7 ---
 include/hw/sysbus.h  |  3 ---
 22 files changed, 106 insertions(+), 152 deletions(-)

-- 
2.17.1





Re: [Xen-devel] [PATCH v4 6/6] x86/microcode: Synchronize late microcode loading

2018-11-30 Thread Jan Beulich
>>> On 29.11.18 at 23:43,  wrote:
> One other comment about this patch (which IIRC was raised by Andrew on
> an earlier version) is that it may be worth to stop timer calibration. I
> am pretty sure we've seen deadlocks, which is why we ended up disabling
> it during microcode updates.

I recall the claim, but I don't think I've ever seen proof. My point was
and still is that if there's a problem with ucode loading using the
stop-machine logic here, then there's a problem with the stop-machine
logic in general, which would make other uses, perhaps most notably
rcu_barrier(), unsafe too. Otoh from your reply it's not clear whether
the observed issue wasn't with our present way of loading ucode,
but then it would put under question the general correctness of
continue_hypercall_on_cpu(), which again we use for more than just
ucode loading.

Jan




Re: [Xen-devel] [PATCH] x86/shadow: don't enable shadow mode with too small a shadow allocation

2018-11-30 Thread Tim Deegan
At 07:53 -0700 on 29 Nov (1543477992), Jan Beulich wrote:
> We've had more than one report of host crashes after failed migration,
> and in at least one case we've had a hint towards a too far shrunk
> shadow allocation pool. Instead of just checking the pool for being
> empty, check whether the pool is smaller than what
> shadow_set_allocation() would minimally bump it to if it was invoked in
> the first place.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

Thanks,

Tim.


[Xen-devel] [PATCH 2/2] iommu: elide flushing for higher order map/unmap operations

2018-11-30 Thread Paul Durrant
This patch removes any implicit flushing that occurs in the implementation
of map and unmap operations and, instead, adds explicit flushing at the
end of the loops in the iommu_map/unmap() wrapper functions.

Because VT-d currently performs two different types of flush dependent upon
whether a PTE is being modified versus merely added (i.e. replacing a non-
present PTE), an 'iommu_flush_type' enumeration is defined by this patch and
the iommu_ops map method is modified to pass back the type of flush
necessary for the PTE that has been populated. When a higher order mapping
operation is done, the wrapper code performs the 'highest' level of flush
required by the individual iommu_ops method calls, where a 'modified PTE'
flush is deemed to be higher than an 'added PTE' flush. The ARM SMMU
implementation currently performs no implicit flushing and therefore
the modified map method always passes back a flush type of 'none'.

NOTE: The per-cpu 'iommu_dont_flush_iotlb' is respected by the iommu_map()
  wrapper function and therefore this now applies to all IOMMU
  implementations rather than just VT-d. Use of the flag has been added
  to arch_iommu_hwdom_init() so that flushing is now fully elided for
  the hardware domain's initial table population.
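The flush-type accumulation described above can be modelled standalone. This is only an illustrative sketch of the idea, not the patch's actual code; all names below are hypothetical:

```c
#include <assert.h>

/* Toy model: per-page map reports what kind of flush it needs, and the
 * wrapper loop keeps the "highest" requirement so that a single flush
 * at the end of the loop suffices. */
enum iommu_flush_type {
    IOMMU_FLUSH_none,     /* no flush required */
    IOMMU_FLUSH_added,    /* only previously non-present PTEs written */
    IOMMU_FLUSH_modified, /* at least one present PTE was changed */
};

/* Per-page map: reports whether it replaced a present PTE. */
static enum iommu_flush_type map_one(int pte_was_present)
{
    return pte_was_present ? IOMMU_FLUSH_modified : IOMMU_FLUSH_added;
}

/* Wrapper loop over a higher-order range of 4k pages. */
static enum iommu_flush_type map_range(const int *present, unsigned int n)
{
    enum iommu_flush_type flush = IOMMU_FLUSH_none;
    unsigned int i;

    for ( i = 0; i < n; i++ )
    {
        enum iommu_flush_type t = map_one(present[i]);

        if ( t > flush )
            flush = t; /* 'modified' dominates 'added' dominates 'none' */
    }

    return flush; /* caller issues one flush of this type */
}
```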

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Suravee Suthikulpanit 
Cc: Brian Woods 
Cc: Jan Beulich 
Cc: Kevin Tian 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 

This code has only been compile tested for ARM.
---
 xen/drivers/passthrough/amd/iommu_map.c   | 70 ---
 xen/drivers/passthrough/arm/smmu.c| 13 +++--
 xen/drivers/passthrough/iommu.c   | 31 ++--
 xen/drivers/passthrough/vtd/iommu.c   | 26 +-
 xen/drivers/passthrough/x86/iommu.c   | 15 --
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  8 +--
 xen/include/xen/iommu.h   | 13 -
 7 files changed, 121 insertions(+), 55 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c 
b/xen/drivers/passthrough/amd/iommu_map.c
index c05b042821..6afc412d3f 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -45,13 +45,14 @@ static void clear_iommu_pte_present(unsigned long l1_mfn, 
unsigned long dfn)
 unmap_domain_page(table);
 }
 
-static bool set_iommu_pde_present(uint32_t *pde, unsigned long next_mfn,
-  unsigned int next_level,
-  bool iw, bool ir)
+static enum iommu_flush_type set_iommu_pde_present(
+uint32_t *pde, unsigned long next_mfn, unsigned int next_level, bool iw,
+bool ir)
 {
 uint64_t maddr_next;
 uint32_t addr_lo, addr_hi, entry;
-bool need_flush = false, old_present;
+bool old_present;
+enum iommu_flush_type flush_type = IOMMU_FLUSH_none;
 
 maddr_next = __pfn_to_paddr(next_mfn);
 
@@ -84,7 +85,7 @@ static bool set_iommu_pde_present(uint32_t *pde, unsigned 
long next_mfn,
 
 if ( maddr_old != maddr_next || iw != old_w || ir != old_r ||
  old_level != next_level )
-need_flush = true;
+flush_type = IOMMU_FLUSH_modified;
 }
 
 addr_lo = maddr_next & DMA_32BIT_MASK;
@@ -121,24 +122,24 @@ static bool set_iommu_pde_present(uint32_t *pde, unsigned 
long next_mfn,
  IOMMU_PDE_PRESENT_SHIFT, &entry);
 pde[0] = entry;
 
-return need_flush;
+return flush_type;
 }
 
-static bool set_iommu_pte_present(unsigned long pt_mfn, unsigned long dfn,
-  unsigned long next_mfn, int pde_level,
-  bool iw, bool ir)
+static enum iommu_flush_type set_iommu_pte_present(
+unsigned long pt_mfn, unsigned long dfn, unsigned long next_mfn,
+int pde_level, bool iw, bool ir)
 {
 uint64_t *table;
 uint32_t *pde;
-bool need_flush;
+enum iommu_flush_type flush_type;
 
 table = map_domain_page(_mfn(pt_mfn));
 
 pde = (uint32_t *)(table + pfn_to_pde_idx(dfn, pde_level));
 
-need_flush = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
+flush_type = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
 unmap_domain_page(table);
-return need_flush;
+return flush_type;
 }
 
 void amd_iommu_set_root_page_table(uint32_t *dte, uint64_t root_ptr,
@@ -522,13 +523,15 @@ static int update_paging_mode(struct domain *d, unsigned 
long dfn)
 }
 
 int amd_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
-   unsigned int flags)
+   unsigned int flags,
+   enum iommu_flush_type *flush_type)
 {
-bool need_flush;
 struct domain_iommu *hd = dom_iommu(d);
 int rc;
 unsigned long pt_mfn[7];
 
+*flush_type = IOMMU_FLUSH_none;
+
 if ( iommu_use_hap_pt(d) )
 return 0;
 
@@ -570,12 +573,9 @@ int amd_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t 
mf

[Xen-devel] [PATCH 0/2] iommu improvements

2018-11-30 Thread Paul Durrant
Paul Durrant (2):
  amd-iommu: add flush iommu_ops
  iommu: elide flushing for higher order map/unmap operations

 xen/drivers/passthrough/amd/iommu_map.c   | 114 --
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +
 xen/drivers/passthrough/arm/smmu.c|  13 ++-
 xen/drivers/passthrough/iommu.c   |  31 ++-
 xen/drivers/passthrough/vtd/iommu.c   |  26 +++---
 xen/drivers/passthrough/x86/iommu.c   |  15 +++-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   7 +-
 xen/include/xen/iommu.h   |  18 +++-
 8 files changed, 175 insertions(+), 51 deletions(-)

-- 
2.11.0



[Xen-devel] [PATCH 1/2] amd-iommu: add flush iommu_ops

2018-11-30 Thread Paul Durrant
The iommu_ops structure contains two methods for flushing: 'iotlb_flush' and
'iotlb_flush_all'. This patch adds implementations of these for AMD IOMMUs.

The iotlb_flush method takes a base DFN and a (4k) page count, but the
flush needs to be done by page order (i.e. 0, 9 or 18). Because a flush
operation is fairly expensive to perform, the code calculates the minimum
order single flush that will cover the specified page range rather than
performing multiple flushes.

Signed-off-by: Paul Durrant 
---
Cc: Suravee Suthikulpanit 
Cc: Brian Woods 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
---
 xen/drivers/passthrough/amd/iommu_map.c   | 48 +++
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 ++
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  3 ++
 xen/include/xen/iommu.h   |  5 +++
 4 files changed, 58 insertions(+)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c 
b/xen/drivers/passthrough/amd/iommu_map.c
index 04cb7b3182..c05b042821 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -631,6 +631,54 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
 spin_unlock(&hd->arch.mapping_lock);
 
 amd_iommu_flush_pages(d, dfn_x(dfn), 0);
+return 0;
+}
+
+static unsigned long flush_count(dfn_t dfn, unsigned int page_count,
+ unsigned int order)
+{
+unsigned long start = dfn_x(dfn) / (1u << order);
+unsigned long end = DIV_ROUND_UP(dfn_x(dfn) + page_count,
+ (1u << order));
+
+ASSERT(end > start);
+return end - start;
+}
+
+int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
+unsigned int page_count)
+{
+/* Match VT-d semantics */
+if ( !page_count || dfn_eq(dfn, INVALID_DFN) ||
+ dfn_lt(dfn_add(dfn, page_count), dfn) /* overflow */ )
+{
+amd_iommu_flush_all_pages(d);
+return 0;
+}
+
+/*
+ * Flushes are expensive so find the minimal single flush that will
+ * cover the page range.
+ *
+ * NOTE: It is unnecessary to round down the DFN value to align with
+ *   the flush order here. This is done by the internals of the
+ *   flush code.
+ */
+if ( page_count == 1 ) /* order 0 flush count */
+amd_iommu_flush_pages(d, dfn_x(dfn), 0);
+else if ( flush_count(dfn, page_count, 9) == 1 )
+amd_iommu_flush_pages(d, dfn_x(dfn), 9);
+else if ( flush_count(dfn, page_count, 18) == 1 )
+amd_iommu_flush_pages(d, dfn_x(dfn), 18);
+else
+amd_iommu_flush_all_pages(d);
+
+return 0;
+}
+
+int amd_iommu_flush_iotlb_all(struct domain *d)
+{
+amd_iommu_flush_all_pages(d);
 
 return 0;
 }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 900136390d..33a3798f36 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -579,6 +579,8 @@ static const struct iommu_ops __initconstrel amd_iommu_ops 
= {
 .teardown = amd_iommu_domain_destroy,
 .map_page = amd_iommu_map_page,
 .unmap_page = amd_iommu_unmap_page,
+.iotlb_flush = amd_iommu_flush_iotlb_pages,
+.iotlb_flush_all = amd_iommu_flush_iotlb_all,
 .free_page_table = deallocate_page_table,
 .reassign_device = reassign_device,
 .get_device_group_id = amd_iommu_group_id,
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h 
b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 718a44f956..88715329ca 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -60,6 +60,9 @@ int __must_check amd_iommu_alloc_root(struct domain_iommu 
*hd);
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
paddr_t phys_addr, unsigned long size,
int iw, int ir);
+int __must_check amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
+ unsigned int page_count);
+int __must_check amd_iommu_flush_iotlb_all(struct domain *d);
 
 /* Share p2m table with iommu */
 void amd_iommu_share_p2m(struct domain *d);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 3d78126801..da8294bac8 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -52,6 +52,11 @@ static inline bool_t dfn_eq(dfn_t x, dfn_t y)
 return dfn_x(x) == dfn_x(y);
 }
 
+static inline bool_t dfn_lt(dfn_t x, dfn_t y)
+{
+return dfn_x(x) < dfn_x(y);
+}
+
 extern bool_t iommu_enable, iommu_enabled;
 extern bool_t force_iommu, iommu_verbose;
 extern bool_t iommu_workaround_bios_bug, iommu_igfx;
-- 
2.11.0


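The minimum-order flush selection implemented by amd_iommu_flush_iotlb_pages() above can be checked with a small standalone model. The helper names here are made up for illustration; only the arithmetic mirrors the patch:

```c
#include <assert.h>

/* Number of naturally-aligned (1 << order)-page blocks needed to cover
 * the DFN range [dfn, dfn + page_count). */
static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
                                 unsigned int order)
{
    unsigned long start = dfn / (1ul << order);
    unsigned long end = (dfn + page_count + (1ul << order) - 1) /
                        (1ul << order);

    return end - start;
}

/* Pick the smallest supported order (0, 9 or 18) whose single flush
 * covers the range; -1 means "fall back to flushing everything". */
static int pick_order(unsigned long dfn, unsigned int page_count)
{
    if ( page_count == 1 )
        return 0;
    if ( flush_count(dfn, page_count, 9) == 1 )
        return 9;
    if ( flush_count(dfn, page_count, 18) == 1 )
        return 18;
    return -1;
}
```

Note that, as the patch's comment says, the DFN need not be rounded down to the flush order here: a single order-9 flush covers the whole 512-page block containing the start DFN.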

[Xen-devel] [distros-debian-jessie test] 75627: regressions - FAIL

2018-11-30 Thread Platform Team regression test user
flight 75627 distros-debian-jessie real [real]
http://osstest.xensource.com/osstest/logs/75627/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-i386-jessie-netboot-pvgrub 10 debian-di-install fail REGR. vs. 
75618

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-i386-jessie-netboot-pygrub 19 guest-start/debian.repeat fail 
like 75618
 test-armhf-armhf-armhf-jessie-netboot-pygrub 10 debian-di-install fail like 
75618

baseline version:
 flight   75618

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-jessie-netboot-pvgrub pass
 test-amd64-i386-i386-jessie-netboot-pvgrub   fail
 test-amd64-i386-amd64-jessie-netboot-pygrub  pass
 test-armhf-armhf-armhf-jessie-netboot-pygrub fail
 test-amd64-amd64-i386-jessie-netboot-pygrub  fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xensource.com/osstest/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.



[Xen-devel] [PATCH v2] pvcalls-front: Avoid __get_free_pages(GFP_KERNEL) under spinlock

2018-11-30 Thread Wen Yang
The problem is that we call this with a spin lock held.
The call tree is:
pvcalls_front_accept() holds bedata->socket_lock.
-> create_active()
-> __get_free_pages() uses GFP_KERNEL

The create_active() function is only called from pvcalls_front_accept()
with a spin_lock held. The allocation is therefore not allowed to sleep,
and GFP_KERNEL is not sufficient.

This issue was detected by using the Coccinelle software.

v2: Add a function doing the allocations which is called
outside the lock and passing the allocated data to
create_active().
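The shape of that fix (sleepable allocations hoisted out of the critical section, pre-allocated buffers handed in) can be sketched generically. The names below are illustrative only, not the pvcalls code:

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch: do the allocations that may sleep before taking the lock,
 * then pass the pre-allocated buffers into the critical section. */
struct prealloc { void *ring; void *bytes; };

static int locked;            /* stand-in for bedata->socket_lock */
static void *active_ring, *active_bytes;

static int alloc_outside_lock(struct prealloc *p)
{
    assert(!locked);          /* may sleep: must not hold the lock here */
    p->ring = calloc(1, 4096);
    p->bytes = calloc(1, 4096);
    if (!p->ring || !p->bytes) {
        free(p->ring);
        free(p->bytes);
        return -1;
    }
    return 0;
}

static void attach_under_lock(struct prealloc *p)
{
    locked = 1;
    /* Only non-sleeping work is allowed from here on. */
    active_ring = p->ring;
    active_bytes = p->bytes;
    locked = 0;
}
```

If the locked path fails before the buffers are attached (e.g. get_request() returns an error), the caller frees the pre-allocated buffers after dropping the lock, exactly as the patch does with free_active_ring().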

Suggested-by: Juergen Gross 
Signed-off-by: Wen Yang 
CC: Julia Lawall 
CC: Boris Ostrovsky 
CC: Juergen Gross 
CC: Stefano Stabellini 
CC: xen-devel@lists.xenproject.org
CC: linux-ker...@vger.kernel.org
---
 drivers/xen/pvcalls-front.c | 71 ++---
 1 file changed, 59 insertions(+), 12 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 2f11ca72a281..cc92af31807f 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -335,7 +335,43 @@ int pvcalls_front_socket(struct socket *sock)
return ret;
 }
 
-static int create_active(struct sock_mapping *map, int *evtchn)
+struct sock_mapping_active_ring {
+   struct pvcalls_data_intf *ring;
+   RING_IDX ring_order;
+   void *bytes;
+};
+
+static int alloc_active_ring(struct sock_mapping_active_ring *active_ring)
+{
+   active_ring->ring = NULL;
+   active_ring->bytes = NULL;
+
+   active_ring->ring = (struct pvcalls_data_intf *)
+   __get_free_page(GFP_KERNEL | __GFP_ZERO);
+   if (active_ring->ring == NULL)
+   goto out_error;
+   active_ring->ring_order = PVCALLS_RING_ORDER;
+   active_ring->bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+   PVCALLS_RING_ORDER);
+   if (active_ring->bytes == NULL)
+   goto out_error;
+
+   return 0;
+
+out_error:
+   kfree(active_ring->bytes);
+   kfree(active_ring->ring);
+   return -ENOMEM;
+}
+
+static void free_active_ring(struct sock_mapping_active_ring *active_ring)
+{
+   free_pages((unsigned long)active_ring->bytes, active_ring->ring_order);
+   free_page((unsigned long)active_ring->ring);
+}
+
+static int create_active(struct sock_mapping *map, int *evtchn,
+   struct sock_mapping_active_ring *active_ring)
 {
void *bytes;
int ret = -ENOMEM, irq = -1, i;
@@ -343,15 +379,9 @@ static int create_active(struct sock_mapping *map, int 
*evtchn)
*evtchn = -1;
init_waitqueue_head(&map->active.inflight_conn_req);
 
-   map->active.ring = (struct pvcalls_data_intf *)
-   __get_free_page(GFP_KERNEL | __GFP_ZERO);
-   if (map->active.ring == NULL)
-   goto out_error;
-   map->active.ring->ring_order = PVCALLS_RING_ORDER;
-   bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-   PVCALLS_RING_ORDER);
-   if (bytes == NULL)
-   goto out_error;
+   map->active.ring = active_ring->ring;
+   map->active.ring->ring_order = active_ring->ring_order;
+   bytes = active_ring->bytes;
for (i = 0; i < (1 << PVCALLS_RING_ORDER); i++)
map->active.ring->ref[i] = gnttab_grant_foreign_access(
pvcalls_front_dev->otherend_id,
@@ -397,6 +427,7 @@ int pvcalls_front_connect(struct socket *sock, struct 
sockaddr *addr,
struct sock_mapping *map = NULL;
struct xen_pvcalls_request *req;
int notify, req_id, ret, evtchn;
+   struct sock_mapping_active_ring active_ring;
 
if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM)
return -EOPNOTSUPP;
@@ -406,15 +437,21 @@ int pvcalls_front_connect(struct socket *sock, struct 
sockaddr *addr,
return PTR_ERR(map);
 
bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+   ret = alloc_active_ring(&active_ring);
+   if (ret < 0) {
+   pvcalls_exit_sock(sock);
+   return ret;
+   }
 
spin_lock(&bedata->socket_lock);
ret = get_request(bedata, &req_id);
if (ret < 0) {
spin_unlock(&bedata->socket_lock);
+   free_active_ring(&active_ring);
pvcalls_exit_sock(sock);
return ret;
}
-   ret = create_active(map, &evtchn);
+   ret = create_active(map, &evtchn, &active_ring);
if (ret < 0) {
spin_unlock(&bedata->socket_lock);
pvcalls_exit_sock(sock);
@@ -744,6 +781,7 @@ int pvcalls_front_accept(struct socket *sock, struct socket 
*newsock, int flags)
struct sock_mapping *map2 = NULL;
struct xen_pvcalls_request *req;
int notify, req_id, ret, evtchn, nonblock;
+   struct sock_mapping_active_ring active_ring;
 
map = pvcalls_enter_sock(sock);
if (IS_ERR(map))
@@ -780,12 +818

Re: [Xen-devel] [PATCH] pci: apply workaround for Intel errata HSE43 and BDF2

2018-11-30 Thread Jan Beulich
>>> On 29.11.18 at 18:11,  wrote:
> These errata affect the values read from the BAR registers, and could
> render vPCI (and by extension PVH Dom0) unusable.
> 
> HSE43 is a Haswell erratum where a non-BAR register is implemented at
> the position where the first BAR of the device should be found in the

s/in the/in a/ or something like this, because out of the several
Power Control Unit devices only one kind is really affected.

> Power Control Unit device. Note that there are no BARs on this device,
> apart from the bogus CSR register positioned on top of the first BAR.
> 
> BDF2 is a Broadwell erratum where BARs in the Home Agent device will
> return bogus non-zero values.

I'm afraid there's quite a bit of confusion in Intel's docs here: The vol 2
datasheet link for this CPU from
https://www.intel.com/content/www/us/en/processors/xeon/xeon-technical-resources.html
looks to be dead, and the local copy I have of this lists PCI IDs identical
to E5v3. The E7v4 link still works, and vol 2 has the same issue. (I really
just wanted to cross check that we fully cover the issue with the three
PCI IDs used.)

In any event in the code I'd like to see BDX2 mentioned as well (the
same erratum on E7v4). However, given the situation with the
datasheets I can't see a way to associate the device IDs used with
the individual errata (I would generally suspect there being a 3rd
erratum for the 3rd device ID). Several years ago spec updates also
used to have PCI device ID tables, but this doesn't appear to be the
case anymore.

> @@ -298,6 +299,34 @@ static void check_pdev(const struct pci_dev *pdev)
>  #undef PCI_STATUS_CHECK
>  }
>  
> +static void apply_quirks(struct pci_dev *pdev)
> +{
> +uint16_t vendor = pci_conf_read16(pdev->seg, pdev->bus,
> +  PCI_SLOT(pdev->devfn),
> +  PCI_FUNC(pdev->devfn), PCI_VENDOR_ID);
> +uint16_t device = pci_conf_read16(pdev->seg, pdev->bus,
> +  PCI_SLOT(pdev->devfn),
> +  PCI_FUNC(pdev->devfn), PCI_DEVICE_ID);
> +
> +if ( vendor == PCI_VENDOR_ID_INTEL && (device == 0x2fc0 ||
> + device == 0x6f60 || device == 0x6fa0 || device == 0x6fc0) )

Instead of such an ever growing if(), could we make this table based?

> @@ -397,6 +426,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, 
> u8 bus, u8 devfn)
>  }
>  
>  check_pdev(pdev);
> +apply_quirks(pdev);

At which point putting the small loop into check_pdev() might be as
good as adding a new function.

> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -480,6 +480,9 @@ static int init_bars(struct pci_dev *pdev)
>  return -EOPNOTSUPP;
>  }
>  
> +if ( pdev->ignore_bars )
> +num_bars = 0;

While benign for the errata currently dealt with I wonder whether the
ROM BAR wouldn't better be left alone as well with this flag set. Since
additionally enabling memory decoding on a device without BARs is
a questionable operation I wonder whether you couldn't better move
this a few lines down immediately ahead of the loop over the BARs,
and make it return instead of zeroing num_bars.

> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -115,6 +115,9 @@ struct pci_dev {
>  
>  /* Data for vPCI. */
>  struct vpci *vpci;
> +
> +/* Device with errata, ignore the BARs. */
> +bool ignore_bars;

Please can you put this into (one of?) the existing hole(s), instead
of at the end?

Jan



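Jan's table-based suggestion might look roughly like the sketch below. The structure and names are assumed, not actual Xen code; the device IDs are the ones quoted from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_VENDOR_ID_INTEL 0x8086

/* Quirk table: devices whose BARs must be ignored due to errata. */
static const struct {
    uint16_t vendor, device;
} bar_quirks[] = {
    { PCI_VENDOR_ID_INTEL, 0x2fc0 },
    { PCI_VENDOR_ID_INTEL, 0x6f60 },
    { PCI_VENDOR_ID_INTEL, 0x6fa0 },
    { PCI_VENDOR_ID_INTEL, 0x6fc0 },
};

static bool device_has_bar_quirk(uint16_t vendor, uint16_t device)
{
    unsigned int i;

    for ( i = 0; i < sizeof(bar_quirks) / sizeof(bar_quirks[0]); i++ )
        if ( bar_quirks[i].vendor == vendor &&
             bar_quirks[i].device == device )
            return true;

    return false;
}
```

A new erratum then becomes a one-line table entry (ideally with a comment naming the erratum) rather than another clause in a growing conditional.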

[Xen-devel] [PATCH v3] pvcalls-front: Avoid __get_free_pages(GFP_KERNEL) under spinlock

2018-11-30 Thread Wen Yang
The problem is that we call this with a spin lock held.
The call tree is:
pvcalls_front_accept() holds bedata->socket_lock.
-> create_active()
-> __get_free_pages() uses GFP_KERNEL

The create_active() function is only called from pvcalls_front_accept()
with a spin_lock held. The allocation is therefore not allowed to sleep,
and GFP_KERNEL is not sufficient.

This issue was detected by using the Coccinelle software.

v2: Add a function doing the allocations which is called
outside the lock and passing the allocated data to
create_active().
v3: Use the matching deallocators i.e., free_page()
and free_pages(), respectively.

Suggested-by: Juergen Gross 
Signed-off-by: Wen Yang 
CC: Julia Lawall 
CC: Boris Ostrovsky 
CC: Juergen Gross 
CC: Stefano Stabellini 
CC: xen-devel@lists.xenproject.org
CC: linux-ker...@vger.kernel.org
---
 drivers/xen/pvcalls-front.c | 71 ++---
 1 file changed, 59 insertions(+), 12 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 2f11ca72a281..a26f416daf46 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -335,7 +335,43 @@ int pvcalls_front_socket(struct socket *sock)
return ret;
 }
 
-static int create_active(struct sock_mapping *map, int *evtchn)
+struct sock_mapping_active_ring {
+   struct pvcalls_data_intf *ring;
+   RING_IDX ring_order;
+   void *bytes;
+};
+
+static int alloc_active_ring(struct sock_mapping_active_ring *active_ring)
+{
+   active_ring->ring = NULL;
+   active_ring->bytes = NULL;
+
+   active_ring->ring = (struct pvcalls_data_intf *)
+   __get_free_page(GFP_KERNEL | __GFP_ZERO);
+   if (active_ring->ring == NULL)
+   goto out_error;
+   active_ring->ring_order = PVCALLS_RING_ORDER;
+   active_ring->bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+   PVCALLS_RING_ORDER);
+   if (active_ring->bytes == NULL)
+   goto out_error;
+
+   return 0;
+
+out_error:
+   free_pages((unsigned long)active_ring->bytes, active_ring->ring_order);
+   free_page((unsigned long)active_ring->ring);
+   return -ENOMEM;
+}
+
+static void free_active_ring(struct sock_mapping_active_ring *active_ring)
+{
+   free_pages((unsigned long)active_ring->bytes, active_ring->ring_order);
+   free_page((unsigned long)active_ring->ring);
+}
+
+static int create_active(struct sock_mapping *map, int *evtchn,
+   struct sock_mapping_active_ring *active_ring)
 {
void *bytes;
int ret = -ENOMEM, irq = -1, i;
@@ -343,15 +379,9 @@ static int create_active(struct sock_mapping *map, int 
*evtchn)
*evtchn = -1;
init_waitqueue_head(&map->active.inflight_conn_req);
 
-   map->active.ring = (struct pvcalls_data_intf *)
-   __get_free_page(GFP_KERNEL | __GFP_ZERO);
-   if (map->active.ring == NULL)
-   goto out_error;
-   map->active.ring->ring_order = PVCALLS_RING_ORDER;
-   bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-   PVCALLS_RING_ORDER);
-   if (bytes == NULL)
-   goto out_error;
+   map->active.ring = active_ring->ring;
+   map->active.ring->ring_order = active_ring->ring_order;
+   bytes = active_ring->bytes;
for (i = 0; i < (1 << PVCALLS_RING_ORDER); i++)
map->active.ring->ref[i] = gnttab_grant_foreign_access(
pvcalls_front_dev->otherend_id,
@@ -397,6 +427,7 @@ int pvcalls_front_connect(struct socket *sock, struct 
sockaddr *addr,
struct sock_mapping *map = NULL;
struct xen_pvcalls_request *req;
int notify, req_id, ret, evtchn;
+   struct sock_mapping_active_ring active_ring;
 
if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM)
return -EOPNOTSUPP;
@@ -406,15 +437,21 @@ int pvcalls_front_connect(struct socket *sock, struct 
sockaddr *addr,
return PTR_ERR(map);
 
bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+   ret = alloc_active_ring(&active_ring);
+   if (ret < 0) {
+   pvcalls_exit_sock(sock);
+   return ret;
+   }
 
spin_lock(&bedata->socket_lock);
ret = get_request(bedata, &req_id);
if (ret < 0) {
spin_unlock(&bedata->socket_lock);
+   free_active_ring(&active_ring);
pvcalls_exit_sock(sock);
return ret;
}
-   ret = create_active(map, &evtchn);
+   ret = create_active(map, &evtchn, &active_ring);
if (ret < 0) {
spin_unlock(&bedata->socket_lock);
pvcalls_exit_sock(sock);
@@ -744,6 +781,7 @@ int pvcalls_front_accept(struct socket *sock, struct socket 
*newsock, int flags)
struct sock_mapping *map2 = NULL;
struct xen_pvcalls_request *req;
int notify, req_id, ret, e

Re: [Xen-devel] [PATCH v2 2/2] ns16550: enable use of PCI MSI

2018-11-30 Thread Roger Pau Monné
On Fri, Nov 30, 2018 at 01:52:39AM -0700, Jan Beulich wrote:
> >>> On 29.11.18 at 18:33,  wrote:
> > On Mon, Oct 01, 2018 at 10:26:05AM -0600, Jan Beulich wrote:
> >> --- a/xen/arch/x86/msi.c
> >> +++ b/xen/arch/x86/msi.c
> >> @@ -742,6 +742,16 @@ static int msi_capability_init(struct pc
> >>  
> >>  *desc = entry;
> >>  /* Restore the original MSI enabled bits  */
> >> +if ( !hardware_domain )
> > 
> > Wouldn't it be better to assign the device to dom_xen (pdev->domain =
> > dom_xen), and then check if the owner is dom_xen here?
> 
> I'm not sure this couldn't be wrong in the general case (and we
> sit on a generic code path here): It depends on whether Dom0
> can modify the device's config space, and I wouldn't want to
> (here) introduce a connection between dom_xen ownership and
> whether Dom0 can control INTX. The comment below here is
> specifically worded to the effect of why I use hardware_domain
> here.

Well, I think Dom0 shouldn't be allowed to interact with devices owned
by dom_xen. That being said, at least the current vPCI code will allow
PVH Dom0 to do so by passing through any accesses to registers not
explicitly handled by vPCI.

> If we ever get into the situation of wanting to enable MSI on an
> internally used device _after_ Dom0 has started, this would need
> careful re-considering.

OK, I'm fine with this. Maybe using system_state would be clearer to
note that this code path is only to be used during early boot?

> > Or at the point where this is called from the serial console driver is
> > too early for dom_xen to exist?
> 
> ns16550_init_postirq() is where both MSI setup and hiding of the
> device happen, so in principle this would seem to be possible for
> the specific case of a serial card.

IMO it's clearer from a conceptual PoV to check against the ownership of
the device rather than the system state at the point of the function
call. Devices assigned to Dom0 use the split model, devices assigned
to Xen don't.

> >> @@ -777,8 +777,65 @@ static void __init ns16550_init_postirq(
> >>  uart->ps_bdf[0], uart->ps_bdf[1],
> >>  uart->ps_bdf[2]);
> >>  }
> >> +
> >> +if ( uart->msi )
> >> +{
> >> +struct msi_info msi = {
> >> +.bus = uart->ps_bdf[0],
> >> +.devfn = PCI_DEVFN(uart->ps_bdf[1], uart->ps_bdf[2]),
> >> +.irq = rc = uart->irq,
> >> +.entry_nr = 1
> >> +};
> >> +
> >> +if ( rc > 0 )
> >> +{
> >> +struct msi_desc *msi_desc = NULL;
> >> +
> >> +pcidevs_lock();
> >> +
> >> +rc = pci_enable_msi(&msi, &msi_desc);
> > 
> > Before attempting to enable MSI, shouldn't you make sure memory
> > decoding is enabled in case the device uses MSIX?
> > 
> > I think this code assumes the device will always use MSI? (in which
> > case my above question is likely moot).
> 
> I've yet to see serial cards using MSI-X. If we get to the point where
> this is needed, we also may need to first set up the BAR for the MSI-X
> table. Furthermore pci_enable_msi() won't even try to enable MSI-X
> with msi->table_base being zero.

Yes, that's how I figured out it was restricted to MSI because
table_base was always 0. Not sure whether a comment noting that this
code is not capable of handling MSI-X would be helpful.

> >> --- a/xen/drivers/pci/pci.c
> >> +++ b/xen/drivers/pci/pci.c
> >> @@ -115,6 +115,21 @@ int pci_find_next_ext_capability(int seg
> >>  return 0;
> >>  }
> >>  
> >> +void pci_intx(const struct pci_dev *pdev, bool_t enable)
> > 
> > Please use bool.
> 
> See how old this patch is. V1 was posted long before bool came into
> existence, and I had refrained from posting v2 until I actually had a
> device where MSI would indeed function (the first two I tried this
> with claimed to be MSI capable, but no interrupts ever surfaced
> when MSI was enabled on them, yet I couldn't be sure the code
> was doing something wrong). Obviously I then forgot to switch this,
> which I've now done.

Thanks.

With the bool change:

Reviewed-by: Roger Pau Monné 

I don't want to hold this any longer for the msi_capability_init
change, we can always switch to check device ownership in a later
patch.

Roger.


Re: [Xen-devel] [PATCH 2/9] libxl: Move dm user determination logic into a helper function

2018-11-30 Thread George Dunlap


> On Nov 23, 2018, at 5:14 PM, George Dunlap  wrote:
> 
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index e498435e16..a370de54ed 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -1135,6 +1135,8 @@ typedef struct {
> const char * shim_cmdline;
> const char * pv_cmdline;
> 
> +char * dm_runas;

I’ve removed the space both here, and in the above char * elements. I’ll retain 
both of your Acks unless I hear otherwise.

 -George


Re: [Xen-devel] [PATCH 2/2] iommu: elide flushing for higher order map/unmap operations

2018-11-30 Thread Andrew Cooper
On 30/11/2018 10:45, Paul Durrant wrote:
> This patch removes any implicit flushing that occurs in the implementation
> of map and unmap operations and, instead, adds explicit flushing at the
> end of the loops in the iommu_map/unmap() wrapper functions.
>
> Because VT-d currently performs two different types of flush dependent upon
> whether a PTE is being modified versus merely added (i.e. replacing a non-
> present PTE), an 'iommu_flush_type' enumeration is defined by this patch and
> the iommu_ops map method is modified to pass back the type of flush
> necessary for the PTE that has been populated. When a higher order mapping
> operation is done, the wrapper code performs the 'highest' level of flush
> required by the individual iommu_ops method calls, where a 'modified PTE'
> flush is deemed to be higher than an 'added PTE' flush. The ARM SMMU
> implementation currently performs no implicit flushing and therefore
> the modified map method always passes back a flush type of 'none'.
>
> NOTE: The per-cpu 'iommu_dont_flush_iotlb' is respected by the iommu_map()
>   wrapper function and therefore this now applies to all IOMMU
>   implementations rather than just VT-d. Use of the flag has been added
>   to arch_iommu_hwdom_init() so that flushing is now fully elided for
>   the hardware domain's initial table population.

iommu_dont_flush_iotlb is a misfeature.  While it still exists, the
flushing API is fundamentally broken.

Do you have a plan to remove it?  I ask, because the only feasible
option I see is for iommu_{map,unmap}() to pass the flush accumulation
out to the caller, and have the caller use the appropriate flush interfaces.

[Edit - lunch happened around about this point, and there was a long
discussion]

One idea would be to start with a prep patch renaming iommu_{,un}map() to
_legacy(), noting beside them that they have implicit flushing
characteristics.  Then, the nonflushing versions of iommu_{,un}map() can
be introduced, which return the accumulated flush flag, and the callers
can DTRT.

This way, we can avoid introducing a further user of
iommu_dont_flush_iotlb in arch_iommu_hwdom_init(), and clean up the
remaining legacy callsites at a later point when more infrastructure is
in place.

In particular, the P2M code cannot be fixed to behave in this way at the
moment because the point at which the IOMMU changes are hooked in lacks
an order parameter.

> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index da8294bac8..289e0e2772 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -155,6 +155,13 @@ struct page_info;
>   */
>  typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id, void 
> *ctxt);
>  

This wants at least a comment stating that some IOMMUs require that we
issue a flush when modifying a not-present/otherwise invalid entry.

> +enum iommu_flush_type
> +{
> +IOMMU_FLUSH_none, /* no flush required */
> +IOMMU_FLUSH_added, /* no modified entries, just additional entries */

IOMMU_FLUSH_invalid ?  I think it is more descriptive of the scenario in
which it is used.

> +IOMMU_FLUSH_modified, /* modified entries */
> +};
> +
>  struct iommu_ops {
>  int (*init)(struct domain *d);
>  void (*hwdom_init)(struct domain *d);
> @@ -177,7 +184,8 @@ struct iommu_ops {
>   * other by the caller in order to have meaningful results.
>   */
>  int __must_check (*map_page)(struct domain *d, dfn_t dfn, mfn_t mfn,
> - unsigned int flags);
> + unsigned int flags,
> + enum iommu_flush_type *flush_type);

Returning the flush type via a pointer parameter is quite awkward.

How about folding a positive flush type in with negative errors?  I.e.
map_page() returns < 0 on error, 0 for success with no flush needed, and
a positive flush type for success with a flush required.

I think the result would be rather cleaner to read.
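The folded convention can be sketched like this; the helper, its parameters, and the FLUSH_* constants are hypothetical illustrations, not actual Xen code:

```c
#include <assert.h>
#include <errno.h>

#define FLUSH_NONE     0  /* success, no flush needed */
#define FLUSH_ADDED    1  /* success, previously-invalid entries written */
#define FLUSH_MODIFIED 2  /* success, valid entries changed */

/*
 * Folded return convention: negative errno on failure, zero for
 * success with no flush needed, positive flush type for success that
 * requires a flush.
 */
static int map_page_folded(int alloc_fails, int entry_was_valid)
{
    if ( alloc_fails )
        return -ENOMEM;  /* error path: < 0 */

    return entry_was_valid ? FLUSH_MODIFIED : FLUSH_ADDED;  /* > 0 */
}

/* Caller-side dispatch on the folded value. */
static const char *dispatch(int rc)
{
    if ( rc < 0 )
        return "error";
    if ( rc > 0 )
        return "flush";
    return "done";
}
```

The single return value keeps error handling and flush accumulation in one spot at each call site, which is the readability gain being argued for.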

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/9] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)

2018-11-30 Thread George Dunlap


> On Nov 28, 2018, at 3:32 PM, Wei Liu  wrote:
> 
> On Fri, Nov 23, 2018 at 05:14:56PM +, George Dunlap wrote:
>> QEMU_USER_BASE allows a user to specify the UID to use when running
>> the devicemodel for a specific domain number.  Unfortunately, this is
>> not really practical: It requires nearly 32,000 entries in
>> /etc/passwd.  QEMU_USER_RANGE_BASE is much more practical.
>> 
>> Remove support for QEMU_USER_BASE.
>> 
>> Signed-off-by: George Dunlap 
> 
> I have no problem with you removing this, but you forgot to change
> xl.cfg.pod.5.in.

Hmm… it seems what I *really* forgot to do was remove that entire section and 
point to the feature doc instead.  It’s quite out of date now (e.g., cdrom 
insert should work I think, qemu is chrooted so can’t read world-readable 
files, &c).  With your permission I’ll put fixing that up properly on a to-do 
list (to be done before 4.12 release at a minimum).

 -George
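For reference, the QEMU_USER_RANGE_BASE scheme the quoted patch keeps can be sketched as below: one base user whose uid is offset by the domain id, instead of ~32,000 individual passwd entries. The helper name is hypothetical; libxl's real lookup differs.

```c
#include <assert.h>

/*
 * Derive the uid to run a domain's device model as, from the uid of
 * the xen-qemuuser-range-base user. One passwd entry covers the whole
 * domid range, rather than one xen-qemuuser-domidNN entry per domid.
 */
static unsigned int qemu_uid_for_domid(unsigned int range_base_uid,
                                       unsigned int domid)
{
    return range_base_uid + domid;
}
```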

Re: [Xen-devel] [PATCH 1/2] amd-iommu: add flush iommu_ops

2018-11-30 Thread Andrew Cooper
On 30/11/2018 10:45, Paul Durrant wrote:
> diff --git a/xen/drivers/passthrough/amd/iommu_map.c 
> b/xen/drivers/passthrough/amd/iommu_map.c
> index 04cb7b3182..c05b042821 100644
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -631,6 +631,54 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
>  spin_unlock(&hd->arch.mapping_lock);
>  
>  amd_iommu_flush_pages(d, dfn_x(dfn), 0);
> +return 0;
> +}
> +
> +static unsigned long flush_count(dfn_t dfn, unsigned int page_count,
> + unsigned int order)
> +{
> +unsigned long start = dfn_x(dfn) / (1u << order);
> +unsigned long end = DIV_ROUND_UP(dfn_x(dfn) + page_count,
> + (1u << order));
> +
> +ASSERT(end > start);
> +return end - start;
> +}
> +
> +int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
> +unsigned int page_count)

What are the semantics here?  Why page_count rather than order?  Are we
guaranteed to be actually flushing consecutive dfn's ?

> +{
> +/* Match VT-d semantics */
> +if ( !page_count || dfn_eq(dfn, INVALID_DFN) ||

Do we really have callers passing these?  I'd argue that these should be
invalid to pass (accepting that we might need to tolerate them until
other cleanup can occur).

> + dfn_lt(dfn_add(dfn, page_count), dfn) /* overflow */ )

Given that all users of dfn here want it in unsigned long form, I'd make
a local dfn_l and use that.  I'm not sure that we want to start
accumulating a sporadic set of binary operator functions for the
typesafe variables.

> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 3d78126801..da8294bac8 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -52,6 +52,11 @@ static inline bool_t dfn_eq(dfn_t x, dfn_t y)
>  return dfn_x(x) == dfn_x(y);
>  }
>  
> +static inline bool_t dfn_lt(dfn_t x, dfn_t y)

No new introductions of bool_t please.  dfn_eq() shouldn't have been
bool_t either.

~Andrew


Re: [Xen-devel] [PATCH 1/2] tools/libxc: Fix issues with libxc and Xen having different featureset lengths

2018-11-30 Thread Andrew Cooper
On 30/11/2018 09:18, Jan Beulich wrote:
 On 29.11.18 at 20:20,  wrote:
>> In almost all cases, Xen and libxc will agree on the featureset length,
>> because they are built from the same source.
>>
>> However, there are circumstances (e.g. security hotfixes) where the 
>> featureset
>> gets longer and dom0 will, after installing updates, be running with an old
>> Xen but new libxc.  Despite writing the code with this scenario in mind, 
>> there
>> were some bugs.
>>
>> First, xen-cpuid's get_featureset() erroneously allocates a buffer based on
>> Xen's featureset length, but records libxc's length, which is longer.
> "... which may be longer", seeing that nr_features gets initialized from
> xc_get_cpu_featureset_size()'s return value, and its subsequent
> updating (through xc_get_cpu_featureset()) is only done in certain
> cases.

Ah yes - very true.  I'll also tweak the following sentence to start
with "In this situation, the hypercall bounce ..."

>
>> The hypercall bounce buffer code reads/writes the recorded length, which is
>> beyond the end of the allocated object, and a later free() encounters corrupt
>> heap metadata.  Fix this by recording the same length that we allocate.
>>
>> Secondly, get_cpuid_domain_info() has a related bug when the passed-in
>> featureset is a different length to libxc's.
>>
>> A large amount of the libxc cpuid functionality depends on info->featureset
>> being as long as expected, and it is allocated appropriately.  However, in 
>> the
>> case that a shorter external featureset is passed in, the logic to check for
>> trailing nonzero bits may read off the end of it.  Rework the logic to use 
>> the
>> correct upper bound.
>>
>> In addition, leave a comment next to the fields in struct cpuid_domain_info
>> explaining the relationship between the various lengths, and how to cope with
>> different lengths.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 

Thanks,

~Andrew


Re: [Xen-devel] [PATCH 1/2] amd-iommu: add flush iommu_ops

2018-11-30 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper
> Sent: 30 November 2018 13:04
> To: Paul Durrant ; xen-devel@lists.xenproject.org
> Cc: Suravee Suthikulpanit ; Brian Woods
> ; Jan Beulich ; Wei Liu
> ; Roger Pau Monne 
> Subject: Re: [PATCH 1/2] amd-iommu: add flush iommu_ops
> 
> On 30/11/2018 10:45, Paul Durrant wrote:
> > diff --git a/xen/drivers/passthrough/amd/iommu_map.c
> b/xen/drivers/passthrough/amd/iommu_map.c
> > index 04cb7b3182..c05b042821 100644
> > --- a/xen/drivers/passthrough/amd/iommu_map.c
> > +++ b/xen/drivers/passthrough/amd/iommu_map.c
> > @@ -631,6 +631,54 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t
> dfn)
> >  spin_unlock(&hd->arch.mapping_lock);
> >
> >  amd_iommu_flush_pages(d, dfn_x(dfn), 0);
> > +return 0;
> > +}
> > +
> > +static unsigned long flush_count(dfn_t dfn, unsigned int page_count,
> > + unsigned int order)
> > +{
> > +unsigned long start = dfn_x(dfn) / (1u << order);
> > +unsigned long end = DIV_ROUND_UP(dfn_x(dfn) + page_count,
> > + (1u << order));
> > +
> > +ASSERT(end > start);
> > +return end - start;
> > +}
> > +
> > +int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
> > +unsigned int page_count)
> 
> What are the semantics here?  Why page_count rather than order?

Because that's the way the iommu_ops method is declared in the header?

>  Are we
> guaranteed to be actually flushing consecutive dfn's ?

I believe so. Looking at the wrapper code and its callers, that seems to be the 
assumption.

> 
> > +{
> > +/* Match VT-d semantics */
> > +if ( !page_count || dfn_eq(dfn, INVALID_DFN) ||
> 
> Do we really have callers passing these?  I'd argue that these should be
> invalid to pass (accepting that we might need to tolerate them until
> other cleanup can occur).

I could look at cleaning that up. There aren't many callers.

> 
> > + dfn_lt(dfn_add(dfn, page_count), dfn) /* overflow */ )
> 
> Given that all users of dfn here want it in unsigned long form, I'd make
> a local dfn_l and use that.  I'm not sure that we want to start
> accumulating a sporadic set of binary operator functions for the
> typesafe variables.
> 

Ok, I don't really mind. I thought it was neater this way though.

> > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> > index 3d78126801..da8294bac8 100644
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -52,6 +52,11 @@ static inline bool_t dfn_eq(dfn_t x, dfn_t y)
> >  return dfn_x(x) == dfn_x(y);
> >  }
> >
> > +static inline bool_t dfn_lt(dfn_t x, dfn_t y)
> 
> No new introductions of bool_t please.  dfn_eq() shouldn't have been
> bool_t either.

Oops, yes. Cut'n'pasted the wrong thing.

  Paul

> 
> ~Andrew
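As a standalone illustration of the page_count-based semantics discussed above, the quoted flush_count() arithmetic behaves as follows. This is a simplified copy using plain unsigned long dfns rather than the typesafe dfn_t; only the arithmetic is taken from the patch.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * With consecutive dfns, this returns how many naturally-aligned
 * 2^order blocks the page range [dfn, dfn + page_count) touches --
 * i.e. how many flush operations of that order are needed.
 */
static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
                                 unsigned int order)
{
    unsigned long start = dfn / (1u << order);
    unsigned long end = DIV_ROUND_UP(dfn + page_count, 1u << order);

    return end - start;
}
```

Note how a misaligned range straddling a block boundary needs one extra flush at that order, which is what the DIV_ROUND_UP on the end captures.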

Re: [Xen-devel] Linux 4.19.5 fails to boot as Xen dom0

2018-11-30 Thread Hans van Kranenburg
On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
> On Thu, Nov 29, 2018 at 03:00:45PM +, Juergen Gross wrote:
>> On 29/11/2018 15:32, Kirill A. Shutemov wrote:
>>> On Thu, Nov 29, 2018 at 02:24:47PM +, Kirill A. Shutemov wrote:
 On Thu, Nov 29, 2018 at 01:35:17PM +, Juergen Gross wrote:
> On 29/11/2018 14:26, Kirill A. Shutemov wrote:
>> On Thu, Nov 29, 2018 at 09:41:25AM +, Juergen Gross wrote:
>>> On 29/11/2018 02:22, Hans van Kranenburg wrote:
 Hi,

 As also seen at:
 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951

 Attached there are two serial console output logs. One is starting with
 Xen 4.11 (from debian unstable) as dom0, and the other one without Xen.

 [2.085543] BUG: unable to handle kernel paging request at
 888d9fffc000
 [2.085610] PGD 200c067 P4D 200c067 PUD 0
 [2.085674] Oops:  [#1] SMP NOPTI
 [2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
 [2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 05/21/2018
 [2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
 [...]
>>>
>>> The offending stable commit is 4074ca7d8a1832921c865d250bbd08f3441b3657
>>> ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), this
>>> is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
>>>
>>> Current upstream kernel is booting fine under Xen, so in general the
>>> patch should be fine. Using an upstream kernel built from above commit
>>> (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is fine,
>>> too.
>>>
>>> Kirill, are you aware of any prerequisite patch from 4.20 which could be
>>> missing in 4.19.5?
>>
>> I'm not.
>>
>> Let me look into this.
>>
>
> What makes me suspicious is that the failure happens just after
> releasing the init memory. Maybe there is an access to the .init.data
> segment or similar? That the native kernel boots fine could be related
> to 2M mappings not being available in a PV domain.

 Ahh.. Could you test this:

 diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
 index a12afff146d1..7dec63ec7aab 100644
 --- a/arch/x86/mm/dump_pagetables.c
 +++ b/arch/x86/mm/dump_pagetables.c
 @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
 * 8000 - 87ff is reserved for
 * the hypervisor.
 */
 -  return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
 +  return  (idx >= pgd_index(__PAGE_OFFSET) - 17) &&
(idx <  pgd_index(__PAGE_OFFSET));
  #else
return false;
>>>
>>> Or, better, this:
>>
>> That makes it boot again!
>>
>> Any idea why upstream doesn't need it?
> 
> Nope.
> 
> I'll prepare a proper fix.
> 

Thanks for looking into this.

In the meantime, I applied the "Or, better, this" change, and my dom0
boots again.

FYI, boot log now (paste valid for 90 days):
https://paste.debian.net/plainh/48940826

Hans


Re: [Xen-devel] [PATCH 2/2] tools/libxc: Fix error handling in get_cpuid_domain_info()

2018-11-30 Thread Andrew Cooper
On 30/11/2018 09:23, Jan Beulich wrote:
 On 29.11.18 at 20:20,  wrote:
>> get_cpuid_domain_info() has two conflicting return styles - either -error for
>> local failures, or -1/errno for hypercall failures.  Switch to consistently
>> use -error.
>>
>> While fixing the xc_get_cpu_featureset(), take the opportunity to remove the
>> redundancy and move it to be adjacent to the other featureset handling.
> Having noticed the redundancy before, I was assuming this was done
> to prevent altering host_nr_features early, just in case future code
> additions would want to use its original value. The way it is now, the
> code folding looks correct.

I don't recall a deliberate intention to do that, but it was a long time
ago now.

>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 

Thanks.

~Andrew


[Xen-devel] [PATCH 03/23] arm: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Arm already returns (~(dma_addr_t)0x0) on mapping failures, so we can
switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping
code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/common/dmabounce.c  | 12 +++---
 arch/arm/include/asm/dma-iommu.h |  2 --
 arch/arm/mm/dma-mapping.c| 39 
 3 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 9a92de63426f..5ba4622030ca 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -257,7 +257,7 @@ static inline dma_addr_t map_single(struct device *dev, 
void *ptr, size_t size,
if (buf == NULL) {
dev_err(dev, "%s: unable to map unsafe buffer %p!\n",
   __func__, ptr);
-   return ARM_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
@@ -327,7 +327,7 @@ static dma_addr_t dmabounce_map_page(struct device *dev, 
struct page *page,
 
ret = needs_bounce(dev, dma_addr, size);
if (ret < 0)
-   return ARM_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
if (ret == 0) {
arm_dma_ops.sync_single_for_device(dev, dma_addr, size, dir);
@@ -336,7 +336,7 @@ static dma_addr_t dmabounce_map_page(struct device *dev, 
struct page *page,
 
if (PageHighMem(page)) {
dev_err(dev, "DMA buffer bouncing of HIGHMEM pages is not 
supported\n");
-   return ARM_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
return map_single(dev, page_address(page) + offset, size, dir, attrs);
@@ -453,11 +453,6 @@ static int dmabounce_dma_supported(struct device *dev, u64 
dma_mask)
return arm_dma_ops.dma_supported(dev, dma_mask);
 }
 
-static int dmabounce_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return arm_dma_ops.mapping_error(dev, dma_addr);
-}
-
 static const struct dma_map_ops dmabounce_ops = {
.alloc  = arm_dma_alloc,
.free   = arm_dma_free,
@@ -472,7 +467,6 @@ static const struct dma_map_ops dmabounce_ops = {
.sync_sg_for_cpu= arm_dma_sync_sg_for_cpu,
.sync_sg_for_device = arm_dma_sync_sg_for_device,
.dma_supported  = dmabounce_dma_supported,
-   .mapping_error  = dmabounce_mapping_error,
 };
 
 static int dmabounce_init_pool(struct dmabounce_pool *pool, struct device *dev,
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
index 6821f1249300..772f48ef84b7 100644
--- a/arch/arm/include/asm/dma-iommu.h
+++ b/arch/arm/include/asm/dma-iommu.h
@@ -9,8 +9,6 @@
 #include 
 #include 
 
-#define ARM_MAPPING_ERROR  (~(dma_addr_t)0x0)
-
 struct dma_iommu_mapping {
/* iommu specific data */
struct iommu_domain *domain;
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 661fe48ab78d..2cfb17bad1e6 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -179,11 +179,6 @@ static void arm_dma_sync_single_for_device(struct device 
*dev,
__dma_page_cpu_to_dev(page, offset, size, dir);
 }
 
-static int arm_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == ARM_MAPPING_ERROR;
-}
-
 const struct dma_map_ops arm_dma_ops = {
.alloc  = arm_dma_alloc,
.free   = arm_dma_free,
@@ -197,7 +192,6 @@ const struct dma_map_ops arm_dma_ops = {
.sync_single_for_device = arm_dma_sync_single_for_device,
.sync_sg_for_cpu= arm_dma_sync_sg_for_cpu,
.sync_sg_for_device = arm_dma_sync_sg_for_device,
-   .mapping_error  = arm_dma_mapping_error,
.dma_supported  = arm_dma_supported,
 };
 EXPORT_SYMBOL(arm_dma_ops);
@@ -217,7 +211,6 @@ const struct dma_map_ops arm_coherent_dma_ops = {
.get_sgtable= arm_dma_get_sgtable,
.map_page   = arm_coherent_dma_map_page,
.map_sg = arm_dma_map_sg,
-   .mapping_error  = arm_dma_mapping_error,
.dma_supported  = arm_dma_supported,
 };
 EXPORT_SYMBOL(arm_coherent_dma_ops);
@@ -774,7 +767,7 @@ static void *__dma_alloc(struct device *dev, size_t size, 
dma_addr_t *handle,
gfp &= ~(__GFP_COMP);
args.gfp = gfp;
 
-   *handle = ARM_MAPPING_ERROR;
+   *handle = DMA_MAPPING_ERROR;
allowblock = gfpflags_allow_blocking(gfp);
cma = allowblock ? dev_get_cma_area(dev) : false;
 
@@ -1217,7 +1210,7 @@ static inline dma_addr_t __alloc_iova(struct 
dma_iommu_mapping *mapping,
if (i == mapping->nr_bitmaps) {
if (extend_iommu_mapping(mapping)) {
spin_unlock_irqrestore(&mapping->lock, flags);
-   ret

[Xen-devel] [PATCH 01/23] dma-mapping: provide a generic DMA_MAPPING_ERROR

2018-11-30 Thread Christoph Hellwig
Error handling of the dma_map_single and dma_map_page APIs is a little
problematic at the moment, in that we use different encodings in the
returned dma_addr_t to indicate an error.  That means we require an
additional indirect call to figure out if a dma mapping call returned
an error, and a lot of boilerplate code to implement these semantics.

Instead return the maximum addressable value as the error.  As long
as we don't allow mapping single-byte ranges with single-byte alignment
this value can never be a valid return.  Additionally, if drivers do
not check the return value from the dma_map* routines, this value means
they will generally not be pointed at actual memory.

Once the default value is added here we can start removing the
various mapping_error methods and just rely on this generic check.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-mapping.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 0f81c713f6e9..46bd612d929e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -133,6 +133,8 @@ struct dma_map_ops {
u64 (*get_required_mask)(struct device *dev);
 };
 
+#define DMA_MAPPING_ERROR  (~(dma_addr_t)0)
+
 extern const struct dma_map_ops dma_direct_ops;
 extern const struct dma_map_ops dma_virt_ops;
 
@@ -576,6 +578,10 @@ static inline int dma_mapping_error(struct device *dev, 
dma_addr_t dma_addr)
const struct dma_map_ops *ops = get_dma_ops(dev);
 
debug_dma_mapping_error(dev, dma_addr);
+
+   if (dma_addr == DMA_MAPPING_ERROR)
+   return 1;
+
if (ops->mapping_error)
return ops->mapping_error(dev, dma_addr);
return 0;
-- 
2.19.1



[Xen-devel] remove the ->mapping_error method from dma_map_ops V3

2018-11-30 Thread Christoph Hellwig
Error reporting for the dma_map_single and dma_map_page operations is
currently a mess.  Both APIs directly return the dma_addr_t to be used for
the DMA, with a magic error escape that is specific to the instance and
checked by another method provided.  This has a few downsides:

 - the error check is easily forgotten and a __must_check marker doesn't
   help as the value is always consumed anyway
 - the error checking requires another indirect call, and indirect calls
   have become incredibly expensive
 - a lot of code is wasted on implementing these methods

The historical reason for this is that people thought DMA mappings would
not fail when the API was created, which sounds like a really bad
assumption in retrospective, and then we tried to cram error handling
onto it later on.

There are basically two variants:  the error code is 0 because the
implementation will never return 0 as a valid DMA address, or the error
code is all-Fs because the implementation won't ever return an address
that high.  The old AMD GART is the only one not falling into these two
camps, as it picks a sort of relative zero based on where it is mapped.

The 0 return doesn't work for direct mappings that have ram at address
zero and a lot of IOMMUs that start allocating bus space from address
zero, so we can't consolidate on that, but I think we can move everyone
to all-Fs, which the patch here does.  The reason for that is that
there is only one way to ever get this address: by doing a 1-byte long,
1-byte aligned mapping, but all our mappings are not only longer but
generally aligned, and the mappings have to keep at least the basic
alignment.
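The "only a 1-byte, 1-byte-aligned mapping could ever collide with all-Fs" argument can be checked mechanically. This is a simplified model under that assumption, not kernel code: a mapping returning start address a for size bytes occupies [a, a + size - 1], so the start can only equal all-Fs when size is 1 and the alignment constraint permits that address.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;
#define DMA_MAPPING_ERROR ((dma_addr_t)~(dma_addr_t)0)

/*
 * Can a non-wrapping mapping of the given size and alignment ever
 * start at the all-Fs address? The highest in-range start address is
 * all-Fs - size + 1; it must both equal all-Fs (size == 1) and satisfy
 * the alignment for a collision to be possible.
 */
static int valid_mapping_can_return_all_fs(uint64_t size, uint64_t align)
{
    dma_addr_t last_start = DMA_MAPPING_ERROR - size + 1;

    return last_start == DMA_MAPPING_ERROR && (last_start % align) == 0;
}
```

Since real mappings are longer than a byte and keep at least basic alignment, the all-Fs value is free to serve as the error sentinel.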

A git tree is also available here:

git://git.infradead.org/users/hch/misc.git dma-mapping-error.3

Gitweb:


http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-mapping-error.3

Changes since v2:
 - fix a compile error in the ia64 sba_iommu driver
 - return an errno value from dma_mapping_error

Changes since v1:
 - dropped the signature change
 - split into multiple patches
 - fixed the iova allocator return check in amd-iommu
 - remove EMERGENCY_PAGES in amd_gart and calgary


[Xen-devel] [PATCH 08/23] parisc/ccio: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
The CCIO iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 drivers/parisc/ccio-dma.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index 701a7d6a74d5..714aac72df0e 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -110,8 +110,6 @@
 #define CMD_TLB_DIRECT_WRITE 35 /* IO_COMMAND for I/O TLB Writes */
 #define CMD_TLB_PURGE33 /* IO_COMMAND to Purge I/O TLB entry */
 
-#define CCIO_MAPPING_ERROR(~(dma_addr_t)0)
-
 struct ioa_registers {
 /* Runway Supervisory Set */
 int32_tunused1[12];
@@ -740,7 +738,7 @@ ccio_map_single(struct device *dev, void *addr, size_t size,
BUG_ON(!dev);
ioc = GET_IOC(dev);
if (!ioc)
-   return CCIO_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
BUG_ON(size <= 0);
 
@@ -1021,11 +1019,6 @@ ccio_unmap_sg(struct device *dev, struct scatterlist 
*sglist, int nents,
DBG_RUN_SG("%s() DONE (nents %d)\n", __func__, nents);
 }
 
-static int ccio_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == CCIO_MAPPING_ERROR;
-}
-
 static const struct dma_map_ops ccio_ops = {
.dma_supported =ccio_dma_supported,
.alloc =ccio_alloc,
@@ -1034,7 +1027,6 @@ static const struct dma_map_ops ccio_ops = {
.unmap_page =   ccio_unmap_page,
.map_sg =   ccio_map_sg,
.unmap_sg = ccio_unmap_sg,
-   .mapping_error =ccio_mapping_error,
 };
 
 #ifdef CONFIG_PROC_FS
-- 
2.19.1



[Xen-devel] [PATCH 06/23] s390: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
S390 already returns (~(dma_addr_t)0x0) on mapping failures, so we can
switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping
code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/s390/pci/pci_dma.c | 18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index d387a0fbdd7e..346ba382193a 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -15,8 +15,6 @@
 #include 
 #include 
 
-#define S390_MAPPING_ERROR (~(dma_addr_t) 0x0)
-
 static struct kmem_cache *dma_region_table_cache;
 static struct kmem_cache *dma_page_table_cache;
 static int s390_iommu_strict;
@@ -301,7 +299,7 @@ static dma_addr_t dma_alloc_address(struct device *dev, int 
size)
 
 out_error:
spin_unlock_irqrestore(&zdev->iommu_bitmap_lock, flags);
-   return S390_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 }
 
 static void dma_free_address(struct device *dev, dma_addr_t dma_addr, int size)
@@ -349,7 +347,7 @@ static dma_addr_t s390_dma_map_pages(struct device *dev, 
struct page *page,
/* This rounds up number of pages based on size and offset */
nr_pages = iommu_num_pages(pa, size, PAGE_SIZE);
dma_addr = dma_alloc_address(dev, nr_pages);
-   if (dma_addr == S390_MAPPING_ERROR) {
+   if (dma_addr == DMA_MAPPING_ERROR) {
ret = -ENOSPC;
goto out_err;
}
@@ -372,7 +370,7 @@ static dma_addr_t s390_dma_map_pages(struct device *dev, 
struct page *page,
 out_err:
zpci_err("map error:\n");
zpci_err_dma(ret, pa);
-   return S390_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 }
 
 static void s390_dma_unmap_pages(struct device *dev, dma_addr_t dma_addr,
@@ -449,7 +447,7 @@ static int __s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
int ret;
 
dma_addr_base = dma_alloc_address(dev, nr_pages);
-   if (dma_addr_base == S390_MAPPING_ERROR)
+   if (dma_addr_base == DMA_MAPPING_ERROR)
return -ENOMEM;
 
dma_addr = dma_addr_base;
@@ -496,7 +494,7 @@ static int s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
for (i = 1; i < nr_elements; i++) {
s = sg_next(s);
 
-   s->dma_address = S390_MAPPING_ERROR;
+   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
 
if (s->offset || (size & ~PAGE_MASK) ||
@@ -546,11 +544,6 @@ static void s390_dma_unmap_sg(struct device *dev, struct 
scatterlist *sg,
}
 }

-static int s390_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == S390_MAPPING_ERROR;
-}
-
 int zpci_dma_init_device(struct zpci_dev *zdev)
 {
int rc;
@@ -675,7 +668,6 @@ const struct dma_map_ops s390_pci_dma_ops = {
.unmap_sg   = s390_dma_unmap_sg,
.map_page   = s390_dma_map_pages,
.unmap_page = s390_dma_unmap_pages,
-   .mapping_error  = s390_mapping_error,
/* dma_supported is unconditionally true without a callback */
 };
 EXPORT_SYMBOL_GPL(s390_pci_dma_ops);
-- 
2.19.1



[Xen-devel] [PATCH 07/23] sparc: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Sparc already returns (~(dma_addr_t)0x0) on mapping failures, so we can
switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping
code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/kernel/iommu.c| 12 +++-
 arch/sparc/kernel/iommu_common.h |  2 --
 arch/sparc/kernel/pci_sun4v.c| 14 --
 3 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 40d008b0bd3e..0626bae5e3da 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -315,7 +315,7 @@ static dma_addr_t dma_4u_map_page(struct device *dev, 
struct page *page,
 bad_no_ctx:
if (printk_ratelimit())
WARN_ON(1);
-   return SPARC_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 }
 
 static void strbuf_flush(struct strbuf *strbuf, struct iommu *iommu,
@@ -548,7 +548,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
 
if (outcount < incount) {
outs = sg_next(outs);
-   outs->dma_address = SPARC_MAPPING_ERROR;
+   outs->dma_address = DMA_MAPPING_ERROR;
outs->dma_length = 0;
}
 
@@ -574,7 +574,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
iommu_tbl_range_free(&iommu->tbl, vaddr, npages,
 IOMMU_ERROR_CODE);
 
-   s->dma_address = SPARC_MAPPING_ERROR;
+   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
}
if (s == outs)
@@ -742,11 +742,6 @@ static void dma_4u_sync_sg_for_cpu(struct device *dev,
spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-static int dma_4u_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == SPARC_MAPPING_ERROR;
-}
-
 static int dma_4u_supported(struct device *dev, u64 device_mask)
 {
struct iommu *iommu = dev->archdata.iommu;
@@ -772,7 +767,6 @@ static const struct dma_map_ops sun4u_dma_ops = {
.sync_single_for_cpu= dma_4u_sync_single_for_cpu,
.sync_sg_for_cpu= dma_4u_sync_sg_for_cpu,
.dma_supported  = dma_4u_supported,
-   .mapping_error  = dma_4u_mapping_error,
 };
 
 const struct dma_map_ops *dma_ops = &sun4u_dma_ops;
diff --git a/arch/sparc/kernel/iommu_common.h b/arch/sparc/kernel/iommu_common.h
index e3c02ba32500..d62ed9c5682d 100644
--- a/arch/sparc/kernel/iommu_common.h
+++ b/arch/sparc/kernel/iommu_common.h
@@ -48,6 +48,4 @@ static inline int is_span_boundary(unsigned long entry,
return iommu_is_span_boundary(entry, nr, shift, boundary_size);
 }
 
-#define SPARC_MAPPING_ERROR(~(dma_addr_t)0x0)
-
 #endif /* _IOMMU_COMMON_H */
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 565d9ac883d0..fa0e42b4cbfb 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -414,12 +414,12 @@ static dma_addr_t dma_4v_map_page(struct device *dev, 
struct page *page,
 bad:
if (printk_ratelimit())
WARN_ON(1);
-   return SPARC_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
 iommu_map_fail:
local_irq_restore(flags);
iommu_tbl_range_free(tbl, bus_addr, npages, IOMMU_ERROR_CODE);
-   return SPARC_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 }
 
 static void dma_4v_unmap_page(struct device *dev, dma_addr_t bus_addr,
@@ -592,7 +592,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
 
if (outcount < incount) {
outs = sg_next(outs);
-   outs->dma_address = SPARC_MAPPING_ERROR;
+   outs->dma_address = DMA_MAPPING_ERROR;
outs->dma_length = 0;
}
 
@@ -609,7 +609,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
iommu_tbl_range_free(tbl, vaddr, npages,
 IOMMU_ERROR_CODE);
/* XXX demap? XXX */
-   s->dma_address = SPARC_MAPPING_ERROR;
+   s->dma_address = DMA_MAPPING_ERROR;
s->dma_length = 0;
}
if (s == outs)
@@ -688,11 +688,6 @@ static int dma_4v_supported(struct device *dev, u64 
device_mask)
return pci64_dma_supported(to_pci_dev(dev), device_mask);
 }
 
-static int dma_4v_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == SPARC_MAPPING_ERROR;
-}
-
 static const struct dma_map_ops sun4v_dma_ops = {
.alloc  = dma_4v_alloc_coherent,
.free   = dma_4v_free_coherent,
@@ -701,7 +696,6 @@ static const struct dma_map_ops sun4v_dma_ops = {
.map_sg = dma_4v_map_sg,
.unmap_sg   = dma_4v_unmap_sg,

[Xen-devel] [PATCH 04/23] powerpc/iommu: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
The powerpc iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/include/asm/iommu.h |  4 
 arch/powerpc/kernel/dma-iommu.c  |  6 --
 arch/powerpc/kernel/iommu.c  | 28 ++--
 arch/powerpc/platforms/cell/iommu.c  |  1 -
 arch/powerpc/platforms/pseries/vio.c |  3 +--
 5 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 35db0cbc9222..55312990d1d2 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -143,8 +143,6 @@ struct scatterlist;
 
 #ifdef CONFIG_PPC64
 
-#define IOMMU_MAPPING_ERROR(~(dma_addr_t)0x0)
-
 static inline void set_iommu_table_base(struct device *dev,
struct iommu_table *base)
 {
@@ -242,8 +240,6 @@ static inline int __init tce_iommu_bus_notifier_init(void)
 }
 #endif /* !CONFIG_IOMMU_API */
 
-int dma_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr);
-
 #else
 
 static inline void *get_iommu_table_base(struct device *dev)
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index f9fe2080ceb9..5ebacf0fe41a 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -106,11 +106,6 @@ static u64 dma_iommu_get_required_mask(struct device *dev)
return mask;
 }
 
-int dma_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == IOMMU_MAPPING_ERROR;
-}
-
 struct dma_map_ops dma_iommu_ops = {
.alloc  = dma_iommu_alloc_coherent,
.free   = dma_iommu_free_coherent,
@@ -121,6 +116,5 @@ struct dma_map_ops dma_iommu_ops = {
.map_page   = dma_iommu_map_page,
.unmap_page = dma_iommu_unmap_page,
.get_required_mask  = dma_iommu_get_required_mask,
-   .mapping_error  = dma_iommu_mapping_error,
 };
 EXPORT_SYMBOL(dma_iommu_ops);
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index f0dc680e659a..ca7f73488c62 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -197,11 +197,11 @@ static unsigned long iommu_range_alloc(struct device *dev,
if (unlikely(npages == 0)) {
if (printk_ratelimit())
WARN_ON(1);
-   return IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
if (should_fail_iommu(dev))
-   return IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
/*
 * We don't need to disable preemption here because any CPU can
@@ -277,7 +277,7 @@ static unsigned long iommu_range_alloc(struct device *dev,
} else {
/* Give up */
spin_unlock_irqrestore(&(pool->lock), flags);
-   return IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
}
 
@@ -309,13 +309,13 @@ static dma_addr_t iommu_alloc(struct device *dev, struct iommu_table *tbl,
  unsigned long attrs)
 {
unsigned long entry;
-   dma_addr_t ret = IOMMU_MAPPING_ERROR;
+   dma_addr_t ret = DMA_MAPPING_ERROR;
int build_fail;
 
entry = iommu_range_alloc(dev, tbl, npages, NULL, mask, align_order);
 
-   if (unlikely(entry == IOMMU_MAPPING_ERROR))
-   return IOMMU_MAPPING_ERROR;
+   if (unlikely(entry == DMA_MAPPING_ERROR))
+   return DMA_MAPPING_ERROR;
 
entry += tbl->it_offset;/* Offset into real TCE table */
ret = entry << tbl->it_page_shift;  /* Set the return dma address */
@@ -327,12 +327,12 @@ static dma_addr_t iommu_alloc(struct device *dev, struct iommu_table *tbl,
 
/* tbl->it_ops->set() only returns non-zero for transient errors.
 * Clean up the table bitmap in this case and return
-* IOMMU_MAPPING_ERROR. For all other errors the functionality is
+* DMA_MAPPING_ERROR. For all other errors the functionality is
 * not altered.
 */
if (unlikely(build_fail)) {
__iommu_free(tbl, ret, npages);
-   return IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
/* Flush/invalidate TLB caches if necessary */
@@ -477,7 +477,7 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
DBG("  - vaddr: %lx, size: %lx\n", vaddr, slen);
 
/* Handle failure */
-   if (unlikely(entry == IOMMU_MAPPING_ERROR)) {
+   if (unlikely(entry == DMA_MAPPING_ERROR)) {
if (!(attrs & DMA_ATTR_NO_WARN) &&
printk_ratelimit())
  

[Xen-devel] [PATCH 10/23] arm64: remove the dummy_dma_ops mapping_error method

2018-11-30 Thread Christoph Hellwig
Just return DMA_MAPPING_ERROR from __dummy_map_page and let the core
dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/arm64/mm/dma-mapping.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index a3ac26284845..fdc26ea5036c 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -282,7 +282,7 @@ static dma_addr_t __dummy_map_page(struct device *dev, struct page *page,
   enum dma_data_direction dir,
   unsigned long attrs)
 {
-   return 0;
+   return DMA_MAPPING_ERROR;
 }
 
 static void __dummy_unmap_page(struct device *dev, dma_addr_t dev_addr,
@@ -317,11 +317,6 @@ static void __dummy_sync_sg(struct device *dev,
 {
 }
 
-static int __dummy_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
-{
-   return 1;
-}
-
 static int __dummy_dma_supported(struct device *hwdev, u64 mask)
 {
return 0;
@@ -339,7 +334,6 @@ const struct dma_map_ops dummy_dma_ops = {
.sync_single_for_device = __dummy_sync_single,
.sync_sg_for_cpu= __dummy_sync_sg,
.sync_sg_for_device = __dummy_sync_sg,
-   .mapping_error  = __dummy_mapping_error,
.dma_supported  = __dummy_dma_supported,
 };
 EXPORT_SYMBOL(dummy_dma_ops);
-- 
2.19.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 15/23] x86/amd_gart: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of the magic bad_dma_addr on a dma
mapping failure and let the core dma-mapping code handle the rest.

Remove the magic EMERGENCY_PAGES that the bad_dma_addr gets redirected to.
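The difference between the old and new failure signalling can be modeled in userspace C (a sketch with assumed constants, not the kernel code): previously `bad_dma_addr` pointed into a reserved scratch window at the start of the aperture, so the unmap path had to range-check against it; now failure is an out-of-band sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;                 /* model type, not kernel headers */
#define DMA_MAPPING_ERROR ((dma_addr_t)~0ULL)

#define PAGE_SIZE        4096ULL
#define EMERGENCY_PAGES  32ULL               /* the removed scratch window */

/* Old scheme (sketch): any address inside the reserved window at the start
 * of the aperture meant "mapping failed", wasting 128KB of IOMMU space. */
static bool old_is_error(dma_addr_t bus_base, dma_addr_t addr)
{
	return addr < bus_base + EMERGENCY_PAGES * PAGE_SIZE;
}

/* New scheme: failure is the shared sentinel; no aperture pages are wasted
 * and the check is a plain equality. */
static bool new_is_error(dma_addr_t addr)
{
	return addr == DMA_MAPPING_ERROR;
}
```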

Signed-off-by: Christoph Hellwig 
---
 arch/x86/kernel/amd_gart_64.c | 39 ++-
 1 file changed, 6 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 3f9d1b4019bb..4e733de93f41 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -50,8 +50,6 @@ static unsigned long iommu_pages; /* .. and in pages */
 
 static u32 *iommu_gatt_base;   /* Remapping table */
 
-static dma_addr_t bad_dma_addr;
-
 /*
  * If this is disabled the IOMMU will use an optimized flushing strategy
  * of only flushing when an mapping is reused. With it true the GART is
@@ -74,8 +72,6 @@ static u32 gart_unmapped_entry;
(((x) & 0xfffff000) | (((x) >> 32) << 4) | GPTE_VALID | GPTE_COHERENT)
 #define GPTE_DECODE(x) (((x) & 0xfffff000) | (((u64)(x) & 0xff0) << 28))
 
-#define EMERGENCY_PAGES 32 /* = 128KB */
-
 #ifdef CONFIG_AGP
 #define AGPEXTERN extern
 #else
@@ -184,14 +180,6 @@ static void iommu_full(struct device *dev, size_t size, int dir)
 */
 
dev_err(dev, "PCI-DMA: Out of IOMMU space for %lu bytes\n", size);
-
-   if (size > PAGE_SIZE*EMERGENCY_PAGES) {
-   if (dir == PCI_DMA_FROMDEVICE || dir == PCI_DMA_BIDIRECTIONAL)
-   panic("PCI-DMA: Memory would be corrupted\n");
-   if (dir == PCI_DMA_TODEVICE || dir == PCI_DMA_BIDIRECTIONAL)
-   panic(KERN_ERR
-   "PCI-DMA: Random memory would be DMAed\n");
-   }
 #ifdef CONFIG_IOMMU_LEAK
dump_leak();
 #endif
@@ -220,7 +208,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
int i;
 
if (unlikely(phys_mem + size > GART_MAX_PHYS_ADDR))
-   return bad_dma_addr;
+   return DMA_MAPPING_ERROR;
 
iommu_page = alloc_iommu(dev, npages, align_mask);
if (iommu_page == -1) {
@@ -229,7 +217,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
if (panic_on_overflow)
panic("dma_map_area overflow %lu bytes\n", size);
iommu_full(dev, size, dir);
-   return bad_dma_addr;
+   return DMA_MAPPING_ERROR;
}
 
for (i = 0; i < npages; i++) {
@@ -271,7 +259,7 @@ static void gart_unmap_page(struct device *dev, dma_addr_t dma_addr,
int npages;
int i;
 
-   if (dma_addr < iommu_bus_base + EMERGENCY_PAGES*PAGE_SIZE ||
+   if (dma_addr == DMA_MAPPING_ERROR ||
dma_addr >= iommu_bus_base + iommu_size)
return;
 
@@ -315,7 +303,7 @@ static int dma_map_sg_nonforce(struct device *dev, struct scatterlist *sg,
 
if (nonforced_iommu(dev, addr, s->length)) {
addr = dma_map_area(dev, addr, s->length, dir, 0);
-   if (addr == bad_dma_addr) {
+   if (addr == DMA_MAPPING_ERROR) {
if (i > 0)
gart_unmap_sg(dev, sg, i, dir, 0);
nents = 0;
@@ -471,7 +459,7 @@ static int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 
iommu_full(dev, pages << PAGE_SHIFT, dir);
for_each_sg(sg, s, nents, i)
-   s->dma_address = bad_dma_addr;
+   s->dma_address = DMA_MAPPING_ERROR;
return 0;
 }
 
@@ -490,7 +478,7 @@ gart_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_addr,
*dma_addr = dma_map_area(dev, virt_to_phys(vaddr), size,
DMA_BIDIRECTIONAL, (1UL << get_order(size)) - 1);
flush_gart();
-   if (unlikely(*dma_addr == bad_dma_addr))
+   if (unlikely(*dma_addr == DMA_MAPPING_ERROR))
goto out_free;
return vaddr;
 out_free:
@@ -507,11 +495,6 @@ gart_free_coherent(struct device *dev, size_t size, void *vaddr,
dma_direct_free_pages(dev, size, vaddr, dma_addr, attrs);
 }
 
-static int gart_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return (dma_addr == bad_dma_addr);
-}
-
 static int no_agp;
 
 static __init unsigned long check_iommu_size(unsigned long aper, u64 aper_size)
@@ -695,7 +678,6 @@ static const struct dma_map_ops gart_dma_ops = {
.unmap_page = gart_unmap_page,
.alloc  = gart_alloc_coherent,
.free   = gart_free_coherent,
-   .mapping_error  = gart_mapping_error,
.dma_supported  = dma_direct_supported,
 };
 
@@ -784,19 +766,12 @@ int __init gart_iommu_init(void)
}
 #endif
 
-   /*
-* Out of IOMMU space handling.

[Xen-devel] [PATCH 12/23] ia64/sba_iommu: improve internal map_page users

2018-11-30 Thread Christoph Hellwig
Remove the odd sba_{un,}map_single_attrs wrappers, check errors
everywhere.
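The removed `sba_map_single_attrs()` wrapper only split a kernel virtual address into a page and an offset before delegating to `sba_map_page()`. That address math can be sketched in userspace C (page-size constants assumed for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  (~((uintptr_t)((1UL << PAGE_SHIFT) - 1)))

/* Sketch of what the removed wrapper computed: the page base that
 * virt_to_page() would resolve, plus the offset into that page. */
static void split_vaddr(uintptr_t addr, uintptr_t *page_base, uintptr_t *offset)
{
	*page_base = addr & PAGE_MASK;    /* page-aligned base */
	*offset    = addr & ~PAGE_MASK;   /* offset within the page */
}
```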

Signed-off-by: Christoph Hellwig 
---
 arch/ia64/hp/common/sba_iommu.c | 73 +
 1 file changed, 29 insertions(+), 44 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index e8a93b07283e..c56f28c98102 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -907,11 +907,12 @@ sba_mark_invalid(struct ioc *ioc, dma_addr_t iova, size_t byte_cnt)
 }
 
 /**
- * sba_map_single_attrs - map one buffer and return IOVA for DMA
+ * sba_map_page - map one buffer and return IOVA for DMA
  * @dev: instance of PCI owned by the driver that's asking.
- * @addr:  driver buffer to map.
- * @size:  number of bytes to map in driver buffer.
- * @dir:  R/W or both.
+ * @page: page to map
+ * @poff: offset into page
+ * @size: number of bytes to map
+ * @dir: dma direction
  * @attrs: optional dma attributes
  *
  * See Documentation/DMA-API-HOWTO.txt
@@ -944,7 +945,7 @@ static dma_addr_t sba_map_page(struct device *dev, struct page *page,
** Device is bit capable of DMA'ing to the buffer...
** just return the PCI address of ptr
*/
-   DBG_BYPASS("sba_map_single_attrs() bypass mask/addr: "
+   DBG_BYPASS("sba_map_page() bypass mask/addr: "
   "0x%lx/0x%lx\n",
   to_pci_dev(dev)->dma_mask, pci_addr);
return pci_addr;
@@ -966,7 +967,7 @@ static dma_addr_t sba_map_page(struct device *dev, struct page *page,
 
 #ifdef ASSERT_PDIR_SANITY
spin_lock_irqsave(&ioc->res_lock, flags);
-   if (sba_check_pdir(ioc,"Check before sba_map_single_attrs()"))
+   if (sba_check_pdir(ioc,"Check before sba_map_page()"))
panic("Sanity check failed");
spin_unlock_irqrestore(&ioc->res_lock, flags);
 #endif
@@ -997,20 +998,12 @@ static dma_addr_t sba_map_page(struct device *dev, struct page *page,
/* form complete address */
 #ifdef ASSERT_PDIR_SANITY
spin_lock_irqsave(&ioc->res_lock, flags);
-   sba_check_pdir(ioc,"Check after sba_map_single_attrs()");
+   sba_check_pdir(ioc,"Check after sba_map_page()");
spin_unlock_irqrestore(&ioc->res_lock, flags);
 #endif
return SBA_IOVA(ioc, iovp, offset);
 }
 
-static dma_addr_t sba_map_single_attrs(struct device *dev, void *addr,
-  size_t size, enum dma_data_direction dir,
-  unsigned long attrs)
-{
-   return sba_map_page(dev, virt_to_page(addr),
-   (unsigned long)addr & ~PAGE_MASK, size, dir, attrs);
-}
-
 #ifdef ENABLE_MARK_CLEAN
 static SBA_INLINE void
 sba_mark_clean(struct ioc *ioc, dma_addr_t iova, size_t size)
@@ -1036,7 +1029,7 @@ sba_mark_clean(struct ioc *ioc, dma_addr_t iova, size_t size)
 #endif
 
 /**
- * sba_unmap_single_attrs - unmap one IOVA and free resources
+ * sba_unmap_page - unmap one IOVA and free resources
  * @dev: instance of PCI owned by the driver that's asking.
  * @iova:  IOVA of driver buffer previously mapped.
  * @size:  number of bytes mapped in driver buffer.
@@ -1063,7 +1056,7 @@ static void sba_unmap_page(struct device *dev, dma_addr_t iova, size_t size,
/*
** Address does not fall w/in IOVA, must be bypassing
*/
-   DBG_BYPASS("sba_unmap_single_attrs() bypass addr: 0x%lx\n",
+   DBG_BYPASS("sba_unmap_page() bypass addr: 0x%lx\n",
   iova);
 
 #ifdef ENABLE_MARK_CLEAN
@@ -1114,12 +1107,6 @@ static void sba_unmap_page(struct device *dev, dma_addr_t iova, size_t size,
 #endif /* DELAYED_RESOURCE_CNT == 0 */
 }
 
-void sba_unmap_single_attrs(struct device *dev, dma_addr_t iova, size_t size,
-   enum dma_data_direction dir, unsigned long attrs)
-{
-   sba_unmap_page(dev, iova, size, dir, attrs);
-}
-
 /**
  * sba_alloc_coherent - allocate/map shared mem for DMA
  * @dev: instance of PCI owned by the driver that's asking.
@@ -1132,30 +1119,24 @@ static void *
 sba_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
   gfp_t flags, unsigned long attrs)
 {
+   struct page *page;
struct ioc *ioc;
+   int node = -1;
void *addr;
 
ioc = GET_IOC(dev);
ASSERT(ioc);
-
 #ifdef CONFIG_NUMA
-   {
-   struct page *page;
-
-   page = alloc_pages_node(ioc->node, flags, get_order(size));
-   if (unlikely(!page))
-   return NULL;
-
-   addr = page_address(page);
-   }
-#else
-   addr = (void *) __get_free_pages(flags, get_order(size));
+   node = ioc->node;
 #endif
-   if (unlikely(!addr))
+
+   page = alloc_pages_node(node, flags, get_order(size));
+   if (unlikely(!page))
return NULL;
 
+   a

[Xen-devel] [PATCH 05/23] mips/jazz: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
The Jazz iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and
let the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/include/asm/jazzdma.h |  6 --
 arch/mips/jazz/jazzdma.c| 16 +---
 2 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/arch/mips/include/asm/jazzdma.h b/arch/mips/include/asm/jazzdma.h
index d913439c738c..d13f940022d5 100644
--- a/arch/mips/include/asm/jazzdma.h
+++ b/arch/mips/include/asm/jazzdma.h
@@ -39,12 +39,6 @@ extern int vdma_get_enable(int channel);
 #define VDMA_PAGE(a)   ((unsigned int)(a) >> 12)
 #define VDMA_OFFSET(a) ((unsigned int)(a) & (VDMA_PAGESIZE-1))
 
-/*
- * error code returned by vdma_alloc()
- * (See also arch/mips/kernel/jazzdma.c)
- */
-#define VDMA_ERROR 0xffffffff
-
 /*
  * VDMA pagetable entry description
  */
diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index 4c41ed0a637e..6256d35dbf4d 100644
--- a/arch/mips/jazz/jazzdma.c
+++ b/arch/mips/jazz/jazzdma.c
@@ -104,12 +104,12 @@ unsigned long vdma_alloc(unsigned long paddr, unsigned long size)
if (vdma_debug)
printk("vdma_alloc: Invalid physical address: %08lx\n",
   paddr);
-   return VDMA_ERROR;  /* invalid physical address */
+   return DMA_MAPPING_ERROR;   /* invalid physical address */
}
if (size > 0x40 || size == 0) {
if (vdma_debug)
printk("vdma_alloc: Invalid size: %08lx\n", size);
-   return VDMA_ERROR;  /* invalid physical address */
+   return DMA_MAPPING_ERROR;   /* invalid physical address */
}
 
spin_lock_irqsave(&vdma_lock, flags);
@@ -123,7 +123,7 @@ unsigned long vdma_alloc(unsigned long paddr, unsigned long size)
   first < VDMA_PGTBL_ENTRIES) first++;
if (first + pages > VDMA_PGTBL_ENTRIES) {   /* nothing free */
spin_unlock_irqrestore(&vdma_lock, flags);
spin_unlock_irqrestore(&vdma_lock, flags);
-   return VDMA_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
last = first + 1;
@@ -569,7 +569,7 @@ static void *jazz_dma_alloc(struct device *dev, size_t size,
return NULL;
 
*dma_handle = vdma_alloc(virt_to_phys(ret), size);
-   if (*dma_handle == VDMA_ERROR) {
+   if (*dma_handle == DMA_MAPPING_ERROR) {
dma_direct_free_pages(dev, size, ret, *dma_handle, attrs);
return NULL;
}
@@ -620,7 +620,7 @@ static int jazz_dma_map_sg(struct device *dev, struct scatterlist *sglist,
arch_sync_dma_for_device(dev, sg_phys(sg), sg->length,
dir);
sg->dma_address = vdma_alloc(sg_phys(sg), sg->length);
-   if (sg->dma_address == VDMA_ERROR)
+   if (sg->dma_address == DMA_MAPPING_ERROR)
return 0;
sg_dma_len(sg) = sg->length;
}
@@ -674,11 +674,6 @@ static void jazz_dma_sync_sg_for_cpu(struct device *dev,
arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
 }
 
-static int jazz_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == VDMA_ERROR;
-}
-
 const struct dma_map_ops jazz_dma_ops = {
.alloc  = jazz_dma_alloc,
.free   = jazz_dma_free,
@@ -692,6 +687,5 @@ const struct dma_map_ops jazz_dma_ops = {
.sync_sg_for_device = jazz_dma_sync_sg_for_device,
.dma_supported  = dma_direct_supported,
.cache_sync = arch_dma_cache_sync,
-   .mapping_error  = jazz_dma_mapping_error,
 };
 EXPORT_SYMBOL(jazz_dma_ops);
-- 
2.19.1



[Xen-devel] [PATCH 13/23] ia64/sba_iommu: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/ia64/hp/common/sba_iommu.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index c56f28c98102..0d21c0b5b23d 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -974,7 +974,7 @@ static dma_addr_t sba_map_page(struct device *dev, struct page *page,
 
pide = sba_alloc_range(ioc, dev, size);
if (pide < 0)
-   return 0;
+   return DMA_MAPPING_ERROR;
 
iovp = (dma_addr_t) pide << iovp_shift;
 
@@ -2155,11 +2155,6 @@ static int sba_dma_supported (struct device *dev, u64 mask)
	return ((mask & 0xFFFFFFFFUL) == 0xFFFFFFFFUL);
 }
 
-static int sba_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return 0;
-}
-
 __setup("nosbagart", nosbagart);
 
 static int __init
@@ -2193,7 +2188,6 @@ const struct dma_map_ops sba_dma_ops = {
.map_sg = sba_map_sg_attrs,
.unmap_sg   = sba_unmap_sg_attrs,
.dma_supported  = sba_dma_supported,
-   .mapping_error  = sba_dma_mapping_error,
 };
 
 void sba_dma_init(void)
-- 
2.19.1



[Xen-devel] [PATCH 14/23] ia64/sn: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/ia64/sn/pci/pci_dma.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/ia64/sn/pci/pci_dma.c b/arch/ia64/sn/pci/pci_dma.c
index 4ce4ee4ef9f2..b7d42e4edc1f 100644
--- a/arch/ia64/sn/pci/pci_dma.c
+++ b/arch/ia64/sn/pci/pci_dma.c
@@ -196,7 +196,7 @@ static dma_addr_t sn_dma_map_page(struct device *dev, struct page *page,
 
if (!dma_addr) {
printk(KERN_ERR "%s: out of ATEs\n", __func__);
-   return 0;
+   return DMA_MAPPING_ERROR;
}
return dma_addr;
 }
@@ -314,11 +314,6 @@ static int sn_dma_map_sg(struct device *dev, struct scatterlist *sgl,
return nhwentries;
 }
 
-static int sn_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return 0;
-}
-
 static u64 sn_dma_get_required_mask(struct device *dev)
 {
return DMA_BIT_MASK(64);
@@ -441,7 +436,6 @@ static struct dma_map_ops sn_dma_ops = {
.unmap_page = sn_dma_unmap_page,
.map_sg = sn_dma_map_sg,
.unmap_sg   = sn_dma_unmap_sg,
-   .mapping_error  = sn_dma_mapping_error,
.dma_supported  = sn_dma_supported,
.get_required_mask  = sn_dma_get_required_mask,
 };
-- 
2.19.1



[Xen-devel] [PATCH 09/23] parisc/sba_iommu: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
The SBA iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 drivers/parisc/sba_iommu.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c
index c1e599a429af..452d306ce5cb 100644
--- a/drivers/parisc/sba_iommu.c
+++ b/drivers/parisc/sba_iommu.c
@@ -93,8 +93,6 @@
 
 #define DEFAULT_DMA_HINT_REG   0
 
-#define SBA_MAPPING_ERROR(~(dma_addr_t)0)
-
 struct sba_device *sba_list;
 EXPORT_SYMBOL_GPL(sba_list);
 
@@ -725,7 +723,7 @@ sba_map_single(struct device *dev, void *addr, size_t size,
 
ioc = GET_IOC(dev);
if (!ioc)
-   return SBA_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
/* save offset bits */
offset = ((dma_addr_t) (long) addr) & ~IOVP_MASK;
@@ -1080,11 +1078,6 @@ sba_unmap_sg(struct device *dev, struct scatterlist *sglist, int nents,
 
 }
 
-static int sba_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == SBA_MAPPING_ERROR;
-}
-
 static const struct dma_map_ops sba_ops = {
.dma_supported =sba_dma_supported,
.alloc =sba_alloc,
@@ -1093,7 +1086,6 @@ static const struct dma_map_ops sba_ops = {
.unmap_page =   sba_unmap_page,
.map_sg =   sba_map_sg,
.unmap_sg = sba_unmap_sg,
-   .mapping_error =sba_mapping_error,
 };
 
 
-- 
2.19.1



[Xen-devel] [PATCH 02/23] dma-direct: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
The dma-direct code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/kernel/dma-swiotlb.c |  1 -
 include/linux/dma-direct.h|  3 ---
 kernel/dma/direct.c   |  8 +---
 kernel/dma/swiotlb.c  | 11 +--
 4 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index 5fc335f4d9cd..3d8df2cf8be9 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -59,7 +59,6 @@ const struct dma_map_ops powerpc_swiotlb_dma_ops = {
.sync_single_for_device = swiotlb_sync_single_for_device,
.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
.sync_sg_for_device = swiotlb_sync_sg_for_device,
-   .mapping_error = dma_direct_mapping_error,
.get_required_mask = swiotlb_powerpc_get_required,
 };
 
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 9e66bfe369aa..e7600f92d876 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -5,8 +5,6 @@
 #include 
 #include 
 
-#define DIRECT_MAPPING_ERROR   (~(dma_addr_t)0)
-
 #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
 #include 
 #else
@@ -73,5 +71,4 @@ dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
 int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
enum dma_data_direction dir, unsigned long attrs);
 int dma_direct_supported(struct device *dev, u64 mask);
-int dma_direct_mapping_error(struct device *dev, dma_addr_t dma_addr);
 #endif /* _LINUX_DMA_DIRECT_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 22a12ab5a5e9..d4335a03193a 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -265,7 +265,7 @@ dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
dma_addr_t dma_addr = phys_to_dma(dev, phys);
 
if (!check_addr(dev, dma_addr, size, __func__))
-   return DIRECT_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
dma_direct_sync_single_for_device(dev, dma_addr, size, dir);
@@ -312,11 +312,6 @@ int dma_direct_supported(struct device *dev, u64 mask)
return mask >= phys_to_dma(dev, min_mask);
 }
 
-int dma_direct_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == DIRECT_MAPPING_ERROR;
-}
-
 const struct dma_map_ops dma_direct_ops = {
.alloc  = dma_direct_alloc,
.free   = dma_direct_free,
@@ -335,7 +330,6 @@ const struct dma_map_ops dma_direct_ops = {
 #endif
.get_required_mask  = dma_direct_get_required_mask,
.dma_supported  = dma_direct_supported,
-   .mapping_error  = dma_direct_mapping_error,
.cache_sync = arch_dma_cache_sync,
 };
 EXPORT_SYMBOL(dma_direct_ops);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 045930e32c0e..ff1ce81bb623 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -631,21 +631,21 @@ static dma_addr_t swiotlb_bounce_page(struct device *dev, phys_addr_t *phys,
if (unlikely(swiotlb_force == SWIOTLB_NO_FORCE)) {
dev_warn_ratelimited(dev,
"Cannot do DMA to address %pa\n", phys);
-   return DIRECT_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
/* Oh well, have to allocate and map a bounce buffer. */
*phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
*phys, size, dir, attrs);
if (*phys == SWIOTLB_MAP_ERROR)
-   return DIRECT_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
/* Ensure that the address returned is DMA'ble */
dma_addr = __phys_to_dma(dev, *phys);
if (unlikely(!dma_capable(dev, dma_addr, size))) {
swiotlb_tbl_unmap_single(dev, *phys, size, dir,
attrs | DMA_ATTR_SKIP_CPU_SYNC);
-   return DIRECT_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
return dma_addr;
@@ -680,7 +680,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 
if (!dev_is_dma_coherent(dev) &&
(attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0 &&
-   dev_addr != DIRECT_MAPPING_ERROR)
+   dev_addr != DMA_MAPPING_ERROR)
arch_sync_dma_for_device(dev, phys, size, dir);
 
return dev_addr;
@@ -789,7 +789,7 @@ swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl, int nelems,
for_each_sg(sgl, sg, nelems, i) {
sg->dma_address = swiotlb_map_page(dev, sg_page(sg), sg->offset,
sg->length, dir, attrs);
-   if (sg->dma_address == DIRECT

[Xen-devel] [PATCH 11/23] alpha: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 arch/alpha/kernel/pci_iommu.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 46e08e0d9181..e1716e0d92fd 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -291,7 +291,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, size_t size,
   use direct_map above, it now must be considered an error. */
if (! alpha_mv.mv_pci_tbi) {
printk_once(KERN_WARNING "pci_map_single: no HW sg\n");
-   return 0;
+   return DMA_MAPPING_ERROR;
}
 
arena = hose->sg_pci;
@@ -307,7 +307,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, size_t size,
if (dma_ofs < 0) {
printk(KERN_WARNING "pci_map_single failed: "
   "could not allocate dma page tables\n");
-   return 0;
+   return DMA_MAPPING_ERROR;
}
 
paddr &= PAGE_MASK;
@@ -455,7 +455,7 @@ static void *alpha_pci_alloc_coherent(struct device *dev, size_t size,
memset(cpu_addr, 0, size);
 
*dma_addrp = pci_map_single_1(pdev, cpu_addr, size, 0);
-   if (*dma_addrp == 0) {
+   if (*dma_addrp == DMA_MAPPING_ERROR) {
free_pages((unsigned long)cpu_addr, order);
if (alpha_mv.mv_pci_tbi || (gfp & GFP_DMA))
return NULL;
@@ -671,7 +671,7 @@ static int alpha_pci_map_sg(struct device *dev, struct scatterlist *sg,
sg->dma_address
  = pci_map_single_1(pdev, SG_ENT_VIRT_ADDRESS(sg),
 sg->length, dac_allowed);
-   return sg->dma_address != 0;
+   return sg->dma_address != DMA_MAPPING_ERROR;
}
 
start = sg;
@@ -935,11 +935,6 @@ iommu_unbind(struct pci_iommu_arena *arena, long pg_start, long pg_count)
return 0;
 }
 
-static int alpha_pci_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == 0;
-}
-
 const struct dma_map_ops alpha_pci_ops = {
.alloc  = alpha_pci_alloc_coherent,
.free   = alpha_pci_free_coherent,
@@ -947,7 +942,6 @@ const struct dma_map_ops alpha_pci_ops = {
.unmap_page = alpha_pci_unmap_page,
.map_sg = alpha_pci_map_sg,
.unmap_sg   = alpha_pci_unmap_sg,
-   .mapping_error  = alpha_pci_mapping_error,
.dma_supported  = alpha_pci_supported,
 };
 EXPORT_SYMBOL(alpha_pci_ops);
-- 
2.19.1



[Xen-devel] [PATCH 20/23] iommu/dma-iommu: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
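Note that dma-iommu's old sentinel was 0 (`IOMMU_MAPPING_ERROR`) rather than all-ones, so this patch also changes the value that invalidated scatterlist entries carry. The marking pattern can be modeled in userspace C (the struct below is a stand-in for the kernel scatterlist, assumed for illustration):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;                 /* model, not the kernel type */
#define DMA_MAPPING_ERROR ((dma_addr_t)~0ULL)

/* Minimal scatterlist stand-in for the sketch. */
struct sg_model {
	dma_addr_t dma_address;
	uint32_t   dma_length;
};

/* After the patch, invalidated entries carry the shared sentinel instead
 * of the driver-local value 0. */
static void invalidate_sg(struct sg_model *s, int nents)
{
	for (int i = 0; i < nents; i++) {
		s[i].dma_address = DMA_MAPPING_ERROR;
		s[i].dma_length  = 0;
	}
}
```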

Signed-off-by: Christoph Hellwig 
---
 arch/arm64/mm/dma-mapping.c |  7 +++
 drivers/iommu/dma-iommu.c   | 23 ---
 include/linux/dma-iommu.h   |  1 -
 3 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index fdc26ea5036c..4cc70029cf8d 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -397,7 +397,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
return NULL;
 
*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-   if (iommu_dma_mapping_error(dev, *handle)) {
+   if (*handle == DMA_MAPPING_ERROR) {
if (coherent)
__free_pages(page, get_order(size));
else
@@ -414,7 +414,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
return NULL;
 
*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-   if (iommu_dma_mapping_error(dev, *handle)) {
+   if (*handle == DMA_MAPPING_ERROR) {
dma_release_from_contiguous(dev, page,
size >> PAGE_SHIFT);
return NULL;
@@ -574,7 +574,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
 
if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-   !iommu_dma_mapping_error(dev, dev_addr))
+   dev_addr != DMA_MAPPING_ERROR)
__dma_map_area(page_address(page) + offset, size, dir);
 
return dev_addr;
@@ -657,7 +657,6 @@ static const struct dma_map_ops iommu_dma_ops = {
.sync_sg_for_device = __iommu_sync_sg_for_device,
.map_resource = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
-   .mapping_error = iommu_dma_mapping_error,
 };
 
 static int __init __iommu_dma_init(void)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d1b04753b204..60c7e9e9901e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -32,8 +32,6 @@
 #include 
 #include 
 
-#define IOMMU_MAPPING_ERROR0
-
 struct iommu_dma_msi_page {
struct list_headlist;
dma_addr_t  iova;
@@ -523,7 +521,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 {
__iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size);
__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-   *handle = IOMMU_MAPPING_ERROR;
+   *handle = DMA_MAPPING_ERROR;
 }
 
 /**
@@ -556,7 +554,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
dma_addr_t iova;
unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
 
-   *handle = IOMMU_MAPPING_ERROR;
+   *handle = DMA_MAPPING_ERROR;
 
min_size = alloc_sizes & -alloc_sizes;
if (min_size < PAGE_SIZE) {
@@ -649,11 +647,11 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 
iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
if (!iova)
-   return IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
iommu_dma_free_iova(cookie, iova, size);
-   return IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
return iova + iova_off;
 }
@@ -694,7 +692,7 @@ static int __finalise_sg(struct device *dev, struct scatterlist *sg, int nents,
 
s->offset += s_iova_off;
s->length = s_length;
-   sg_dma_address(s) = IOMMU_MAPPING_ERROR;
+   sg_dma_address(s) = DMA_MAPPING_ERROR;
sg_dma_len(s) = 0;
 
/*
@@ -737,11 +735,11 @@ static void __invalidate_sg(struct scatterlist *sg, int nents)
int i;
 
for_each_sg(sg, s, nents, i) {
-   if (sg_dma_address(s) != IOMMU_MAPPING_ERROR)
+   if (sg_dma_address(s) != DMA_MAPPING_ERROR)
s->offset += sg_dma_address(s);
if (sg_dma_len(s))
s->length = sg_dma_len(s);
-   sg_dma_address(s) = IOMMU_MAPPING_ERROR;
+   sg_dma_address(s) = DMA_MAPPING_ERROR;
sg_dma_len(s) = 0;
}
 }
@@ -858,11 +856,6 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
 }
 
-int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == IOMMU_MAPPING_ERROR;
-}
-
 

[Xen-devel] [PATCH 22/23] dma-mapping: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
No users left except for vmd, which just forwards it.

Signed-off-by: Christoph Hellwig 
---
 drivers/pci/controller/vmd.c | 6 --
 include/linux/dma-mapping.h  | 7 ---
 2 files changed, 13 deletions(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index e50b0b5815ff..98ce79eac128 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -394,11 +394,6 @@ static void vmd_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
vmd_dma_ops(dev)->sync_sg_for_device(to_vmd_dev(dev), sg, nents, dir);
 }
 
-static int vmd_mapping_error(struct device *dev, dma_addr_t addr)
-{
-   return vmd_dma_ops(dev)->mapping_error(to_vmd_dev(dev), addr);
-}
-
 static int vmd_dma_supported(struct device *dev, u64 mask)
 {
return vmd_dma_ops(dev)->dma_supported(to_vmd_dev(dev), mask);
@@ -446,7 +441,6 @@ static void vmd_setup_dma_ops(struct vmd_dev *vmd)
ASSIGN_VMD_DMA_OPS(source, dest, sync_single_for_device);
ASSIGN_VMD_DMA_OPS(source, dest, sync_sg_for_cpu);
ASSIGN_VMD_DMA_OPS(source, dest, sync_sg_for_device);
-   ASSIGN_VMD_DMA_OPS(source, dest, mapping_error);
ASSIGN_VMD_DMA_OPS(source, dest, dma_supported);
ASSIGN_VMD_DMA_OPS(source, dest, get_required_mask);
add_dma_domain(domain);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 46bd612d929e..2adef56c6069 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -128,7 +128,6 @@ struct dma_map_ops {
   enum dma_data_direction dir);
void (*cache_sync)(struct device *dev, void *vaddr, size_t size,
enum dma_data_direction direction);
-   int (*mapping_error)(struct device *dev, dma_addr_t dma_addr);
int (*dma_supported)(struct device *dev, u64 mask);
u64 (*get_required_mask)(struct device *dev);
 };
@@ -575,15 +574,9 @@ static inline void dma_free_coherent(struct device *dev, size_t size,
 
 static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
-   const struct dma_map_ops *ops = get_dma_ops(dev);
-
debug_dma_mapping_error(dev, dma_addr);
-
if (dma_addr == DMA_MAPPING_ERROR)
return 1;
-
-   if (ops->mapping_error)
-   return ops->mapping_error(dev, dma_addr);
return 0;
 }
 
-- 
2.19.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 16/23] x86/calgary: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of the magic bad_dma_addr on a dma
mapping failure and let the core dma-mapping code handle the rest.

Remove the magic EMERGENCY_PAGES that the bad_dma_addr gets redirected to.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/kernel/pci-calgary_64.c | 29 +++--
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/pci-calgary_64.c b/arch/x86/kernel/pci-calgary_64.c
index bbfc8b1e9104..e76ec1b8ed1f 100644
--- a/arch/x86/kernel/pci-calgary_64.c
+++ b/arch/x86/kernel/pci-calgary_64.c
@@ -51,8 +51,6 @@
 #include 
 #include 
 
-#define CALGARY_MAPPING_ERROR  0
-
 #ifdef CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT
 int use_calgary __read_mostly = 1;
 #else
@@ -157,8 +155,6 @@ static const unsigned long phb_debug_offsets[] = {
 
 #define PHB_DEBUG_STUFF_OFFSET 0x0020
 
-#define EMERGENCY_PAGES 32 /* = 128KB */
-
 unsigned int specified_table_size = TCE_TABLE_SIZE_UNSPECIFIED;
 static int translate_empty_slots __read_mostly = 0;
 static int calgary_detected __read_mostly = 0;
@@ -255,7 +251,7 @@ static unsigned long iommu_range_alloc(struct device *dev,
if (panic_on_overflow)
panic("Calgary: fix the allocator.\n");
else
-   return CALGARY_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
}
 
@@ -274,11 +270,10 @@ static dma_addr_t iommu_alloc(struct device *dev, struct iommu_table *tbl,
dma_addr_t ret;
 
entry = iommu_range_alloc(dev, tbl, npages);
-
-   if (unlikely(entry == CALGARY_MAPPING_ERROR)) {
+   if (unlikely(entry == DMA_MAPPING_ERROR)) {
pr_warn("failed to allocate %u pages in iommu %p\n",
npages, tbl);
-   return CALGARY_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
}
 
/* set the return dma address */
@@ -298,8 +293,7 @@ static void iommu_free(struct iommu_table *tbl, dma_addr_t dma_addr,
unsigned long flags;
 
/* were we called with bad_dma_address? */
-   badend = CALGARY_MAPPING_ERROR + (EMERGENCY_PAGES * PAGE_SIZE);
-   if (unlikely(dma_addr < badend)) {
+   if (unlikely(dma_addr == DMA_MAPPING_ERROR)) {
WARN(1, KERN_ERR "Calgary: driver tried unmapping bad DMA "
   "address 0x%Lx\n", dma_addr);
return;
@@ -383,7 +377,7 @@ static int calgary_map_sg(struct device *dev, struct scatterlist *sg,
npages = iommu_num_pages(vaddr, s->length, PAGE_SIZE);
 
entry = iommu_range_alloc(dev, tbl, npages);
-   if (entry == CALGARY_MAPPING_ERROR) {
+   if (entry == DMA_MAPPING_ERROR) {
/* makes sure unmap knows to stop */
s->dma_length = 0;
goto error;
@@ -401,7 +395,7 @@ static int calgary_map_sg(struct device *dev, struct scatterlist *sg,
 error:
calgary_unmap_sg(dev, sg, nelems, dir, 0);
for_each_sg(sg, s, nelems, i) {
-   sg->dma_address = CALGARY_MAPPING_ERROR;
+   sg->dma_address = DMA_MAPPING_ERROR;
sg->dma_length = 0;
}
return 0;
@@ -454,7 +448,7 @@ static void* calgary_alloc_coherent(struct device *dev, size_t size,
 
/* set up tces to cover the allocated range */
mapping = iommu_alloc(dev, tbl, ret, npages, DMA_BIDIRECTIONAL);
-   if (mapping == CALGARY_MAPPING_ERROR)
+   if (mapping == DMA_MAPPING_ERROR)
goto free;
*dma_handle = mapping;
return ret;
@@ -479,11 +473,6 @@ static void calgary_free_coherent(struct device *dev, size_t size,
free_pages((unsigned long)vaddr, get_order(size));
 }
 
-static int calgary_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == CALGARY_MAPPING_ERROR;
-}
-
 static const struct dma_map_ops calgary_dma_ops = {
.alloc = calgary_alloc_coherent,
.free = calgary_free_coherent,
@@ -491,7 +480,6 @@ static const struct dma_map_ops calgary_dma_ops = {
.unmap_sg = calgary_unmap_sg,
.map_page = calgary_map_page,
.unmap_page = calgary_unmap_page,
-   .mapping_error = calgary_mapping_error,
.dma_supported = dma_direct_supported,
 };
 
@@ -739,9 +727,6 @@ static void __init calgary_reserve_regions(struct pci_dev *dev)
u64 start;
struct iommu_table *tbl = pci_iommu(dev->bus);
 
-   /* reserve EMERGENCY_PAGES from bad_dma_address and up */
-   iommu_range_reserve(tbl, CALGARY_MAPPING_ERROR, EMERGENCY_PAGES);
-
/* avoid the BIOS/VGA first 640KB-1MB region */
/* for CalIOC2 - avoid the entire first MB */
if (is_calgary(dev->device)) {
-- 
2.19.1



[Xen-devel] [PATCH 17/23] iommu: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Note that the existing code used AMD_IOMMU_MAPPING_ERROR to check for
a 0 return from the IOVA allocator, which is replaced with an explicit
check against 0 as in the implementation and other users of that interface.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/amd_iommu.c | 18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1167ff0416cf..c5d6c7c42b0a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -55,8 +55,6 @@
 #include "amd_iommu_types.h"
 #include "irq_remapping.h"
 
-#define AMD_IOMMU_MAPPING_ERROR0
-
 #define CMD_SET_TYPE(cmd, t) ((cmd)->data[1] |= ((t) << 28))
 
 #define LOOP_TIMEOUT   10
@@ -2339,7 +2337,7 @@ static dma_addr_t __map_single(struct device *dev,
paddr &= PAGE_MASK;
 
address = dma_ops_alloc_iova(dev, dma_dom, pages, dma_mask);
-   if (address == AMD_IOMMU_MAPPING_ERROR)
+   if (!address)
goto out;
 
prot = dir2prot(direction);
@@ -2376,7 +2374,7 @@ static dma_addr_t __map_single(struct device *dev,
 
dma_ops_free_iova(dma_dom, address, pages);
 
-   return AMD_IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 }
 
 /*
@@ -2427,7 +2425,7 @@ static dma_addr_t map_page(struct device *dev, struct page *page,
if (PTR_ERR(domain) == -EINVAL)
return (dma_addr_t)paddr;
else if (IS_ERR(domain))
-   return AMD_IOMMU_MAPPING_ERROR;
+   return DMA_MAPPING_ERROR;
 
dma_mask = *dev->dma_mask;
dma_dom = to_dma_ops_domain(domain);
@@ -2504,7 +2502,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
npages = sg_num_pages(dev, sglist, nelems);
 
address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
-   if (address == AMD_IOMMU_MAPPING_ERROR)
+   if (address == DMA_MAPPING_ERROR)
goto out_err;
 
prot = dir2prot(direction);
@@ -2627,7 +2625,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
*dma_addr = __map_single(dev, dma_dom, page_to_phys(page),
 size, DMA_BIDIRECTIONAL, dma_mask);
 
-   if (*dma_addr == AMD_IOMMU_MAPPING_ERROR)
+   if (*dma_addr == DMA_MAPPING_ERROR)
goto out_free;
 
return page_address(page);
@@ -2678,11 +2676,6 @@ static int amd_iommu_dma_supported(struct device *dev, u64 mask)
return check_device(dev);
 }
 
-static int amd_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == AMD_IOMMU_MAPPING_ERROR;
-}
-
 static const struct dma_map_ops amd_iommu_dma_ops = {
.alloc  = alloc_coherent,
.free   = free_coherent,
@@ -2691,7 +2684,6 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
.map_sg = map_sg,
.unmap_sg   = unmap_sg,
.dma_supported  = amd_iommu_dma_supported,
-   .mapping_error  = amd_iommu_mapping_error,
 };
 
 static int init_reserved_iova_ranges(void)
-- 
2.19.1



[Xen-devel] [PATCH 19/23] iommu/vt-d: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/intel-iommu.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 66b398ae..0ad67d65bbce 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3617,7 +3617,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
 
domain = get_valid_domain_for_dev(dev);
if (!domain)
-   return 0;
+   return DMA_MAPPING_ERROR;
 
iommu = domain_get_iommu(domain);
size = aligned_nrpages(paddr, size);
@@ -3655,7 +3655,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
free_iova_fast(&domain->iovad, iova_pfn, dma_to_mm_pfn(size));
pr_err("Device %s request: %zx@%llx dir %d --- failed\n",
dev_name(dev), size, (unsigned long long)paddr, dir);
-   return 0;
+   return DMA_MAPPING_ERROR;
 }
 
 static dma_addr_t intel_map_page(struct device *dev, struct page *page,
@@ -3756,7 +3756,7 @@ static void *intel_alloc_coherent(struct device *dev, size_t size,
 
*dma_handle = __intel_map_page(dev, page, 0, size, DMA_BIDIRECTIONAL,
   dev->coherent_dma_mask);
-   if (*dma_handle)
+   if (*dma_handle != DMA_MAPPING_ERROR)
return page_address(page);
if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
__free_pages(page, order);
@@ -3865,11 +3865,6 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
return nelems;
 }
 
-static int intel_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return !dma_addr;
-}
-
 static const struct dma_map_ops intel_dma_ops = {
.alloc = intel_alloc_coherent,
.free = intel_free_coherent,
@@ -3877,7 +3872,6 @@ static const struct dma_map_ops intel_dma_ops = {
.unmap_sg = intel_unmap_sg,
.map_page = intel_map_page,
.unmap_page = intel_unmap_page,
-   .mapping_error = intel_mapping_error,
.dma_supported = dma_direct_supported,
 };
 
-- 
2.19.1



[Xen-devel] [PATCH 21/23] xen-swiotlb: remove the mapping_error dma_map_ops method

2018-11-30 Thread Christoph Hellwig
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2a7f545bd0b5..6dc969d5ea2f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -53,8 +53,6 @@
  * API.
  */
 
-#define XEN_SWIOTLB_ERROR_CODE (~(dma_addr_t)0x0)
-
 static char *xen_io_tlb_start, *xen_io_tlb_end;
 static unsigned long xen_io_tlb_nslabs;
 /*
@@ -406,7 +404,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
 attrs);
if (map == SWIOTLB_MAP_ERROR)
-   return XEN_SWIOTLB_ERROR_CODE;
+   return DMA_MAPPING_ERROR;
 
dev_addr = xen_phys_to_bus(map);
xen_dma_map_page(dev, pfn_to_page(map >> PAGE_SHIFT),
@@ -421,7 +419,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
attrs |= DMA_ATTR_SKIP_CPU_SYNC;
swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
 
-   return XEN_SWIOTLB_ERROR_CODE;
+   return DMA_MAPPING_ERROR;
 }
 
 /*
@@ -700,11 +698,6 @@ xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
 }
 
-static int xen_swiotlb_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr == XEN_SWIOTLB_ERROR_CODE;
-}
-
 const struct dma_map_ops xen_swiotlb_dma_ops = {
.alloc = xen_swiotlb_alloc_coherent,
.free = xen_swiotlb_free_coherent,
@@ -719,5 +712,4 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.dma_supported = xen_swiotlb_dma_supported,
.mmap = xen_swiotlb_dma_mmap,
.get_sgtable = xen_swiotlb_get_sgtable,
-   .mapping_error  = xen_swiotlb_mapping_error,
 };
-- 
2.19.1



[Xen-devel] [PATCH 23/23] dma-mapping: return an error code from dma_mapping_error

2018-11-30 Thread Christoph Hellwig
Currently dma_mapping_error returns a boolean as int, with 1 meaning
error.  This is rather unusual, and many callers have to convert it to
an errno value.  The callers are highly inconsistent, with error codes
ranging from -ENOMEM through -EIO, -EINVAL and -EFAULT to -EAGAIN.
Return -ENOMEM, which seems to be what the largest number of callers
convert it to, and which also matches the typical error case where
we are out of resources.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-mapping.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 2adef56c6069..b5bd25fc1f81 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -576,7 +576,7 @@ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
debug_dma_mapping_error(dev, dma_addr);
if (dma_addr == DMA_MAPPING_ERROR)
-   return 1;
+   return -ENOMEM;
return 0;
 }
 
-- 
2.19.1



[Xen-devel] [PATCH 18/23] iommu/intel: small map_page cleanup

2018-11-30 Thread Christoph Hellwig
Pass the page + offset to the low-level __iommu_map_single helper
(which gets renamed to fit the new calling conventions) as both
callers have the page at hand.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/intel-iommu.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 41a4b8808802..66b398ae 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3597,9 +3597,11 @@ static int iommu_no_mapping(struct device *dev)
return 0;
 }
 
-static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
-size_t size, int dir, u64 dma_mask)
+static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
+  unsigned long offset, size_t size, int dir,
+  u64 dma_mask)
 {
+   phys_addr_t paddr = page_to_phys(page) + offset;
struct dmar_domain *domain;
phys_addr_t start_paddr;
unsigned long iova_pfn;
@@ -3661,8 +3663,7 @@ static dma_addr_t intel_map_page(struct device *dev, struct page *page,
 enum dma_data_direction dir,
 unsigned long attrs)
 {
-   return __intel_map_single(dev, page_to_phys(page) + offset, size,
- dir, *dev->dma_mask);
+   return __intel_map_page(dev, page, offset, size, dir, *dev->dma_mask);
 }
 
 static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
@@ -3753,9 +3754,8 @@ static void *intel_alloc_coherent(struct device *dev, size_t size,
return NULL;
memset(page_address(page), 0, size);
 
-   *dma_handle = __intel_map_single(dev, page_to_phys(page), size,
-DMA_BIDIRECTIONAL,
-dev->coherent_dma_mask);
+   *dma_handle = __intel_map_page(dev, page, 0, size, DMA_BIDIRECTIONAL,
+  dev->coherent_dma_mask);
if (*dma_handle)
return page_address(page);
if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
-- 
2.19.1



Re: [Xen-devel] [PATCH 0/2] tools/libxc: Fixes to cpuid logic

2018-11-30 Thread Wei Liu
On Thu, Nov 29, 2018 at 07:20:00PM +, Andrew Cooper wrote:
> Andrew Cooper (2):
>   tools/libxc: Fix issues with libxc and Xen having different featureset 
> lengths
>   tools/libxc: Fix error handling in get_cpuid_domain_info()

With Jan's comments addressed:

Acked-by: Wei Liu 


Re: [Xen-devel] Linux 4.19.5 fails to boot as Xen dom0

2018-11-30 Thread Kirill A. Shutemov
On Fri, Nov 30, 2018 at 01:11:56PM +, Hans van Kranenburg wrote:
> On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
> > On Thu, Nov 29, 2018 at 03:00:45PM +, Juergen Gross wrote:
> >> On 29/11/2018 15:32, Kirill A. Shutemov wrote:
> >>> On Thu, Nov 29, 2018 at 02:24:47PM +, Kirill A. Shutemov wrote:
>  On Thu, Nov 29, 2018 at 01:35:17PM +, Juergen Gross wrote:
> > On 29/11/2018 14:26, Kirill A. Shutemov wrote:
> >> On Thu, Nov 29, 2018 at 09:41:25AM +, Juergen Gross wrote:
> >>> On 29/11/2018 02:22, Hans van Kranenburg wrote:
>  Hi,
> 
>  As also seen at:
>  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951
> 
>  Attached there are two serial console output logs. One is starting 
>  with
>  Xen 4.11 (from debian unstable) as dom0, and the other one without 
>  Xen.
> 
>  [2.085543] BUG: unable to handle kernel paging request at
>  888d9fffc000
>  [2.085610] PGD 200c067 P4D 200c067 PUD 0
>  [2.085674] Oops:  [#1] SMP NOPTI
>  [2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
>  4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
>  [2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 
>  05/21/2018
>  [2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
>  [...]
> >>>
> >>> The offending stable commit is 
> >>> 4074ca7d8a1832921c865d250bbd08f3441b3657
> >>> ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), this
> >>> is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
> >>>
> >>> Current upstream kernel is booting fine under Xen, so in general the
> >>> patch should be fine. Using an upstream kernel built from above commit
> >>> (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is fine,
> >>> too.
> >>>
> >>> Kirill, are you aware of any prerequisite patch from 4.20 which could 
> >>> be
> >>> missing in 4.19.5?
> >>
> >> I'm not.
> >>
> >> Let me look into this.
> >>
> >
> > What is making me suspicious is the failure happening just after
> > releasing the init memory. Maybe there is an access to .init.data
> > segment or similar? The native kernel booting could be related to the
> > usage of 2M mappings not being available in a PV-domain.
> 
>  Ahh.. Could you test this:
> 
>  diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
>  index a12afff146d1..7dec63ec7aab 100644
>  --- a/arch/x86/mm/dump_pagetables.c
>  +++ b/arch/x86/mm/dump_pagetables.c
>  @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
>    * 8000 - 87ff is reserved for
>    * the hypervisor.
>    */
>  -return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
>  +return  (idx >= pgd_index(__PAGE_OFFSET) - 17) &&
>   (idx <  pgd_index(__PAGE_OFFSET));
>   #else
>   return false;
> >>>
> >>> Or, better, this:
> >>
> >> That makes it boot again!
> >>
> >> Any idea why upstream doesn't need it?
> > 
> > Nope.
> > 
> > I'll prepare a proper fix.
> > 
> 
> Thanks for looking into this.
> 
> In the meantime, I applied the "Or, better, this" change, and my dom0
> boots again.
> 
> FYI, boot log now: (paste 90d valid)
> https://paste.debian.net/plainh/48940826

I forgot to CC you:

https://lkml.kernel.org/r/20181130121131.g3xvlvixv7mvl...@black.fi.intel.com

Please give it a try.

-- 
 Kirill A. Shutemov


Re: [Xen-devel] Linux 4.19.5 fails to boot as Xen dom0

2018-11-30 Thread Hans van Kranenburg
On 11/30/18 2:26 PM, Kirill A. Shutemov wrote:
> On Fri, Nov 30, 2018 at 01:11:56PM +, Hans van Kranenburg wrote:
>> On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
>>> On Thu, Nov 29, 2018 at 03:00:45PM +, Juergen Gross wrote:
 On 29/11/2018 15:32, Kirill A. Shutemov wrote:
> On Thu, Nov 29, 2018 at 02:24:47PM +, Kirill A. Shutemov wrote:
>> On Thu, Nov 29, 2018 at 01:35:17PM +, Juergen Gross wrote:
>>> On 29/11/2018 14:26, Kirill A. Shutemov wrote:
 On Thu, Nov 29, 2018 at 09:41:25AM +, Juergen Gross wrote:
> On 29/11/2018 02:22, Hans van Kranenburg wrote:
>> Hi,
>>
>> As also seen at:
>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951
>>
>> Attached there are two serial console output logs. One is starting 
>> with
>> Xen 4.11 (from debian unstable) as dom0, and the other one without 
>> Xen.
>>
>> [2.085543] BUG: unable to handle kernel paging request at
>> 888d9fffc000
>> [2.085610] PGD 200c067 P4D 200c067 PUD 0
>> [2.085674] Oops:  [#1] SMP NOPTI
>> [2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
>> 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
>> [2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 
>> 05/21/2018
>> [2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
>> [...]
>
> The offending stable commit is 
> 4074ca7d8a1832921c865d250bbd08f3441b3657
> ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), this
> is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
>
> Current upstream kernel is booting fine under Xen, so in general the
> patch should be fine. Using an upstream kernel built from above commit
> (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is fine,
> too.
>
> Kirill, are you aware of any prerequisite patch from 4.20 which could 
> be
> missing in 4.19.5?

 I'm not.

 Let me look into this.

>>>
>>> What is making me suspicious is the failure happening just after
>>> releasing the init memory. Maybe there is an access to .init.data
>>> segment or similar? The native kernel booting could be related to the
>>> usage of 2M mappings not being available in a PV-domain.
>>
>> Ahh.. Could you test this:
>>
>> diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
>> index a12afff146d1..7dec63ec7aab 100644
>> --- a/arch/x86/mm/dump_pagetables.c
>> +++ b/arch/x86/mm/dump_pagetables.c
>> @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
>>   * 8000 - 87ff is reserved for
>>   * the hypervisor.
>>   */
>> -return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
>> +return  (idx >= pgd_index(__PAGE_OFFSET) - 17) &&
>>  (idx <  pgd_index(__PAGE_OFFSET));
>>  #else
>>  return false;
>
> Or, better, this:

 That makes it boot again!

 Any idea why upstream doesn't need it?
>>>
>>> Nope.
>>>
>>> I'll prepare a proper fix.
>>>
>>
>> Thanks for looking into this.
>>
>> In the meantime, I applied the "Or, better, this" change, and my dom0
>> boots again.
>>
>> FYI, boot log now: (paste 90d valid)
>> https://paste.debian.net/plainh/48940826
> 
> I forgot to CC you:
> 
> https://lkml.kernel.org/r/20181130121131.g3xvlvixv7mvl...@black.fi.intel.com
> 
> Please give it a try.

Ah, right, thanks. The xen-devel list is also not in Cc.

I'll slam it on top of my 4.19.5 debian package build and test.

Hans



Re: [Xen-devel] [PATCH v3] pvcalls-front: Avoid __get_free_pages(GFP_KERNEL) under spinlock

2018-11-30 Thread Juergen Gross
On 30/11/2018 12:01, Wen Yang wrote:
> The problem is that we call this with a spin lock held.
> The call tree is:
> pvcalls_front_accept() holds bedata->socket_lock.
> -> create_active()
> -> __get_free_pages() uses GFP_KERNEL
> 
> The create_active() function is only called from pvcalls_front_accept()
> with a spin_lock held. The allocation is not allowed to sleep, and
> GFP_KERNEL is not sufficient.
> 
> This issue was detected by using the Coccinelle software.
> 
> v2: Add a function doing the allocations which is called
> outside the lock and passing the allocated data to
> create_active().
> v3: Use the matching deallocators i.e., free_page()
> and free_pages(), respectively.
> 
> Suggested-by: Juergen Gross 
> Signed-off-by: Wen Yang 
> CC: Julia Lawall 
> CC: Boris Ostrovsky 
> CC: Juergen Gross 
> CC: Stefano Stabellini 
> CC: xen-devel@lists.xenproject.org
> CC: linux-ker...@vger.kernel.org
> ---

This patch is malformed. Please send it via an appropriate tool/mailer.

See e.g. Documentation/process/submitting-patches.rst or
Documentation/process/email-clients.rst in the Linux kernel source tree.


Juergen


[Xen-devel] [freebsd-master test] 130849: all pass - PUSHED

2018-11-30 Thread osstest service owner
flight 130849 freebsd-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130849/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 freebsd  b1f31943cb61465b80f786de48501e2fb03e1b61
baseline version:
 freebsd  8894b8e317520c031636fdfab98fc0557a0e1f4c

Last test of basis   130810  2018-11-26 09:20:17 Z4 days
Testing same since   130849  2018-11-28 12:49:34 Z2 days1 attempts


People who touched revisions under test:
  0mp <0...@freebsd.org>
  ae 
  arybchik 
  asomers 
  bwidawsk 
  dab 
  des 
  emaste 
  eugen 
  gallatin 
  gjb 
  ian 
  imp 
  jamie 
  jhb 
  jhibbits 
  jilles 
  kib 
  manu 
  marius 
  markj 
  markm 
  mckusick 
  mm 
  vangyzen 
  ygy 
  yuripv 

jobs:
 build-amd64-freebsd-againpass
 build-amd64-freebsd  pass
 build-amd64-xen-freebsd  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/freebsd.git
   8894b8e3175..b1f31943cb6  b1f31943cb61465b80f786de48501e2fb03e1b61 -> tested/master


[Xen-devel] [linux-3.18 test] 130843: regressions - FAIL

2018-11-30 Thread osstest service owner
flight 130843 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130843/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-examine   8 reboot   fail REGR. vs. 128858
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-xl-multivcpu  7 xen-bootfail REGR. vs. 128858
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-xl7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-freebsd10-i386  7 xen-boot   fail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 128858
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-xl-qemuu-ovmf-amd64  7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 128858
 test-amd64-amd64-rumprun-amd64  7 xen-boot   fail REGR. vs. 128858
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host   fail REGR. vs. 128858
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host   fail REGR. vs. 128858
 test-amd64-amd64-xl-xsm   7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 128858
 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 128858
 test-amd64-amd64-xl-pvshim7 xen-boot fail REGR. vs. 128858
 test-armhf-armhf-xl-cubietruck 16 guest-start/debian.repeat fail REGR. vs. 128858
 test-armhf-armhf-libvirt16 guest-start/debian.repeat fail REGR. vs. 128858
 test-armhf-armhf-xl-multivcpu 16 guest-start/debian.repeat fail REGR. vs. 128858

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-credit2  16 guest-start/debian.repeatfail  like 128691
 test-amd64-i386-xl-qemut-debianhvm-amd64 16 guest-localmigrate/x10 fail like 128807
 test-armhf-armhf-xl-vhd  10 debian-di-installfail  like 128841
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 128858
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 128858
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 128858
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 128858
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 128858
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 128858
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 gue

[Xen-devel] [xen-unstable-smoke test] 130873: tolerable all pass - PUSHED

2018-11-30 Thread osstest service owner
flight 130873 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130873/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  2634b997afabfdc5a972e07e536dfbc6febb4385
baseline version:
 xen  402411ec40e451d1ad051fb7e59aa6374cc4212a

Last test of basis   130852  2018-11-28 17:00:31 Z1 days
Testing same since   130873  2018-11-30 12:00:40 Z0 days1 attempts


People who touched revisions under test:
  Brian Woods 
  Jan Beulich 
  Paul Durrant 
  Roger Pau Monné 
  Tim Deegan 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   402411ec40..2634b997af  2634b997afabfdc5a972e07e536dfbc6febb4385 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] iommu: elide flushing for higher order map/unmap operations

2018-11-30 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper
> Sent: 30 November 2018 12:49
> To: Paul Durrant ; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini ; Julien Grall
> ; Suravee Suthikulpanit
> ; Brian Woods ; Jan
> Beulich ; Kevin Tian ; Wei Liu
> ; Roger Pau Monne 
> Subject: Re: [PATCH 2/2] iommu: elide flushing for higher order map/unmap
> operations
> 
> On 30/11/2018 10:45, Paul Durrant wrote:
> > This patch removes any implicit flushing that occurs in the
> implementation
> > of map and unmap operations and, instead, adds explicit flushing at the
> > end of the loops in the iommu_map/unmap() wrapper functions.
> >
> > Because VT-d currently performs two different types of flush dependent
> upon
> > whether a PTE is being modified versus merely added (i.e. replacing a
> non-
> > present PTE) a 'iommu_flush_type' enumeration is defined by this patch
> and
> > the iommu_ops map method is modified to pass back the type of flush
> > necessary for the PTE that has been populated. When a higher order
> mapping
> > operation is done, the wrapper code performs the 'highest' level of
> flush
> > required by the individual iommu_ops method calls, where a 'modified
> PTE'
> > flush is deemed to be higher than a 'added PTE' flush. The ARM SMMU
> > implementation currently performs no implicit flushing and therefore
> > the modified map method always passes back a flush type of 'none'.
> >
> > NOTE: The per-cpu 'iommu_dont_flush_iotlb' is respected by the
> iommu_map()
> >   wrapper function and therefore this now applies to all IOMMU
> >   implementations rather than just VT-d. Use of the flag has been
> added
> >   to arch_iommu_hwdom_init() so that flushing is now fully elided
> for
> >   the hardware domain's initial table population.
> 
> iommu_dont_flush_iotlb is a misfeature.  While it still exists, the
> flushing API is fundamentally broken.
> 
> Do you have a plan to remove it?  I ask, because the only feasible
> option I see is for iommu_{map,unmap}() to pass the flush accumulation
> out to the caller, and have the caller use the appropriate flush
> interfaces.
> 
> [Edit - lunch happened around about this point, and there was a long
> discussion]
> 
> One idea would be to start with a prep patch renaming iommu_{,un}map() to
> _legacy(), noting beside them that they have implicit flushing
> characteristics.  Then, the nonflushing versions of iommu_{,un}map() can
> be introduced, which return the accumulated flush flag, and the callers
> can DTRT.

Ok. I'll plan to do a new patch #2 that does the rename, and then move this
code into a patch #3.

> 
> This way, we can avoid introducing a further user of
> iommu_dont_flush_iotlb in arch_iommu_hwdom_init(), and clean up the
> remaining legacy callsites at a later point when more infrastructure is
> in place.
> 
> In particular, the P2M code cannot be fixed to behave in this way at the
> moment because the point at which the IOMMU changes are hooked in lacks
> an order parameter.

The P2M code is also a mess and calls directly into VT-d and AMD IOMMU 
functions without going through the wrapper code in some cases. There is much 
cleanup to do... these patches are just a start.

  Paul

> 
> > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> > index da8294bac8..289e0e2772 100644
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -155,6 +155,13 @@ struct page_info;
> >   */
> >  typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id, void
> *ctxt);
> >
> 
> This wants at least a comment stating that some IOMMUs require that we
> issue a flush when modifying a not-present/otherwise invalid entry.
> 
> > +enum iommu_flush_type
> > +{
> > +IOMMU_FLUSH_none, /* no flush required */
> > +IOMMU_FLUSH_added, /* no modified entries, just additional entries
> */
> 
> IOMMU_FLUSH_invalid ?  I think it is more descriptive of the scenario in
> which it is used.
> 
> > +IOMMU_FLUSH_modified, /* modified entries */
> > +};
> > +
> >  struct iommu_ops {
> >  int (*init)(struct domain *d);
> >  void (*hwdom_init)(struct domain *d);
> > @@ -177,7 +184,8 @@ struct iommu_ops {
> >   * other by the caller in order to have meaningful results.
> >   */
> >  int __must_check (*map_page)(struct domain *d, dfn_t dfn, mfn_t
> mfn,
> > - unsigned int flags);
> > + unsigned int flags,
> > + enum iommu_flush_type *flush_type);
> 
> Maintaining the flush type by pointer is quite awkward.
> 
> How about folding a positive flush type in with negative errors?  i.e.
> map_page() becomes < 0 on error, 0 for success/no flush and >0 for
> success/w flush.
> 
> I think the result would be rather cleaner to read.
> 
> ~Andrew

Re: [Xen-devel] Linux 4.19.5 fails to boot as Xen dom0

2018-11-30 Thread Hans van Kranenburg
On 11/30/18 2:26 PM, Kirill A. Shutemov wrote:
> On Fri, Nov 30, 2018 at 01:11:56PM +, Hans van Kranenburg wrote:
>> On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
>>> On Thu, Nov 29, 2018 at 03:00:45PM +, Juergen Gross wrote:
 On 29/11/2018 15:32, Kirill A. Shutemov wrote:
> On Thu, Nov 29, 2018 at 02:24:47PM +, Kirill A. Shutemov wrote:
>> On Thu, Nov 29, 2018 at 01:35:17PM +, Juergen Gross wrote:
>>> On 29/11/2018 14:26, Kirill A. Shutemov wrote:
 On Thu, Nov 29, 2018 at 09:41:25AM +, Juergen Gross wrote:
> On 29/11/2018 02:22, Hans van Kranenburg wrote:
>> Hi,
>>
>> As also seen at:
>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951
>>
>> Attached there are two serial console output logs. One is starting 
>> with
>> Xen 4.11 (from debian unstable) as dom0, and the other one without 
>> Xen.
>>
>> [2.085543] BUG: unable to handle kernel paging request at
>> 888d9fffc000
>> [2.085610] PGD 200c067 P4D 200c067 PUD 0
>> [2.085674] Oops:  [#1] SMP NOPTI
>> [2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
>> 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
>> [2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 
>> 05/21/2018
>> [2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
>> [...]
>
> The offending stable commit is 
> 4074ca7d8a1832921c865d250bbd08f3441b3657
> ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), this
> is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
>
> Current upstream kernel is booting fine under Xen, so in general the
> patch should be fine. Using an upstream kernel built from above commit
> (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is fine,
> too.
>
> Kirill, are you aware of any prerequisite patch from 4.20 which could 
> be
> missing in 4.19.5?

 I'm not.

 Let me look into this.

>>>
>>> What is making me suspicious is the failure happening just after
>>> releasing the init memory. Maybe there is an access to .init.data
>>> segment or similar? The native kernel booting could be related to the
>>> usage of 2M mappings not being available in a PV-domain.
>>
>> Ahh.. Could you test this:
>>
>> diff --git a/arch/x86/mm/dump_pagetables.c 
>> b/arch/x86/mm/dump_pagetables.c
>> index a12afff146d1..7dec63ec7aab 100644
>> --- a/arch/x86/mm/dump_pagetables.c
>> +++ b/arch/x86/mm/dump_pagetables.c
>> @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
>>   * 8000 - 87ff is reserved for
>>   * the hypervisor.
>>   */
>> -return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
>> +return  (idx >= pgd_index(__PAGE_OFFSET) - 17) &&
>>  (idx <  pgd_index(__PAGE_OFFSET));
>>  #else
>>  return false;
>
> Or, better, this:

 That makes it boot again!

 Any idea why upstream doesn't need it?
>>>
>>> Nope.
>>>
>>> I'll prepare a proper fix.
>>>
>>
>> Thanks for looking into this.
>>
>> In the meantime, I applied the "Or, better, this" change, and my dom0
>> boots again.
>>
>> FYI, boot log now: (paste 90d valid)
>> https://paste.debian.net/plainh/48940826
> 
> I forgot to CC you:
> 
> https://lkml.kernel.org/r/20181130121131.g3xvlvixv7mvl...@black.fi.intel.com
> 
> Please give it a try.

I'm not in that thread, so my response here...

You paste a v2-like patch into 'Re: [PATCH 1/2]'. Juergen says:
s/LDT_PGD_ENTRY/GUARD_HOLE_PGD_ENTRY/, then you say Ughh.., change it to
GUARD_HOLE_ENTRY, which does not exist, and then get a Reviewed-by from
Juergen.

I guess it has to be GUARD_HOLE_PGD_ENTRY after all...

arch/x86/include/asm/pgtable_64_types.h:116:31: error:
'GUARD_HOLE_ENTRY' undeclared (first use in this function); did you mean
'GUARD_HOLE_PGD_ENTRY'?

I'll test that instead.

Hans


Re: [Xen-devel] [Xen EFI] Impossible to limit the dom0 memory

2018-11-30 Thread msd+xen-de...@msd.im

Hi Jan, hi Juergen,

I'm trying again this week to install Xen on an OVH server 
(https://www.ovh.com/fr/serveurs_dedies/infra/1801eg02.xml).


It is still impossible to boot Xen with the option "dom0_mem=1G,max:1G" 
(boot : EFI->xen).


I have tried with Debian 9 stable/stretch :
- grub2 (2.02~beta3-5+deb9u1)  : https://packages.debian.org/stretch/grub2
- xen-system-amd64 (4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10) : 
https://packages.debian.org/stretch/xen-system-amd64
- linux-image-amd64 (4.9+80+deb9u6)  (4.9+80+deb9u6) 
https://packages.debian.org/stretch/linux-image-amd64


I have tried with Debian 10 testing/buster :
- grub2 (2.02+dfsg1-8) : https://packages.debian.org/buster/grub2
- xen-system-amd64 (4.11.1~pre.20180911.5acdd26fdc+dfsg-5) : 
https://packages.debian.org/buster/xen-system-amd64
- linux-image-amd64 (4.18+99) : 
https://packages.debian.org/buster/linux-image-amd64


# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 158
model name  : Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz
stepping: 9
microcode   : 0x8e

Microcode version seems to be up to date:
https://www.intel.com/content/dam/www/public/us/en/documents/sa00115-microcode-update-guidance.pdf

Here is my WORKING xen.cfg file :
```
[global]
default=xen

[xen]
options=smt=on
kernel=$VMLINUZ_NAME root=/dev/md3 ro rootdelay=10 noquiet nosplash
ramdisk=$INITRD_NAME
```

Here is my FAILING xen.cfg file (with dom0_mem=1G,max:1G) :
```
[global]
default=xen

[xen]
options=smt=on dom0_mem=1G,max:1G
kernel=vmlinuz root=/dev/md3 ro rootdelay=10 noquiet nosplash
ramdisk=initrd.img
```

For information, if helpful :
- Booting Linux with EFI->grub->Linux works fine.
- Booting Xen with EFI->grub->Xen fails too (In January I only had 1/8 
core available, now I cannot boot but I will discuss this problem in 
another thread).



Do you have more information on the dom0_mem argument problem?

Best regards,


Guillaume

On 25/01/2018 at 17:07, msd+xen-de...@msd.im wrote:

I have installed `linux-image-amd64-dbg` and `binutils`.

I can now execute `addr2line -pfi -e vmlinux-4.14.0-0.bpo.3-amd64 `.

I have generated a file "commands.txt" with all the addresses after 
"Guest stack trace from rsp=82003cb0:" in my log file 
"dom0_crash_with_dom0_memory.txt".


I attached the result : "result.txt".

We can see inside this file "xen/mmu_pv.c:1548" and 
"drivers/firmware/efi/efi.c:558", so I hope it will be helpful.


Is that ok for you ?
Can I do something more ?

Regards,


Guillaume



[Xen-devel] [PATCH OSSTEST] ts-xen-build: Enable ITS driver in Xen

2018-11-30 Thread Julien Grall
The ITS driver was added in Xen 4.10 as a technical preview feature.
However, it is required in order to boot Xen on Thunder-X because
PCI devices don't support legacy interrupts.

So enable CONFIG_ITS in our Xen build.

Signed-off-by: Julien Grall 
---
 ts-xen-build | 4 
 1 file changed, 4 insertions(+)

diff --git a/ts-xen-build b/ts-xen-build
index c5d2a1d..dedf474 100755
--- a/ts-xen-build
+++ b/ts-xen-build
@@ -126,6 +126,10 @@ END
echo >>xen/.config CONFIG_EXPERT=y
echo >>xen/.config CONFIG_HVM_FEP=y
echo >>xen/.config CONFIG_VERBOSE_DEBUG=y
+   # ITS driver is required to boot the Hardware Domain
+   # on Xen. For now (Xen 4.10/4.11 at least), it will
+   # not be built by default and is gated by expert mode
+   echo >>xen/.config CONFIG_HAS_ITS=y
fi
 END
);
-- 
2.11.0



[Xen-devel] [qemu-mainline test] 130844: regressions - FAIL

2018-11-30 Thread osstest service owner
flight 130844 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130844/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 129996
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 129996

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 129996
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 129996
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 129996
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 129996
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 129996
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 qemuu039d4e3df0049bdd8f93a2de735a816483b13954
baseline version:
 qemuucb968d275c145467c8b385a3618a207ec111eab1

Last test of basis   129996  2018-11-13 22:49:16 Z   16 days
Failing since130168  2018-11-16 04:27:30 Z   14 days9 attempts
Testing same since   130844  2018-11-28 04:15:43 Z2 days1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Aleksandar Markovic 
  Alex Bennée 
  Alistair Francis 
  baldu...@units.it
  Bandan Das 
  Bastian Koppelmann 
  Cornelia Huck 
  Daniel P. Berrangé 
  David Hildenbrand 
  Dr. David Alan Gilbert 
  Edgar E. Iglesias 
  Eduardo Habkost 
  Eric Auger 
  Eric Blake 
  Erik Skultety 
  Fredrik Noring 
  Gerd Hoffmann 
  Greg Kurz 
  Guenter Roeck 
  Hervé Poussineau 
  Igor Druzhinin 
  Jason Wang 
  John Snow 
  Keith Busch 
  Kevin Wolf 
  Laurent Vivier 
  Li Qiang 
  linzhecheng 
  Logan Gunthorpe 
  Luc Michel 
  Mao Zhongyi 
  Marc-André Lureau 
  Mark Cave-Ayland 
  Markus Armbruster 
  Max Filippov 
  Max Reitz 
  Michael Roth 
  Palmer Dabbelt 
  Paolo Bonzini 
  Peter Maydell 
  Philippe Mathieu-Daudé 
  Philippe Mathieu-Daudé 
  Prasad J Pandit 
  Richard Henderson 
  Richard W.M. Jones 
  Roman Kagan 
  Seth Kintigh 
  Stefan Berger 
  Stefan Berger 
  Stefan Markovic 
  Thomas Huth 
  Wang Xin 
  Zhang Chen 
  Zhang Chen 
  ZhiPeng Lu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf 

Re: [Xen-devel] [PATCH 7/9] libxl: Make killing of device model asynchronous

2018-11-30 Thread George Dunlap


> On Nov 28, 2018, at 4:43 PM, Ian Jackson  wrote:
> 
> George Dunlap writes ("[PATCH 7/9] libxl: Make killing of device model 
> asynchronous"):
>> Or at least, give it an asynchronous interface so that we can make it
>> actually asynchronous in subsequent patches.
>> 
>> Create state structures and callback function signatures.  Add the
>> state structure to libxl__destroy_domid_state.  Break
>> libxl__destroy_domid down into two functions.
> ...
>> +/* Used to detroy the device model */
>> +_hidden void libxl__destroy_device_model(libxl__egc *egc,
>> + libxl__destroy_devicemodel_state 
>> *ddms);
> 
> I think that comment is rather pointless (but I won't object if you
> really want to keep it).  

Yes; that comment looks rather vestigial.

> Conversely it would be nice to say somewhere
> that ddms->callback may be called reentrantly.

What do you mean by reentrantly?  That libxl__destroy_device_model() may end up 
calling it directly (on this execution stack), rather than arranging for it to 
be called by someone else after returning?

 -George



Re: [Xen-devel] [PATCH 7/9] libxl: Make killing of device model asynchronous

2018-11-30 Thread George Dunlap


> On Nov 30, 2018, at 4:12 PM, George Dunlap  wrote:
> 
> 
> 
>> On Nov 28, 2018, at 4:43 PM, Ian Jackson  wrote:
>> 
>> George Dunlap writes ("[PATCH 7/9] libxl: Make killing of device model 
>> asynchronous"):
>>> Or at least, give it an asynchronous interface so that we can make it
>>> actually asynchronous in subsequent patches.
>>> 
>>> Create state structures and callback function signatures.  Add the
>>> state structure to libxl__destroy_domid_state.  Break
>>> libxl__destroy_domid down into two functions.
>> ...
>>> +/* Used to detroy the device model */
>>> +_hidden void libxl__destroy_device_model(libxl__egc *egc,
>>> + libxl__destroy_devicemodel_state 
>>> *ddms);
>> 
>> I think that comment is rather pointless (but I won't object if you
>> really want to keep it).  
> 
> Yes; that comment looks rather vestigial.

Oh, I see; I was following suit with the code around it:

/* Used to destroy a domain with the passed id (it doesn't check for stubs) */
_hidden void libxl__destroy_domid(libxl__egc *egc,
  libxl__destroy_domid_state *dis);

/* Used to detroy the device model */
_hidden void libxl__destroy_device_model(libxl__egc *egc,
 libxl__destroy_devicemodel_state 
*ddms);

/* Entry point for devices destruction */
_hidden void libxl__devices_destroy(libxl__egc *egc,
libxl__devices_remove_state *drs);

It looks cleaner to me to have *something* there than not, just to visually 
make it clear that it has nothing to do with the previous function.

 -George

Re: [Xen-devel] [PATCH 7/9] libxl: Make killing of device model asynchronous

2018-11-30 Thread Ian Jackson
George Dunlap writes ("Re: [PATCH 7/9] libxl: Make killing of device model 
asynchronous"):
> On Nov 28, 2018, at 4:43 PM, Ian Jackson  wrote:
> > Conversely it would be nice to say somewhere
> > that ddms->callback may be called reentrantly.
> 
> What do you mean by reentrantly?  That libxl__destroy_device_model() may end 
> up calling it directly (on this execution stack), rather than arranging for 
> it to be called by someone else after returning?

Precisely.

This kind of callback can, in some situations, be a hazard for the
caller, because it might result in data structures that the caller is
about to rely on vanishing.

It is conventional in libxl to return after setting up callbacks so
this generally doesn't arise, but adding a note to functions with this
property is a good idea.

Ian.


Re: [Xen-devel] Linux 4.19.5 fails to boot as Xen dom0

2018-11-30 Thread Kirill A. Shutemov
On Fri, Nov 30, 2018 at 02:53:50PM +, Hans van Kranenburg wrote:
> On 11/30/18 2:26 PM, Kirill A. Shutemov wrote:
> > On Fri, Nov 30, 2018 at 01:11:56PM +, Hans van Kranenburg wrote:
> >> On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
> >>> On Thu, Nov 29, 2018 at 03:00:45PM +, Juergen Gross wrote:
>  On 29/11/2018 15:32, Kirill A. Shutemov wrote:
> > On Thu, Nov 29, 2018 at 02:24:47PM +, Kirill A. Shutemov wrote:
> >> On Thu, Nov 29, 2018 at 01:35:17PM +, Juergen Gross wrote:
> >>> On 29/11/2018 14:26, Kirill A. Shutemov wrote:
>  On Thu, Nov 29, 2018 at 09:41:25AM +, Juergen Gross wrote:
> > On 29/11/2018 02:22, Hans van Kranenburg wrote:
> >> Hi,
> >>
> >> As also seen at:
> >> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951
> >>
> >> Attached there are two serial console output logs. One is starting 
> >> with
> >> Xen 4.11 (from debian unstable) as dom0, and the other one without 
> >> Xen.
> >>
> >> [2.085543] BUG: unable to handle kernel paging request at
> >> 888d9fffc000
> >> [2.085610] PGD 200c067 P4D 200c067 PUD 0
> >> [2.085674] Oops:  [#1] SMP NOPTI
> >> [2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
> >> 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
> >> [2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 
> >> 05/21/2018
> >> [2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
> >> [...]
> >
> > The offending stable commit is 
> > 4074ca7d8a1832921c865d250bbd08f3441b3657
> > ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), 
> > this
> > is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
> >
> > Current upstream kernel is booting fine under Xen, so in general the
> > patch should be fine. Using an upstream kernel built from above 
> > commit
> > (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is 
> > fine,
> > too.
> >
> > Kirill, are you aware of any prerequisite patch from 4.20 which 
> > could be
> > missing in 4.19.5?
> 
>  I'm not.
> 
>  Let me look into this.
> 
> >>>
> >>> What is making me suspicious is the failure happening just after
> >>> releasing the init memory. Maybe there is an access to .init.data
> >>> segment or similar? The native kernel booting could be related to the
> >>> usage of 2M mappings not being available in a PV-domain.
> >>
> >> Ahh.. Could you test this:
> >>
> >> diff --git a/arch/x86/mm/dump_pagetables.c 
> >> b/arch/x86/mm/dump_pagetables.c
> >> index a12afff146d1..7dec63ec7aab 100644
> >> --- a/arch/x86/mm/dump_pagetables.c
> >> +++ b/arch/x86/mm/dump_pagetables.c
> >> @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
> >> * 8000 - 87ff is reserved for
> >> * the hypervisor.
> >> */
> >> -  return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
> >> +  return  (idx >= pgd_index(__PAGE_OFFSET) - 17) &&
> >>(idx <  pgd_index(__PAGE_OFFSET));
> >>  #else
> >>return false;
> >
> > Or, better, this:
> 
>  That makes it boot again!
> 
>  Any idea why upstream doesn't need it?
> >>>
> >>> Nope.
> >>>
> >>> I'll prepare a proper fix.
> >>>
> >>
> >> Thanks for looking into this.
> >>
> >> In the meantime, I applied the "Or, better, this" change, and my dom0
> >> boots again.
> >>
> >> FYI, boot log now: (paste 90d valid)
> >> https://paste.debian.net/plainh/48940826
> > 
> > I forgot to CC you:
> > 
> > https://lkml.kernel.org/r/20181130121131.g3xvlvixv7mvl...@black.fi.intel.com
> > 
> > Please give it a try.
> 
> I'm not in that thread, so my response here...
> 
> You paste a v2-like patch into 'Re: [PATCH 1/2]'. Juergen says:
> s/LDT_PGD_ENTRY/GUARD_HOLE_PGD_ENTRY/, then you say Ughh.., change it to
> GUARD_HOLE_ENTRY, which does not exist, and then get a Reviewed-by from
> Juergen.
> 
> I guess it has to be GUARD_HOLE_PGD_ENTRY after all...
> 
> arch/x86/include/asm/pgtable_64_types.h:116:31: error:
> 'GUARD_HOLE_ENTRY' undeclared (first use in this function); did you mean
> 'GUARD_HOLE_PGD_ENTRY'?
> 
> I'll test that instead.

Yes, thank you. It was a long week... :/

Let me know if it works. I'll repost the fixed version with your
Tested-by.

-- 
 Kirill A. Shutemov


Re: [Xen-devel] [PATCH 7/9] libxl: Make killing of device model asynchronous

2018-11-30 Thread Ian Jackson
George Dunlap writes ("Re: [PATCH 7/9] libxl: Make killing of device model 
asynchronous"):
> It looks cleaner to me to have *something* there than not, just to visually 
> make it clear that it has nothing to do with the previous function.

That's OK by me.

Ian.


Re: [Xen-devel] [PATCH v2 2/2] ns16550: enable use of PCI MSI

2018-11-30 Thread Roger Pau Monné
On Mon, Oct 01, 2018 at 10:26:05AM -0600, Jan Beulich wrote:
> Which, on x86, requires fiddling with the INTx bit in PCI config space,
> since for internally used MSI we can't delegate this to Dom0.
> 
> ns16550_init_postirq() also needs (benign) re-ordering of its
> operations.
> 
> Signed-off-by: Jan Beulich 
> ---
> v2: Re-base.
> 
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -307,7 +307,7 @@ Flag to indicate whether to probe for a
>  ACPI indicating none to be there.
>  
>  ### com1,com2
> -> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>][,[<port-bdf>][,[<bridge-bdf>]]]]]]`
> +> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>|msi][,[<port-bdf>][,[<bridge-bdf>]]]]]]`
>  
>  Both option `com1` and `com2` follow the same format.
>  
> @@ -328,7 +328,7 @@ Both option `com1` and `com2` follow the
>  * `<io-base>` is an integer which specifies the IO base port for UART
>registers.
>  * `<irq>` is the IRQ number to use, or `0` to use the UART in poll
> -  mode only.
> +  mode only, or `msi` to set up a Message Signaled Interrupt.
>  * `<port-bdf>` is the PCI location of the UART, in
>`<bus>:<device>.<function>` notation.
>  * `<bridge-bdf>` is the PCI bridge behind which is the UART, in
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -742,6 +742,16 @@ static int msi_capability_init(struct pc
>  
>  *desc = entry;
>  /* Restore the original MSI enabled bits  */
> +if ( !hardware_domain )
> +{
> +/*
> + * ..., except for internal requests (before Dom0 starts), in which
> + * case we rather need to behave "normally", i.e. not follow the 
> split
> + * brain model where Dom0 actually enables MSI (and disables INTx).
> + */
> +pci_intx(dev, 0);
> +control |= PCI_MSI_FLAGS_ENABLE;

Sorry for the split reply, I've been wondering about the MSI enabling
and INTX disabling done here. Xen already owns other PCI devices (AMD
IOMMU for example, see set_iommu_interrupt_handler) that use MSI, yet
they seem to manage to work without this by doing a manual MSI enable
(and I cannot figure out where the INTX disable is done).

Shouldn't Xen have a more uniform way of dealing with MSI interrupt
setup for such devices?

And doesn't your change here imply that some code from the current
internal MSI users should be dropped? There's a call to
__msi_set_enable in the AMD IOMMU code (amd_iommu_msi_enable) that I
guess can be dropped?

Thanks, Roger.


Re: [Xen-devel] [PATCH 8/9] libxl: Kill QEMU by uid when possible

2018-11-30 Thread George Dunlap

> On Nov 28, 2018, at 4:56 PM, Ian Jackson  wrote:
> 
>> if (!xs_rm(CTX->xsh, XBT_NULL, path))
>> LOGD(ERROR, domid, "xs_rm failed for %s", path);
>> 
>> -/* We should try to destroy the device model anyway. */
>> -rc = kill_device_model(gc,
>> -  GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
>> +/* 
>> + * We should try to destroy the device model anyway.  Check to see
>> + * if we can kill by UID
>> + */
>> +ret = libxl__xs_read_checked(gc, XBT_NULL,
>> +
>> GCSPRINTF("/local/domain/%d/image/device-model-uid",
>> +   domid),
>> + &dm_uid_str);
> 
> I know this function is bad in its use of `rc' for syscall return but
> please don't make it worse by introducing `ret' for what should be
> `rc'.  Would you mind adding a pre-patch to change `rc' to `r' and
> then you can use `rc' ?

Actually, it looks like kill_device_model() returns a libxl error value.  So we 
should use the existing rc for both kill_device_model() and xs_read_checked(), 
and introduce `r` to use for setresuid() &c

 -George


Re: [Xen-devel] [PATCH v4 6/6] x86/microcode: Synchronize late microcode loading

2018-11-30 Thread Boris Ostrovsky
On 11/30/18 4:46 AM, Jan Beulich wrote:
 On 29.11.18 at 23:43,  wrote:
>> One other comment about this patch (which IIRC was raised by Andrew on
>> an earlier version) is that it may be worth to stop timer calibration. I
>> am pretty sure we've seen deadlocks, which is why we ended up disabling
>> it during microcode updates.
> I recall the claim, but I don't think I've ever seen proof. 

I can't provide proof at this point, only somewhat vague memory of
seeing calibration code in the stack dump.

> My point was
> and still is that if there's a problem with ucode loading using the
> stop-machine logic here, then there's a problem with the stop-machine
> logic in general, which would make other uses, perhaps most notably
> rcu_barrier(), unsafe too. 

Possibly.

rcu_barrier() appears to be only used in cpu hotplug code and power
management, and I don't know whether either has been tested under stress.

In our case it would take multiple microcode updates on relatively large
(~100 cpus) systems before we'd hit the deadlock.


> Otoh from your reply it's not clear whether
> the observed issue wasn't with our present way of loading ucode,
> but then it would put under question the general correctness of
> continue_hypercall_on_cpu(), which again we use for more than just
> ucode loading.

It was with a variation of this new code, not with what's currently in
the tree.

-boris



Re: [Xen-devel] [PATCH v3 00/21] QOM'ify SysBusDeviceClass->init

2018-11-30 Thread Peter Maydell
On Fri, 30 Nov 2018 at 09:39, Mao Zhongyi
 wrote:
>
> v3 -> v2:
>
> - rebase to the HEAD
> - use SysBusDevice *sbd variable in patch15

> Mao Zhongyi (21):
>   musicpal: Convert sysbus init function to realize function
>   block/noenand: Convert sysbus init function to realize function
>   char/grlib_apbuart: Convert sysbus init function to realize function
>   core/empty_slot: Convert sysbus init function to realize function
>   display/g364fb: Convert sysbus init function to realize function
>   dma/puv3_dma: Convert sysbus init function to realize function
>   gpio/puv3_gpio: Convert sysbus init function to realize function
>   milkymist-softusb: Convert sysbus init function to realize function
>   input/pl050: Convert sysbus init function to realize function
>   intc/puv3_intc: Convert sysbus init function to realize function
>   milkymist-hpdmc: Convert sysbus init function to realize function
>   milkymist-pfpu: Convert sysbus init function to realize function
>   puv3_pm.c: Convert sysbus init function to realize function
>   nvram/ds1225y: Convert sysbus init function to realize function
>   pci-bridge/dec: Convert sysbus init function to realize function
>   timer/etraxfs_timer: Convert sysbus init function to realize function
>   timer/grlib_gptimer: Convert sysbus init function to realize function
>   timer/puv3_ost: Convert sysbus init function to realize function
>   usb/tusb6010: Convert sysbus init function to realize function
>   xen_backend: remove xen_sysdev_init() function
>   core/sysbus: remove the SysBusDeviceClass::init path
>
>  hw/arm/musicpal.c|  9 -
>  hw/block/onenand.c   | 16 +++-
>  hw/char/grlib_apbuart.c  | 12 +---
>  hw/core/empty_slot.c |  9 -
>  hw/core/sysbus.c | 15 +--
>  hw/display/g364fb.c  |  9 +++--
>  hw/dma/puv3_dma.c| 10 --
>  hw/gpio/puv3_gpio.c  | 29 ++---
>  hw/input/milkymist-softusb.c | 16 +++-
>  hw/input/pl050.c | 11 +--
>  hw/intc/puv3_intc.c  | 11 ---
>  hw/misc/milkymist-hpdmc.c|  9 +++--
>  hw/misc/milkymist-pfpu.c | 12 +---
>  hw/misc/puv3_pm.c| 10 --
>  hw/nvram/ds1225y.c   | 12 +---
>  hw/pci-bridge/dec.c  | 12 ++--
>  hw/timer/etraxfs_timer.c | 14 +++---
>  hw/timer/grlib_gptimer.c | 11 +--
>  hw/timer/puv3_ost.c  | 13 ++---
>  hw/usb/tusb6010.c|  8 +++-
>  hw/xen/xen_backend.c |  7 ---
>  include/hw/sysbus.h  |  3 ---
>  22 files changed, 106 insertions(+), 152 deletions(-)

I think all these patches have now been reviewed. Does anybody
have a preference for which tree it goes through, given that it
touches lots of devices?

I can take this via target-arm, unless anybody would particularly
like to have some or all of it go some other route.

thanks
-- PMM


Re: [Xen-devel] [PATCH 3/9] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)

2018-11-30 Thread Wei Liu
On Fri, Nov 30, 2018 at 12:53:20PM +, George Dunlap wrote:
> 
> 
> > On Nov 28, 2018, at 3:32 PM, Wei Liu  wrote:
> > 
> > On Fri, Nov 23, 2018 at 05:14:56PM +, George Dunlap wrote:
> >> QEMU_USER_BASE allows a user to specify the UID to use when running
> >> the devicemodel for a specific domain number.  Unfortunately, this is
> >> not really practical: It requires nearly 32,000 entries in
> >> /etc/passwd.  QEMU_USER_RANGE_BASE is much more practical.
> >> 
> >> Remove support for QEMU_USER_BASE.
> >> 
> >> Signed-off-by: George Dunlap 
> > 
> > I have no problem with you removing this, but you forgot to change
> > xl.cfg.pod.5.in.
> 
> Hmm… it seems what I *really* forgot to do was remove that entire
> section and point to the feature doc instead.  It’s quite out of date
> now (e.g., cdrom insert should work I think, qemu is chrooted so can’t
> read world-readable files, &c).  With your permission I’ll put fixing
> that up properly on a to-do list (to be done before 4.12 release at a
> minimum).

Sure.

Wei.


[Xen-devel] [PATCH 2/2] x86/hvm: Corrections to RDTSCP intercept handling

2018-11-30 Thread Andrew Cooper
For both VT-x and SVM, the RDTSCP intercept will trigger if the pipeline
supports the instruction, but the guest may not have rdtscp in its
featureset.  Bring the vmexit handlers in line with the main emulator
behaviour by optionally handing back #UD.

Next on the AMD side, if RDTSCP actually ends up being intercepted on a debug
build, we first update regs->rcx, then call __get_instruction_length() asking
for RDTSC.  As the two instructions are different (and indeed, different
lengths!), __get_instruction_length_from_list() fails and hands back a #GP
fault.

This can be demonstrated by putting a guest into tsc_mode="always emulate" and
executing an rdtscp instruction:

  (d1) --- Xen Test Framework ---
  (d1) Environment: HVM 64bit (Long mode 4 levels)
  (d1) Test rdtscp
  (d1) TSC mode 1
  (XEN) emulate.c:159:d1v0 __get_instruction_length_from_list: Mismatch between 
expected and actual instruction:
  (XEN) emulate.c:163:d1v0   list[0] val 8, { opc 0xf0031, modrm 0 }, list 
entries: 1
  (XEN) emulate.c:165:d1v0   rip 0x10475f, nextrip 0x104762, len 3
  (XEN) Insn_len emulation failed (1): d1v0 64bit @ 0008:0010475f -> 0f 01 f9 
5b 31 ff 31 c0 e9 c4 db ff ff 00 00 00
  (d1) **
  (d1) PANIC: Unhandled exception at 0008:0010475f
  (d1) Vec 13 #GP[]
  (d1) **

First, teach __get_instruction_length() to cope with RDTSCP, and improve
svm_vmexit_do_rdtsc() to ask for the correct instruction.  Move the regs->rcx
adjustment into this function to ensure it gets done after we are done
potentially raising faults.

Reported-by: Paul Durrant 
Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Paul Durrant 
CC: Jun Nakajima 
CC: Kevin Tian 
CC: Boris Ostrovsky 
CC: Suravee Suthikulpanit 
CC: Brian Woods 
CC: Juergen Gross 

There is a further 4.12 bug/regression here.  For some reason, master and
staging are now defaulting VMs into an emulated TSC mode.  I have yet to
figure out why.
---
 xen/arch/x86/hvm/svm/emulate.c|  1 +
 xen/arch/x86/hvm/svm/svm.c| 22 +-
 xen/arch/x86/hvm/vmx/vmx.c|  8 
 xen/include/asm-x86/hvm/svm/emulate.h |  1 +
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c
index 71a1b6e..0290264 100644
--- a/xen/arch/x86/hvm/svm/emulate.c
+++ b/xen/arch/x86/hvm/svm/emulate.c
@@ -78,6 +78,7 @@ static const struct {
 [INSTR_STGI]= { X86EMUL_OPC(0x0f, 0x01), MODRM(3, 3, 4) },
 [INSTR_CLGI]= { X86EMUL_OPC(0x0f, 0x01), MODRM(3, 3, 5) },
 [INSTR_INVLPGA] = { X86EMUL_OPC(0x0f, 0x01), MODRM(3, 3, 7) },
+[INSTR_RDTSCP]  = { X86EMUL_OPC(0x0f, 0x01), MODRM(3, 7, 1) },
 [INSTR_INVD]= { X86EMUL_OPC(0x0f, 0x08) },
 [INSTR_WBINVD]  = { X86EMUL_OPC(0x0f, 0x09) },
 [INSTR_WRMSR]   = { X86EMUL_OPC(0x0f, 0x30) },
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b9a8900..d8d3813 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2279,14 +2279,28 @@ static void svm_vmexit_do_hlt(struct vmcb_struct *vmcb,
 hvm_hlt(regs->eflags);
 }
 
-static void svm_vmexit_do_rdtsc(struct cpu_user_regs *regs)
+static void svm_vmexit_do_rdtsc(struct cpu_user_regs *regs, bool rdtscp)
 {
+struct vcpu *curr = current;
+const struct domain *currd = curr->domain;
+enum instruction_index insn = rdtscp ? INSTR_RDTSCP : INSTR_RDTSC;
 unsigned int inst_len;
 
-if ( (inst_len = __get_instruction_length(current, INSTR_RDTSC)) == 0 )
+if ( rdtscp && !currd->arch.cpuid->extd.rdtscp &&
+ currd->arch.tsc_mode != TSC_MODE_PVRDTSCP )
+{
+hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
 return;
+}
+
+if ( (inst_len = __get_instruction_length(curr, insn)) == 0 )
+return;
+
 __update_guest_eip(regs, inst_len);
 
+if ( rdtscp )
+regs->rcx = hvm_msr_tsc_aux(curr);
+
 hvm_rdtsc_intercept(regs);
 }
 
@@ -2968,10 +2982,8 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 break;
 
 case VMEXIT_RDTSCP:
-regs->rcx = hvm_msr_tsc_aux(v);
-/* fall through */
 case VMEXIT_RDTSC:
-svm_vmexit_do_rdtsc(regs);
+svm_vmexit_do_rdtsc(regs, exit_reason == VMEXIT_RDTSCP);
 break;
 
 case VMEXIT_MONITOR:
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 365eeb2..a9f9b9b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3589,6 +3589,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 unsigned long exit_qualification, exit_reason, idtv_info, intr_info = 0;
 unsigned int vector = 0, mode;
 struct vcpu *v = current;
+struct domain *currd = v->domain;
 
 __vmread(GUEST_RIP,®s->rip);
 __vmread(GUEST_RSP,®s->rsp);
@@ -3956,6 +3957,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 vmx_invlpg_

[Xen-devel] [PATCH 1/2] x86/svm: Improve diagnostics when __get_instruction_length_from_list() fails

2018-11-30 Thread Andrew Cooper
Sadly, a lone:

  (XEN) emulate.c:156:d2v0 __get_instruction_length_from_list: Mismatch between 
expected and actual instruction: eip = f804564139c0

on the console is of no use when trying to identify what went wrong.  Dump as much
state as we can to help identify what went wrong.

Reported-by: Paul Durrant 
Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Paul Durrant 
CC: Boris Ostrovsky 
CC: Suravee Suthikulpanit 
CC: Brian Woods 

RFC: __get_instruction_length_from_list() tries to cope with VMEXIT_IOIO, but
IN/OUT instructions aren't in the decode list and I can't spot an entry point
from the IOIO path.  Am I missing something?

Also, I'm not entirely convinced that making modrm an anonymous union is
going to work with older CentOS compilers, and therefore am not sure whether
that part of the change is worth it.  The instruction in question can be
obtained from the printed INSN_ constant alone.
---
 xen/arch/x86/hvm/svm/emulate.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c
index 3d04af0..71a1b6e 100644
--- a/xen/arch/x86/hvm/svm/emulate.c
+++ b/xen/arch/x86/hvm/svm/emulate.c
@@ -56,11 +56,14 @@ static unsigned long svm_nextrip_insn_length(struct vcpu *v)
 
 static const struct {
 unsigned int opcode;
-struct {
-unsigned int rm:3;
-unsigned int reg:3;
-unsigned int mod:2;
-#define MODRM(mod, reg, rm) { rm, reg, mod }
+union {
+struct {
+unsigned int rm:3;
+unsigned int reg:3;
+unsigned int mod:2;
+};
+unsigned int raw;
+#define MODRM(mod, reg, rm) {{ rm, reg, mod }}
 } modrm;
 } opc_tab[INSTR_MAX_COUNT] = {
 [INSTR_PAUSE]   = { X86EMUL_OPC_F3(0, 0x90) },
@@ -152,8 +155,17 @@ int __get_instruction_length_from_list(struct vcpu *v,
 }
 
 gdprintk(XENLOG_WARNING,
- "%s: Mismatch between expected and actual instruction: "
- "eip = %lx\n",  __func__, (unsigned long)vmcb->rip);
+ "%s: Mismatch between expected and actual instruction:\n",
+ __func__);
+gdprintk(XENLOG_WARNING,
+ "  list[0] val %d, { opc %#x, modrm %#x }, list entries: %u\n",
+ list[0], opc_tab[list[0]].opcode, opc_tab[list[0]].modrm.raw,
+ list_count);
+gdprintk(XENLOG_WARNING, "  rip 0x%lx, nextrip 0x%lx, len %lu\n",
+ vmcb->rip, vmcb->nextrip, vmcb->nextrip - vmcb->rip);
+hvm_dump_emulation_state(XENLOG_G_WARNING, "Insn_len",
+ &ctxt, X86EMUL_UNHANDLEABLE);
+
 hvm_inject_hw_exception(TRAP_gp_fault, 0);
 return 0;
 }
-- 
2.1.4



[Xen-devel] [PATCH 0/2] Fixes to RDTSCP interception

2018-11-30 Thread Andrew Cooper
For some reason (on EPYC at least) we've gained a regression into master
whereby VMs are defaulting to one of the emulated TSC modes.  This may be Xen
coming to the conclusion that there isn't a reliable TSC.  Combined with a
debug Xen, this breaks RDTSCP completely.

With this series in place and RDTSCP functioning correctly again, VMs are
still unwilling to boot.  I haven't managed to figure out why yet.

Andrew Cooper (2):
  x86/svm: Improve diagnostics when __get_instruction_length_from_list() fails
  x86/hvm: Corrections to RDTSCP intercept handling

 xen/arch/x86/hvm/svm/emulate.c| 27 ---
 xen/arch/x86/hvm/svm/svm.c| 22 +-
 xen/arch/x86/hvm/vmx/vmx.c|  8 
 xen/include/asm-x86/hvm/svm/emulate.h |  1 +
 4 files changed, 46 insertions(+), 12 deletions(-)

-- 
2.1.4



Re: [Xen-devel] [PATCH v2 0/3] Remove tmem

2018-11-30 Thread Ian Jackson
Wei Liu writes ("[PATCH v2 0/3] Remove tmem"):
> It is agreed that tmem can be removed from xen.git. See the thread starting   
>   
>   
> from .

Those are notes from some phone call amongst industry stakeholders.
None of the messages have a Subject line mentioning tmem.  There is no
explanation of the basis for the decision; just a confirmation from
the current maintainers that they will ack the removal.

I think this is not really an appropriate way to carry on!  What if
there is someone else who wants to step up to maintain this ?  What
about user communication ?  Going straight from `Supported' to
`Deleted' seems rather vigorous.


In summary I think the claim that "It is agreed" in this cover letter
is false (or, at least, if it is true, the cover letter provides no
references to any basis for thinking that it is true).

If it didn't happen on the mailing list it didn't happen.


Unfortunately, therefore, on process grounds,

Nacked-by: Ian Jackson 


I dare say the decision to remove it now might be right.

Can we please start this again with a proper explanation of why this
should be summarily deleted, rather than (say) made unmaintained and
deprecated for a release ?  Can someone explain why we don't feel the
need to consult anyone by (say) posting to xen-announce ?  etc.

Then we can actually have an on-list discussion where the decision
would be taken.

Next time I suggest a good first step would be a patch which deletes
the M: line from MAINTAINERS and changes the status to Orphan, since
obviously the current maintainers don't want it.  That patch should be
uncontroversial.  Also in general, depending who we think might be
using a feature, a plan which gives some warning to users (by
deprecating the feature, for example) would often be a good idea.

Ian.


Re: [Xen-devel] [PATCH v2 1/3] tools: remove tmem code and commands

2018-11-30 Thread Ian Jackson
Wei Liu writes ("[PATCH v2 1/3] tools: remove tmem code and commands"):
> Remove all tmem related code in libxc.
> 
> Leave some stubs in libxl in case anyone has linked to those functions
> before the removal.
> 
> Remove all tmem related commands in xl, all tmem related code in other
> utilities we ship.

Amazingly I see nothing in the libxl domain config about this.  If
there were then we would have to decide what to do if the domain had
tmem-related config.  But AFAICT there is could be no such thing.

On a technical level therefore, there is nothing wrong with this
patch.  However, for the process reasons I have explained in my other
message,

Nacked-by: Ian Jackson 

Sorry,
Ian.


Re: [Xen-devel] [PATCH 1/2] x86/svm: Improve diagnostics when __get_instruction_length_from_list() fails

2018-11-30 Thread Woods, Brian
On Fri, Nov 30, 2018 at 05:07:19PM +, Andy Cooper wrote:
> Sadly, a lone:
> 
>   (XEN) emulate.c:156:d2v0 __get_instruction_length_from_list: Mismatch 
> between expected and actual instruction: eip = f804564139c0
> 
> on the console is of no use when trying to identify what went wrong.  Dump as 
> much
> state as we can to help identify what went wrong.
> 
> Reported-by: Paul Durrant 
> Signed-off-by: Andrew Cooper 

Acked-by: Brian Woods 

-- 
Brian Woods


[Xen-devel] [PATCH for-4.12] xen/arm: gic: Make sure the number of interrupt lines is valid before using it

2018-11-30 Thread Julien Grall
GICv2 and GICv3 support up to 1020 interrupts. However, the value computed
from GICD_TYPER.ITLinesNumber can be up to 1024. On GICv3, we will end up
writing to reserved registers right after the IROUTER ones, as the value is
not capped early enough.

Cap the number of interrupts as soon as we compute it, so we know we can
safely use it afterwards.

Signed-off-by: Julien Grall 
Reported-by: Jan-Peter Larsson 

---
This patch should be backported up to Xen 4.9.
---
 xen/arch/arm/gic-v2.c | 7 ---
 xen/arch/arm/gic-v3.c | 7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 1a744c576f..e9fb8a01ab 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -353,6 +353,10 @@ static void __init gicv2_dist_init(void)
 
 type = readl_gicd(GICD_TYPER);
 nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
+/* Only 1020 interrupts are supported */
+nr_lines = min(1020U, nr_lines);
+gicv2_info.nr_lines = nr_lines;
+
 gic_cpus = 1 + ((type & GICD_TYPE_CPUS) >> 5);
 printk("GICv2: %d lines, %d cpu%s%s (IID %8.8x).\n",
nr_lines, gic_cpus, (gic_cpus == 1) ? "" : "s",
@@ -377,9 +381,6 @@ static void __init gicv2_dist_init(void)
 for ( i = 32; i < nr_lines; i += 32 )
 writel_gicd(~0x0, GICD_ICENABLER + (i / 32) * 4);
 
-/* Only 1020 interrupts are supported */
-gicv2_info.nr_lines = min(1020U, nr_lines);
-
 /* Turn on the distributor */
 writel_gicd(GICD_CTL_ENABLE, GICD_CTLR);
 }
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6fbc106757..c9200d24e1 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -607,6 +607,10 @@ static void __init gicv3_dist_init(void)
 if ( type & GICD_TYPE_LPIS )
 gicv3_lpi_init_host_lpis(GICD_TYPE_ID_BITS(type));
 
+/* Only 1020 interrupts are supported */
+nr_lines = min(1020U, nr_lines);
+gicv3_info.nr_lines = nr_lines;
+
 printk("GICv3: %d lines, (IID %8.8x).\n",
nr_lines, readl_relaxed(GICD + GICD_IIDR));
 
@@ -646,9 +650,6 @@ static void __init gicv3_dist_init(void)
 
 for ( i = NR_GIC_LOCAL_IRQS; i < nr_lines; i++ )
 writeq_relaxed(affinity, GICD + GICD_IROUTER + i * 8);
-
-/* Only 1020 interrupts are supported */
-gicv3_info.nr_lines = min(1020U, nr_lines);
 }
 
 static int gicv3_enable_redist(void)
-- 
2.11.0



Re: [Xen-devel] [PATCH 2/2] x86/hvm: Corrections to RDTSCP intercept handling

2018-11-30 Thread Woods, Brian
On Fri, Nov 30, 2018 at 05:07:20PM +, Andy Cooper wrote:
> For both VT-x and SVM, the RDTSCP intercept will trigger if the pipeline
> supports the instruction, but the guest may not have rdtscp in its
> featureset.  Bring the vmexit handlers in line with the main emulator
> behaviour by optionally handing back #UD.
> 
> Next on the AMD side, if RDTSCP actually ends up being intercepted on a debug
> build, we first update regs->rcx, then call __get_instruction_length() asking
> for RDTSC.  As the two instructions are different (and indeed, different
> lengths!), __get_instruction_length_from_list() fails and hands back a #GP
> fault.
> 
> This can be demonstrated by putting a guest into tsc_mode="always emulate" and
> executing an rdtscp instruction:
> 
>   (d1) --- Xen Test Framework ---
>   (d1) Environment: HVM 64bit (Long mode 4 levels)
>   (d1) Test rdtscp
>   (d1) TSC mode 1
>   (XEN) emulate.c:159:d1v0 __get_instruction_length_from_list: Mismatch 
> between expected and actual instruction:
>   (XEN) emulate.c:163:d1v0   list[0] val 8, { opc 0xf0031, modrm 0 }, list 
> entries: 1
>   (XEN) emulate.c:165:d1v0   rip 0x10475f, nextrip 0x104762, len 3
>   (XEN) Insn_len emulation failed (1): d1v0 64bit @ 0008:0010475f -> 0f 01 f9 
> 5b 31 ff 31 c0 e9 c4 db ff ff 00 00 00
>   (d1) **
>   (d1) PANIC: Unhandled exception at 0008:0010475f
>   (d1) Vec 13 #GP[]
>   (d1) **
> 
> First, teach __get_instruction_length() to cope with RDTSCP, and improve
> svm_vmexit_do_rdtsc() to ask for the correct instruction.  Move the regs->rcx
> adjustment into this function to ensure it gets done after we are done
> potentially raising faults.
> 
> Reported-by: Paul Durrant 
> Signed-off-by: Andrew Cooper 

As far as the SVM part:
Reviewed-by: Brian Woods 

-- 
Brian Woods


[Xen-devel] Backport candidate for Arm

2018-11-30 Thread Julien Grall

Hi,

Below is a list of backport candidates for Arm.


For Xen 4.10+ to handle correctly SMC call parameters and result

35fc608612  xen/arm: smccc-1.1: Make return values unsigned long
fa7974f743  xen/arm: smccc-1.1: Handle function result as parameters

For Xen 4.9+ to avoid Dom0 crash when using less vCPUs than pCPUs on GICv3

703d9d5ec1  xen/arm: vgic-v3: Delay the initialization of the domain information
54ec59f6b0  xen/arm: vgic-v3: Don't create empty re-distributor regions



The following patch is required in Xen 4.11 to avoid breaking the new vGIC after 
applying the 2 previous patches.


62aa9e7f1b  xen/arm: Don't build GICv3 with the new vGIC

For 4.9+ to make the interrupt path safer by adding missing barriers

177afec455  xen/arm: gic: Ensure we have an ISB between ack and do_IRQ()
555e5f1bd2  xen/arm: gic: Ensure ordering between read of INTACK and shared 
data

For 4.9+ to comply with the binding description

3689c54630  xen/arm: check for multiboot nodes only under /chosen

Cheers,

--
Julien Grall


[Xen-devel] [xen-unstable-smoke test] 130877: tolerable all pass - PUSHED

2018-11-30 Thread osstest service owner
flight 130877 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130877/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  82855aba5bf91e50c81526167c11d4aeaf665e66
baseline version:
 xen  2634b997afabfdc5a972e07e536dfbc6febb4385

Last test of basis   130873  2018-11-30 12:00:40 Z0 days
Testing same since   130877  2018-11-30 15:00:26 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Olaf Hering 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   2634b997af..82855aba5b  82855aba5bf91e50c81526167c11d4aeaf665e66 -> smoke


Re: [Xen-devel] Linux 4.19.5 fails to boot as Xen dom0

2018-11-30 Thread Hans van Kranenburg
On 11/30/18 5:21 PM, Kirill A. Shutemov wrote:
> On Fri, Nov 30, 2018 at 02:53:50PM +, Hans van Kranenburg wrote:
>> On 11/30/18 2:26 PM, Kirill A. Shutemov wrote:
>>> On Fri, Nov 30, 2018 at 01:11:56PM +, Hans van Kranenburg wrote:
 On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
> On Thu, Nov 29, 2018 at 03:00:45PM +, Juergen Gross wrote:
>> On 29/11/2018 15:32, Kirill A. Shutemov wrote:
>>> On Thu, Nov 29, 2018 at 02:24:47PM +, Kirill A. Shutemov wrote:
 On Thu, Nov 29, 2018 at 01:35:17PM +, Juergen Gross wrote:
> On 29/11/2018 14:26, Kirill A. Shutemov wrote:
>> On Thu, Nov 29, 2018 at 09:41:25AM +, Juergen Gross wrote:
>>> On 29/11/2018 02:22, Hans van Kranenburg wrote:
 Hi,

 As also seen at:
 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951

 Attached there are two serial console output logs. One is starting 
 with
 Xen 4.11 (from debian unstable) as dom0, and the other one without 
 Xen.

 [2.085543] BUG: unable to handle kernel paging request at
 888d9fffc000
 [2.085610] PGD 200c067 P4D 200c067 PUD 0
 [2.085674] Oops:  [#1] SMP NOPTI
 [2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
 [2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 
 05/21/2018
 [2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
 [...]
>>>
>>> The offending stable commit is 
>>> 4074ca7d8a1832921c865d250bbd08f3441b3657
>>> ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), 
>>> this
>>> is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
>>>
>>> Current upstream kernel is booting fine under Xen, so in general the
>>> patch should be fine. Using an upstream kernel built from above 
>>> commit
>>> (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is 
>>> fine,
>>> too.
>>>
>>> Kirill, are you aware of any prerequisite patch from 4.20 which 
>>> could be
>>> missing in 4.19.5?
>>
>> I'm not.
>>
>> Let me look into this.
>>
>
> What is making me suspicious is the failure happening just after
> releasing the init memory. Maybe there is an access to .init.data
> segment or similar? The native kernel booting could be related to the
> usage of 2M mappings not being available in a PV-domain.

 Ahh.. Could you test this:

 diff --git a/arch/x86/mm/dump_pagetables.c 
 b/arch/x86/mm/dump_pagetables.c
 index a12afff146d1..7dec63ec7aab 100644
 --- a/arch/x86/mm/dump_pagetables.c
 +++ b/arch/x86/mm/dump_pagetables.c
 @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
 * 8000 - 87ff is reserved for
 * the hypervisor.
 */
 -  return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
 +  return  (idx >= pgd_index(__PAGE_OFFSET) - 17) &&
(idx <  pgd_index(__PAGE_OFFSET));
  #else
return false;
>>>
>>> Or, better, this:
>>
>> That makes it boot again!
>>
>> Any idea why upstream doesn't need it?
>
> Nope.
>
> I'll prepare a proper fix.
>

 Thanks for looking into this.

 In the meantime, I applied the "Or, better, this" change, and my dom0
 boots again.

 FYI, boot log now: (paste 90d valid)
 https://paste.debian.net/plainh/48940826
>>>
>>> I forgot to CC you:
>>>
>>> https://lkml.kernel.org/r/20181130121131.g3xvlvixv7mvl...@black.fi.intel.com
>>>
>>> Please give it a try.
>>
>> I'm not in that thread, so my response here...
>>
>> You paste a v2-like patch into 'Re: [PATCH 1/2]'. Juergen says:
>> s/LDT_PGD_ENTRY/GUARD_HOLE_PGD_ENTRY/, then you say Ughh.., change it to
>> GUARD_HOLE_ENTRY, which does not exist, and then get a Reviewed-by from
>> Juergen.
>>
>> I guess it has to be GUARD_HOLE_PGD_ENTRY after all...
>>
>> arch/x86/include/asm/pgtable_64_types.h:116:31: error:
>> 'GUARD_HOLE_ENTRY' undeclared (first use in this function); did you mean
>> 'GUARD_HOLE_PGD_ENTRY'?
>>
>> I'll test that instead.
> 
> Yes, thank you. It was a long week... :/
> 
> Let me know if it works. I'll repost the fixed version with your
> Tested-by.

Ok. It boots fine as Xen dom0. \o/

You can use "Hans van Kranenburg "
(lowercase please) for reported/tested in the real v2.

Hans



Re: [Xen-devel] [PATCH v3] pvcalls-front: Avoid __get_free_pages(GFP_KERNEL) under spinlock

2018-11-30 Thread Boris Ostrovsky
On 11/30/18 6:01 AM, Wen Yang wrote:
> The problem is that we call this with a spin lock held.
> The call tree is:
> pvcalls_front_accept() holds bedata->socket_lock.
> -> create_active()
> -> __get_free_pages() uses GFP_KERNEL
>
> The create_active() function is only called from pvcalls_front_accept()
> with a spin_lock held.  The allocation is not allowed to sleep and
> GFP_KERNEL is not sufficient.
>
> This issue was detected by using the Coccinelle software.
>
> v2: Add a function doing the allocations which is called
> outside the lock and passing the allocated data to
> create_active().
> v3: Use the matching deallocators i.e., free_page()
> and free_pages(), respectively.
>
> Suggested-by: Juergen Gross 
> Signed-off-by: Wen Yang 
> CC: Julia Lawall 
> CC: Boris Ostrovsky 
> CC: Juergen Gross 
> CC: Stefano Stabellini 
> CC: xen-devel@lists.xenproject.org
> CC: linux-ker...@vger.kernel.org
> ---
>  drivers/xen/pvcalls-front.c | 71 ++---
>  1 file changed, 59 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 2f11ca72a281..a26f416daf46 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -335,7 +335,43 @@ int pvcalls_front_socket(struct socket *sock)
>   return ret;
>  }
>  
> -static int create_active(struct sock_mapping *map, int *evtchn)
> +struct sock_mapping_active_ring {
> + struct pvcalls_data_intf *ring;
> + RING_IDX ring_order;
> + void *bytes;
> +};
> +
> +static int alloc_active_ring(struct sock_mapping_active_ring *active_ring)
> +{
> + active_ring->ring = NULL;

This is not necessary.

> + active_ring->bytes = NULL;
> +
> + active_ring->ring = (struct pvcalls_data_intf *)
> + __get_free_page(GFP_KERNEL | __GFP_ZERO);
> + if (active_ring->ring == NULL)
> + goto out_error;
> + active_ring->ring_order = PVCALLS_RING_ORDER;
> + active_ring->bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
> + PVCALLS_RING_ORDER);
> + if (active_ring->bytes == NULL)
> + goto out_error;
> +
> + return 0;
> +
> +out_error:
> + free_pages((unsigned long)active_ring->bytes, active_ring->ring_order);
> + free_page((unsigned long)active_ring->ring);
> + return -ENOMEM;
> +}
> +



> @@ -397,6 +427,7 @@ int pvcalls_front_connect(struct socket *sock, struct 
> sockaddr *addr,
>   struct sock_mapping *map = NULL;
>   struct xen_pvcalls_request *req;
>   int notify, req_id, ret, evtchn;
> + struct sock_mapping_active_ring active_ring;
>  
>   if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM)
>   return -EOPNOTSUPP;
> @@ -406,15 +437,21 @@ int pvcalls_front_connect(struct socket *sock, struct 
> sockaddr *addr,
>   return PTR_ERR(map);
>  
>   bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
> + ret = alloc_active_ring(&active_ring);

Why not just alloc_active_ring(map)?


-boris
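As an aside, the allocate-before-locking pattern under discussion can be sketched in a standalone userspace analogue. All names here are illustrative stand-ins, not the actual pvcalls-front symbols; the `in_atomic` flag merely models "a spinlock is held, sleeping is forbidden":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Models "a spinlock is held": any allocation that might sleep
 * must not happen while this is true. */
static bool in_atomic;

struct ring {
    void *intf;   /* stands in for the pvcalls_data_intf page */
    void *bytes;  /* stands in for the data ring pages */
};

/* May sleep, so it must run outside the atomic section. */
static int alloc_ring(struct ring *r)
{
    assert(!in_atomic);
    r->intf = calloc(1, 4096);
    r->bytes = calloc(1, 4096);
    if (!r->intf || !r->bytes) {
        free(r->intf);
        free(r->bytes);
        return -1;
    }
    return 0;
}

/* Runs under the lock: only a pointer handoff, no allocation. */
static void create_active(struct ring *dst, const struct ring *src)
{
    dst->intf = src->intf;
    dst->bytes = src->bytes;
}

int accept_conn(struct ring *map)
{
    struct ring tmp;

    if (alloc_ring(&tmp))   /* done before "taking the lock" */
        return -1;

    in_atomic = true;       /* spin_lock(&bedata->socket_lock) */
    create_active(map, &tmp);
    in_atomic = false;      /* spin_unlock(&bedata->socket_lock) */
    return 0;
}
```

The kernel-side equivalent is the v2/v3 change above: do the GFP_KERNEL allocations in a helper called before taking `bedata->socket_lock`, and only attach the pre-allocated pages inside the critical section.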



[Xen-devel] [PATCH RFCv2 1/4] mm/memory_hotplug: Introduce memory block types

2018-11-30 Thread David Hildenbrand
Memory onlining should always be handled by user space, because only user
space knows which use cases it wants to satisfy. E.g. memory might be
onlined to the MOVABLE zone even if it can never be removed from the
system, e.g. to make usage of huge pages more reliable.

However to implement such rules (especially default rules in distributions)
we need more information about the memory that was added in user space.

E.g. on x86 we want to online memory provided by balloon devices (e.g.
XEN, Hyper-V) differently (-> will not be unplugged by offlining the whole
block) than ordinary DIMMs (-> might eventually be unplugged by offlining
the whole block). This might also become relevant for other architectures.

Also, udev rules right now check if running on s390x and treat all added
memory blocks as standby memory (-> don't online automatically). As soon as
we support other memory hotplug mechanism (e.g. virtio-mem) checks would
have to get more involved (e.g. also check if under KVM) but eventually
also wrong (e.g. if KVM ever supports standby memory we are doomed).

I decided to allow to specify the type of memory that is getting added
to the system. Let's start with two types, BOOT and UNSPECIFIED to get the
basic infrastructure running. We'll introduce and use further types in
follow-up patches. For now we classify any hotplugged memory temporarily
as UNSPECIFIED (which will eventually be dropped later on).

Cc: Greg Kroah-Hartman 
Cc: "Rafael J. Wysocki" 
Cc: Andrew Morton 
Cc: Ingo Molnar 
Cc: Pavel Tatashin 
Cc: Stephen Rothwell 
Cc: Andrew Banman 
Cc: "mike.tra...@hpe.com" 
Cc: Oscar Salvador 
Cc: Dave Hansen 
Cc: Michal Hocko 
Cc: Michal Suchánek 
Cc: Vitaly Kuznetsov 
Cc: Dan Williams 
Cc: Pavel Tatashin 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Signed-off-by: David Hildenbrand 
---
 drivers/base/memory.c  | 38 +++---
 include/linux/memory.h | 27 +++
 2 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 0c290f86ab20..17f2985c07c5 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -381,6 +381,29 @@ static ssize_t show_phys_device(struct device *dev,
return sprintf(buf, "%d\n", mem->phys_device);
 }
 
+static ssize_t type_show(struct device *dev, struct device_attribute *attr,
+char *buf)
+{
+   struct memory_block *mem = to_memory_block(dev);
+   ssize_t len = 0;
+
+   switch (mem->type) {
+   case MEMORY_BLOCK_UNSPECIFIED:
+   len = sprintf(buf, "unspecified\n");
+   break;
+   case MEMORY_BLOCK_BOOT:
+   len = sprintf(buf, "boot\n");
+   break;
+   default:
+   len = sprintf(buf, "ERROR-UNKNOWN-%ld\n",
+   mem->state);
+   WARN_ON(1);
+   break;
+   }
+
+   return len;
+}
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn,
unsigned long nr_pages, int online_type,
@@ -442,6 +465,7 @@ static DEVICE_ATTR(phys_index, 0444, 
show_mem_start_phys_index, NULL);
 static DEVICE_ATTR(state, 0644, show_mem_state, store_mem_state);
 static DEVICE_ATTR(phys_device, 0444, show_phys_device, NULL);
 static DEVICE_ATTR(removable, 0444, show_mem_removable, NULL);
+static DEVICE_ATTR_RO(type);
 
 /*
  * Block size attribute stuff
@@ -620,6 +644,7 @@ static struct attribute *memory_memblk_attrs[] = {
&dev_attr_state.attr,
&dev_attr_phys_device.attr,
&dev_attr_removable.attr,
+   &dev_attr_type.attr,
 #ifdef CONFIG_MEMORY_HOTREMOVE
&dev_attr_valid_zones.attr,
 #endif
@@ -657,13 +682,17 @@ int register_memory(struct memory_block *memory)
 }
 
 static int init_memory_block(struct memory_block **memory,
-struct mem_section *section, unsigned long state)
+struct mem_section *section, unsigned long state,
+int type)
 {
struct memory_block *mem;
unsigned long start_pfn;
int scn_nr;
int ret = 0;
 
+   if (type == MEMORY_BLOCK_NONE)
+   return -EINVAL;
+
mem = kzalloc(sizeof(*mem), GFP_KERNEL);
if (!mem)
return -ENOMEM;
@@ -675,6 +704,7 @@ static int init_memory_block(struct memory_block **memory,
mem->state = state;
start_pfn = section_nr_to_pfn(mem->start_section_nr);
mem->phys_device = arch_get_memory_phys_device(start_pfn);
+   mem->type = type;
 
ret = register_memory(mem);
 
@@ -699,7 +729,8 @@ static int add_memory_block(int base_section_nr)
 
if (section_count == 0)
return 0;
-   ret = init_memory_block(&mem, __nr_to_section(section_nr), MEM_ONLINE);
+   ret = init_memory_block(&mem, __nr_to_section(section_nr), MEM_ONLINE,
+   MEMORY_BLOCK_BOOT);
  

[Xen-devel] [PATCH RFCv2 3/4] mm/memory_hotplug: Introduce and use more memory types

2018-11-30 Thread David Hildenbrand
Let's introduce new types for different kinds of memory blocks and use
them in existing code. As I don't see an easy way to split this up,
do it in one hunk for now.

acpi:
 Use DIMM or DIMM_UNREMOVABLE depending on hotremove support in the kernel.
 Properly change the type when trying to add memory that was already
 detected and used during boot (so this memory will correctly end up as
 "acpi" in user space).

pseries:
 Use DIMM or DIMM_UNREMOVABLE depending on hotremove support in the kernel.
 As far as I see, handling like in the acpi case for existing blocks is
 not required.

probed memory from user space:
 Use DIMM_UNREMOVABLE as there is no interface to get rid of this memory
 again.

hv_balloon,xen/balloon:
 Use BALLOON. As simple as that :)

s390x/sclp:
 Use a dedicated type S390X_STANDBY, as this type of memory and its
 semantics are very s390x specific.

powernv/memtrace:
 Only allow to use BOOT memory for memtrace. I consider this code in
 general dangerous, but we have to keep it working ... most probably just
 a debug feature.

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Greg Kroah-Hartman 
Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Stephen Hemminger 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Rashmica Gupta 
Cc: Andrew Morton 
Cc: Pavel Tatashin 
Cc: Balbir Singh 
Cc: Michael Neuling 
Cc: Nathan Fontenot 
Cc: YueHaibing 
Cc: Vasily Gorbik 
Cc: Ingo Molnar 
Cc: Stephen Rothwell 
Cc: "mike.tra...@hpe.com" 
Cc: Oscar Salvador 
Cc: Joonsoo Kim 
Cc: Mathieu Malaterre 
Cc: Michal Hocko 
Cc: Arun KS 
Cc: Andrew Banman 
Cc: Dave Hansen 
Cc: Michal Suchánek 
Cc: Vitaly Kuznetsov 
Cc: Dan Williams 
Signed-off-by: David Hildenbrand 

---

At first I tried to abstract the types quite a lot, but I think there
are subtle differences that are worth differentiating. More details about
the types can be found in the excessive documentation.

It is worth noting that BALLOON_MOVABLE has no user yet, but I have
something in mind that might want to make use of that (virtio-mem).
Just included it to discuss the general approach. I can drop it from
this patch.
---
 arch/powerpc/platforms/powernv/memtrace.c |  9 ++--
 .../platforms/pseries/hotplug-memory.c|  7 ++-
 drivers/acpi/acpi_memhotplug.c| 16 ++-
 drivers/base/memory.c | 18 ++-
 drivers/hv/hv_balloon.c   |  3 +-
 drivers/s390/char/sclp_cmd.c  |  3 +-
 drivers/xen/balloon.c |  2 +-
 include/linux/memory.h| 47 ++-
 include/linux/memory_hotplug.h|  6 +--
 mm/memory_hotplug.c   | 15 +++---
 10 files changed, 104 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c 
b/arch/powerpc/platforms/powernv/memtrace.c
index 248a38ad25c7..5d08db87091e 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -54,9 +54,9 @@ static const struct file_operations memtrace_fops = {
.open   = simple_open,
 };
 
-static int check_memblock_online(struct memory_block *mem, void *arg)
+static int check_memblock_boot_and_online(struct memory_block *mem, void *arg)
 {
-   if (mem->state != MEM_ONLINE)
+   if (mem->type != MEM_BLOCK_BOOT || mem->state != MEM_ONLINE)
return -1;
 
return 0;
@@ -77,7 +77,7 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, 
u64 nr_pages)
u64 end_pfn = start_pfn + nr_pages - 1;
 
if (walk_memory_range(start_pfn, end_pfn, NULL,
-   check_memblock_online))
+   check_memblock_boot_and_online))
return false;
 
walk_memory_range(start_pfn, end_pfn, (void *)MEM_GOING_OFFLINE,
@@ -233,7 +233,8 @@ static int memtrace_online(void)
ent->mem = 0;
}
 
-   if (add_memory(ent->nid, ent->start, ent->size)) {
+   if (add_memory(ent->nid, ent->start, ent->size,
+  MEMORY_BLOCK_BOOT)) {
pr_err("Failed to add trace memory to node %d\n",
ent->nid);
ret += 1;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2a983b5a52e1..5f91359c7993 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -651,7 +651,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, 
u32 drc_index)
 static int dlpar_add_lmb(struct drmem_lmb *lmb)
 {
unsigned long block_sz;
-   int nid, rc;
+   int nid, rc, type = MEMORY_BLOCK_DIMM;
 
if (lmb->flags & DRCONF_MEM_ASSIGNED)
return -EINVAL;
@@ -667,8 +667,11 @@ static int dlpar_add_lmb(struct drmem_

[Xen-devel] [PATCH RFCv2 0/4] mm/memory_hotplug: Introduce memory block types

2018-11-30 Thread David Hildenbrand
This is the second approach, introducing more meaningful memory block
types and not changing online behavior in the kernel. It is based on
latest linux-next.

As we found out during discussion, user space should always handle onlining
of memory, in any case. However in order to make smart decisions in user
space about if and how to online memory, we have to export more information
about memory blocks. This way, we can formulate rules in user space.

One such information is the type of memory block we are talking about.
This helps to answer some questions like:
- Does this memory block belong to a DIMM?
- Can this DIMM theoretically ever be unplugged again?
- Was this memory added by a balloon driver that will rely on balloon
  inflation to remove chunks of that memory again? Which zone is advised?
- Is this special standby memory on s390x that is usually not automatically
  onlined?

And in short it helps to answer to some extent (excluding zone imbalances)
- Should I online this memory block?
- To which zone should I online this memory block?
... of course special use cases will result in different answers. But that's
why user space has control of onlining memory.
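A user-space rule keyed on the new "type" attribute could then look roughly like the following sketch. The type strings match the ones this series exposes ("boot", "dimm", ...), but the policy itself is purely an illustrative assumption, not part of the patch set:

```c
#include <assert.h>
#include <string.h>

/* Possible decisions a user-space onlining rule could take for a
 * freshly added memory block, based on its sysfs "type" attribute. */
enum action { ONLINE_NORMAL, ONLINE_MOVABLE, KEEP_OFFLINE };

static enum action onlining_rule(const char *type)
{
    if (strcmp(type, "boot") == 0)
        return ONLINE_NORMAL;   /* boot memory: never unpluggable */
    if (strcmp(type, "dimm") == 0)
        return ONLINE_MOVABLE;  /* might be unplugged again later */
    if (strcmp(type, "balloon") == 0)
        return ONLINE_NORMAL;   /* removed by deflation, not offlining */
    /* e.g. s390x standby memory: leave the decision to the admin */
    return KEEP_OFFLINE;
}
```

A udev rule or daemon implementing this would read `/sys/devices/system/memory/memoryN/type` and write `online` / `online_movable` to the block's `state` file, instead of guessing from the architecture as the current s390x udev checks do.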

More details can be found in Patch 1 and Patch 3.
Tested on x86 with hotplugged DIMMs. Cross-compiled for PPC and s390x.


Example:
$ udevadm info -q all -a /sys/devices/system/memory/memory0
KERNEL=="memory0"
SUBSYSTEM=="memory"
DRIVER==""
ATTR{online}=="1"
ATTR{phys_device}=="0"
ATTR{phys_index}==""
ATTR{removable}=="0"
ATTR{state}=="online"
ATTR{type}=="boot"
ATTR{valid_zones}=="none"
$ udevadm info -q all -a /sys/devices/system/memory/memory90
KERNEL=="memory90"
SUBSYSTEM=="memory"
DRIVER==""
ATTR{online}=="1"
ATTR{phys_device}=="0"
ATTR{phys_index}=="005a"
ATTR{removable}=="1"
ATTR{state}=="online"
ATTR{type}=="dimm"
ATTR{valid_zones}=="Normal"


RFC -> RFCv2:
- Now also taking care of PPC (somehow missed it :/ )
- Split the series up to some degree (some ideas on how to split up patch 3
  would be very welcome)
- Introduce more memory block types. Turns out abstracting too much was
  rather confusing and not helpful. Properly document them.

Notes:
- I wanted to convert the enum of types into a named enum but this
  provoked all kinds of different errors. For now, I am doing it just like
  the other types (e.g. online_type) we are using in that context.
- The "removable" property should never have been named like that. It
  should have been "offlinable". Can we still rename that? E.g. boot memory
  is sometimes marked as removable ...

David Hildenbrand (4):
  mm/memory_hotplug: Introduce memory block types
  mm/memory_hotplug: Replace "bool want_memblock" by "int type"
  mm/memory_hotplug: Introduce and use more memory types
  mm/memory_hotplug: Drop MEMORY_TYPE_UNSPECIFIED

 arch/ia64/mm/init.c   |  4 +-
 arch/powerpc/mm/mem.c |  4 +-
 arch/powerpc/platforms/powernv/memtrace.c |  9 +--
 .../platforms/pseries/hotplug-memory.c|  7 +-
 arch/s390/mm/init.c   |  4 +-
 arch/sh/mm/init.c |  4 +-
 arch/x86/mm/init_32.c |  4 +-
 arch/x86/mm/init_64.c |  8 +--
 drivers/acpi/acpi_memhotplug.c| 16 -
 drivers/base/memory.c | 60 ++--
 drivers/hv/hv_balloon.c   |  3 +-
 drivers/s390/char/sclp_cmd.c  |  3 +-
 drivers/xen/balloon.c |  2 +-
 include/linux/memory.h| 69 ++-
 include/linux/memory_hotplug.h| 18 ++---
 kernel/memremap.c |  6 +-
 mm/memory_hotplug.c   | 29 
 17 files changed, 194 insertions(+), 56 deletions(-)

-- 
2.17.2



[Xen-devel] [PATCH RFCv2 2/4] mm/memory_hotplug: Replace "bool want_memblock" by "int type"

2018-11-30 Thread David Hildenbrand
Let's pass a memory block type instead. Pass "MEMORY_BLOCK_NONE" for device
memory and for now "MEMORY_BLOCK_UNSPECIFIED" for anything else. No
functional change.

Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Greg Kroah-Hartman 
Cc: "Rafael J. Wysocki" 
Cc: Andrew Morton 
Cc: Mike Rapoport 
Cc: Michal Hocko 
Cc: Dan Williams 
Cc: "Kirill A. Shutemov" 
Cc: Oscar Salvador 
Cc: Nicholas Piggin 
Cc: Stephen Rothwell 
Cc: Christophe Leroy 
Cc: "Jonathan Neuschäfer" 
Cc: Mauricio Faria de Oliveira 
Cc: Vasily Gorbik 
Cc: Arun KS 
Cc: Rob Herring 
Cc: Pavel Tatashin 
Cc: "mike.tra...@hpe.com" 
Cc: Joonsoo Kim 
Cc: Wei Yang 
Cc: Logan Gunthorpe 
Cc: "Jérôme Glisse" 
Cc: "Jan H. Schönherr" 
Cc: Dave Jiang 
Cc: Matthew Wilcox 
Cc: Mathieu Malaterre 
Signed-off-by: David Hildenbrand 
---
 arch/ia64/mm/init.c|  4 ++--
 arch/powerpc/mm/mem.c  |  4 ++--
 arch/s390/mm/init.c|  4 ++--
 arch/sh/mm/init.c  |  4 ++--
 arch/x86/mm/init_32.c  |  4 ++--
 arch/x86/mm/init_64.c  |  8 
 drivers/base/memory.c  | 11 +++
 include/linux/memory.h |  2 +-
 include/linux/memory_hotplug.h | 12 ++--
 kernel/memremap.c  |  6 --
 mm/memory_hotplug.c| 16 
 11 files changed, 40 insertions(+), 35 deletions(-)

diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 904fe55e10fc..408635d2902f 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -646,13 +646,13 @@ mem_init (void)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
-   bool want_memblock)
+   int type)
 {
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
 
-   ret = __add_pages(nid, start_pfn, nr_pages, altmap, want_memblock);
+   ret = __add_pages(nid, start_pfn, nr_pages, altmap, type);
if (ret)
printk("%s: Problem encountered in __add_pages() as ret=%d\n",
   __func__,  ret);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index b3c9ee5c4f78..e394637da270 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -118,7 +118,7 @@ int __weak remove_section_mapping(unsigned long start, 
unsigned long end)
 }
 
 int __meminit arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap 
*altmap,
-   bool want_memblock)
+ int type)
 {
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -135,7 +135,7 @@ int __meminit arch_add_memory(int nid, u64 start, u64 size, 
struct vmem_altmap *
}
flush_inval_dcache_range(start, start + size);
 
-   return __add_pages(nid, start_pfn, nr_pages, altmap, want_memblock);
+   return __add_pages(nid, start_pfn, nr_pages, altmap, type);
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 3e82f66d5c61..ba2c56328e6d 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -225,7 +225,7 @@ device_initcall(s390_cma_mem_init);
 #endif /* CONFIG_CMA */
 
 int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
-   bool want_memblock)
+   int type)
 {
unsigned long start_pfn = PFN_DOWN(start);
unsigned long size_pages = PFN_DOWN(size);
@@ -235,7 +235,7 @@ int arch_add_memory(int nid, u64 start, u64 size, struct 
vmem_altmap *altmap,
if (rc)
return rc;
 
-   rc = __add_pages(nid, start_pfn, size_pages, altmap, want_memblock);
+   rc = __add_pages(nid, start_pfn, size_pages, altmap, type);
if (rc)
vmem_remove_mapping(start, size);
return rc;
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 1a483a008872..5fbb8724e0f2 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -419,14 +419,14 @@ void free_initrd_mem(unsigned long start, unsigned long 
end)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
-   bool want_memblock)
+   int type)
 {
unsigned long start_pfn = PFN_DOWN(start);
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
 
/* We only have ZONE_NORMAL, so this is easy.. */
-   ret = __add_pages(nid, start_pfn, nr_pages, altmap, want_memblock);
+   ret = __add_pages(nid, start_pfn, nr_pages, altmap, type);
if (unlikely(ret))
printk("%s: Failed, __add_pages() == %d\n", __func__, ret);
 
diff --git a/arch/x86/mm

[Xen-devel] [PATCH RFCv2 4/4] mm/memory_hotplug: Drop MEMORY_TYPE_UNSPECIFIED

2018-11-30 Thread David Hildenbrand
We now have proper types for all users, we can drop this one.

Cc: Greg Kroah-Hartman 
Cc: "Rafael J. Wysocki" 
Cc: Andrew Morton 
Cc: Ingo Molnar 
Cc: Pavel Tatashin 
Cc: Stephen Rothwell 
Cc: Andrew Banman 
Cc: "mike.tra...@hpe.com" 
Cc: Oscar Salvador 
Cc: Dave Hansen 
Cc: Michal Hocko 
Cc: Michal Suchánek 
Cc: Vitaly Kuznetsov 
Cc: Dan Williams 
Cc: Pavel Tatashin 
Signed-off-by: David Hildenbrand 
---
 drivers/base/memory.c  | 3 ---
 include/linux/memory.h | 5 -
 2 files changed, 8 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c5fdca7a3009..a6e524f0ea38 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -388,9 +388,6 @@ static ssize_t type_show(struct device *dev, struct 
device_attribute *attr,
ssize_t len = 0;
 
switch (mem->type) {
-   case MEMORY_BLOCK_UNSPECIFIED:
-   len = sprintf(buf, "unspecified\n");
-   break;
case MEMORY_BLOCK_BOOT:
len = sprintf(buf, "boot\n");
break;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index a3a1e9764805..11679622f743 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -50,10 +50,6 @@ int set_memory_block_size_order(unsigned int order);
  *  No memory block is to be created (e.g. device memory). Not exposed to
  *  user space.
  *
- * MEMORY_BLOCK_UNSPECIFIED:
- *  The type of memory block was not further specified when adding the
- *  memory block.
- *
  * MEMORY_BLOCK_BOOT:
  *  This memory block was added during boot by the basic system. No
  *  specific device driver takes care of this memory block. This memory
@@ -103,7 +99,6 @@ int set_memory_block_size_order(unsigned int order);
  */
 enum {
MEMORY_BLOCK_NONE = 0,
-   MEMORY_BLOCK_UNSPECIFIED,
MEMORY_BLOCK_BOOT,
MEMORY_BLOCK_DIMM,
MEMORY_BLOCK_DIMM_UNREMOVABLE,
-- 
2.17.2



Re: [Xen-devel] [PATCH v2 0/3] Remove tmem

2018-11-30 Thread Wei Liu
On Fri, Nov 30, 2018 at 05:09:42PM +, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v2 0/3] Remove tmem"):
> > It is agreed that tmem can be removed from xen.git. See the thread starting 
> > 
> >   
> > from .
> 
> Those are notes from some phone call amongst industry stakeholders.
> None of the messages have a Subject line mentioning tmem.  There is no
> explanation of the basis for the decision; just a confirmation from
> the current maintainers that they will ack the removal.
> 
> I think this is not really an appropriate way to carry on!  What if
> there is someone else who wants to step up to maintain this ?  What
> about user communication ?  Going straight from `Supported' to
> `Deleted' seems rather vigorous.

"Step up to maintain"? I would rather say step up to develop.

The status in MAINTAINERS is wrong. According to SUPPORT.md, it is only
experimental. Our definition of "experimental" is:

   Functional completeness: No
   Functional stability: Here be dragons
   Interface stability: Not stable
   Security supported: No

(https://wiki.xenproject.org/wiki/Xen_Project_Release_Features/Definitions)

This means without putting in significant effort, no-one would be able
to use TMEM. There is no stability guarantee at all for TMEM interface.
Deleting something experimental doesn't seem controversial to me.

I dare say no-one cared because it has got zero development effort in
years since 4.6. Also, as you already noticed, no-one can possibly use
TMEM since the switch to xl (that's even earlier than 4.6).

> 
> 
> In summary I think the claim that "It is agreed" in this cover letter
> is false (or, at least, if it is true, the cover letter provides no
> references to any basis for thinking that it is true).
> 
> If it didn't happen on the mailing list it didn't happen.
> 
> 
> Unfortunately, therefore, on process grounds,
> 
> Nacked-by: Ian Jackson 
> 

Yet the removal of the ia64 port wasn't warned about or announced on
xen-announce, so I disagree that the removal is wrong on process grounds --
there has already been a precedent.

If there is a policy file, I would be happy to comply.

> 
> I dare say the decision to remove it now might be right.
> 
> Can we please start this again with a proper explanation of why this
> should be summarily deleted, rather than (say) made unmaintained and
> deprecated for a release ?  Can someone explain why we don't feel the
> need to consult anyone by (say) posting to xen-announce ?  etc.
> 

See above.

> Then we can actually have an on-list discussion where the decision
> would be taken.
> 
> Next time I suggest a good first step would be a patch which deletes
> the M: line from MAINTAINERS and changes the status to Orphan, since
> obviously the current maintainers don't want it.  That patch should be
> uncontroversial.  Also in general, depending who we think might be
> using a feature, a plan which gives some warning to users (by
> deprecating the feature, for example) would often be a good idea.
> 

Can we not invent policy and ask for compliance on the fly?

Wei.

> Ian.


Re: [Xen-devel] remove the ->mapping_error method from dma_map_ops V3

2018-11-30 Thread Russell King - ARM Linux
On Fri, Nov 30, 2018 at 02:22:08PM +0100, Christoph Hellwig wrote:
> Error reporting for the dma_map_single and dma_map_page operations is
> currently a mess.  Both APIs directly return the dma_addr_t to be used for
> the DMA, with a magic error escape that is specific to the instance and
> checked by another method provided.  This has a few downsides:
> 
>  - the error check is easily forgotten and a __must_check marker doesn't
>help as the value always is consumed anyway
>  - the error checking requires another indirect call, which have gotten
>incredibly expensive
>  - a lot of code is wasted on implementing these methods
> 
> The historical reason for this is that people thought DMA mappings would
> not fail when the API was created, which sounds like a really bad
> assumption in retrospective, and then we tried to cram error handling
> onto it later on.
> 
> There basically are two variants:  the error code is 0 because the
> implementation will never return 0 as a valid DMA address, or the error
> code is all-F as the implementation won't ever return an address that
> high.  The old AMD GART is the only one not falling into these two camps
> as it picks sort of a relative zero relative to where it is mapped.
> 
> The 0 return doesn't work for direct mappings that have ram at address
> zero and a lot of IOMMUs that start allocating bus space from address
> zero, so we can't consolidate on that, but I think we can move everyone
> to all-Fs, which the patch here does.  The reason for that is that
> there is only one way to ever get this address: by doing a 1-byte long,
> 1-byte aligned mapping, but all our mappings are not only longer but
> generally aligned, and the mappings have to keep at least the basic
> alignment.
> 
> A git tree is also available here:
> 
> git://git.infradead.org/users/hch/misc.git dma-mapping-error.3
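The consolidation the cover letter describes — a single all-Fs sentinel instead of per-implementation error values plus a ->mapping_error callback — can be sketched as below. The mock mapping function is purely illustrative; only the sentinel definition mirrors what the series does:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* The consolidated sentinel: all-Fs, an address no mapping longer
 * than one byte with any alignment can legitimately return. */
#define DMA_MAPPING_ERROR ((dma_addr_t)~0ULL)

/* Toy stand-in for dma_map_single(): identity-maps unless asked
 * to fail, so 0 remains a perfectly valid DMA address. */
static dma_addr_t mock_map(uintptr_t cpu_addr, int fail)
{
    return fail ? DMA_MAPPING_ERROR : (dma_addr_t)cpu_addr;
}
```

Callers then compare the returned address directly against the constant, avoiding the extra indirect call through the instance's error-check method; and because 0 is not the sentinel, direct mappings with RAM at address zero and IOMMUs allocating bus space from zero keep working.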

Hi Chris,

For patches 1, 3 and 23,

Acked-by: Russell King 

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [Xen-devel] remove the ->mapping_error method from dma_map_ops V3

2018-11-30 Thread Linus Torvalds
On Fri, Nov 30, 2018 at 5:23 AM Christoph Hellwig  wrote:
>
> Error reporting for the dma_map_single and dma_map_page operations is
> currently a mess. [..]

I don't see anything objectionable there, but it's not like I checked
any of the context of the patches. For all I know there are odd corner
cases where some zero vs DMA_MAPPING_ERROR confusion lurks, but I
guess we'll find out..

So you can add my acked-by, fwiw.

Linus


Re: [Xen-devel] [PATCH 1/4] xen/arm: gic: Ensure we have an ISB between ack and do_IRQ()

2018-11-30 Thread Andrii Anisov

Hello Andre,

Please see my comments below:

On 23.11.18 14:18, Andre Przywara wrote:
Fundamentally there is a semantic difference between edge and level
triggered IRQs: When the guest has handled an *edge* IRQ (EOIed so
the LR's state goes to 0), this is done and dusted, and Xen doesn't
need to care about this anymore until the next IRQ occurs. For level
triggered IRQs, even though the guest has handled it, we need to
resample the (potentially virtual) IRQ line, as it may come up or
down at the *device*'s discretion: the interrupt reason might have
gone away (GPIO condition no longer true), even before we were able
to inject it, or there might be another interrupt reason not yet
handled (incoming serial character while serving a transmit
interrupt). Also typically it's up to the interrupt handler to
confirm handling the interrupt, either explicitly by clearing an
interrupt bit in some status register or implicitly, for instance by
draining a FIFO, say on a serial device. So even though from the
(V)GIC's point of view the interrupt has been processed (EOIed), it
might still be pending.

So, as I understand it, the intended behavior of a vGIC for level
interrupts covers the following cases:
1. in case the interrupt line is still active from HW side, but
   interrupt handler from VM EOIs the interrupt, it should
   be signaled to vCPU by vGIC again
2. in case a peripheral deactivated interrupt line, but VM did not
   activated it yet, this interrupt should be removed from pending for
   VM
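The two cases can be captured in a minimal model of level-IRQ resampling. This is purely illustrative and not Xen's actual vGIC code:

```c
#include <assert.h>
#include <stdbool.h>

struct virq {
    bool line;     /* current level asserted by the (virtual) device */
    bool pending;  /* what will be signaled to the vCPU */
};

/* Run when the guest EOIs the interrupt, or on a suitable trap:
 * for a level IRQ the pending state must track the line level,
 * in both directions. */
static void resample(struct virq *irq)
{
    irq->pending = irq->line;
}
```

Case 1 is the line still being high after EOI, so `pending` is raised again; case 2 is the device deasserting the line before injection, so `pending` is cleared.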

IMO, case 1 is indirectly supported by the old vgic. For HW interrupts it's
pretty natural: deactivation by the VM in the VGIC leads to deactivation in
the GIC, so the interrupt priority is restored and the GIC will trap the
PCPU to reinsert it. This will be seen by the VM as an immediate IRQ trap
after EOI.
Case 2 is not implemented in the old vgic. It is somehow supported by the
new vgic, though it also relies on the trap to the hypervisor to commit
the update to LRs. But that's rather a problem of the GIC
arch/implementation, which neither signals the CPU nor updates the
associated LR on level interrupt deassertion.



My intimate "old Xen VGIC" knowledge has been swapped out from my
brain meanwhile, but IIRC Xen treats every IRQ as if it would be an
edge IRQ. Which works if the guest's interrupt handler behaves
correctly. Most IRQ handlers have a loop to iterate over all possible
interrupt reasons and process them, so the line goes indeed down
before they EOI the IRQ.


I've spent some time to look through the new vgic implementation and I 
have a note about it:
It's not clear why you are probing level interrupts on the guest->hyp
transition. While it targets case 2 described above, it seems more
relevant to probe the level interrupts right before the hyp->guest
transition: the vcpu might be descheduled, and while it sits on the
scheduler queues the interrupt line level has more chances to be changed
by the peripheral itself.
Also I'm pretty scared of the new vgic locking scheme with per-irq locks
and the locking logic, e.g. in the vgic_queue_irq_unlock() function. And
sorting the list in vgic_flush_lr_state() with vgic_irq_cmp() looks very
expensive.


But for sure, the performance of all that stuff should be properly
evaluated and measured.


--
Sincerely,
Andrii Anisov.


[Xen-devel] [xen-4.8-testing test] 130847: FAIL

2018-11-30 Thread osstest service owner
flight 130847 xen-4.8-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/130847/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64  broken  in 130804
 build-arm64-pvopsbroken  in 130804
 build-arm64-xsm  broken  in 130804

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-vhd   6 xen-installfail pass in 130804
 test-amd64-i386-xl-raw   10 debian-di-install  fail pass in 130804

Regressions which are regarded as allowable (not blocking):
 build-arm64-xsm 2 hosts-allocate broken in 130804 REGR. vs. 129810
 build-arm64 2 hosts-allocate broken in 130804 REGR. vs. 129810
 build-arm64-pvops   2 hosts-allocate broken in 130804 REGR. vs. 129810

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked in 130804 n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked in 130804 n/a
 build-arm64-libvirt   1 build-check(1)   blocked in 130804 n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked in 130804 n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked in 130804 n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked in 130804 n/a
 build-arm64-xsm  3 capture-logs broken in 130804 blocked in 129810
 build-arm64-pvops3 capture-logs broken in 130804 blocked in 129810
 build-arm64  3 capture-logs broken in 130804 blocked in 129810
 test-xtf-amd64-amd64-4   69 xtf/test-hvm64-xsa-278  fail blocked in 129810
 test-xtf-amd64-amd64-1 69 xtf/test-hvm64-xsa-278 fail in 130804 blocked in 
129810
 test-xtf-amd64-amd64-1 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 130804 like 
129729
 test-armhf-armhf-xl-vhd 12 migrate-support-check fail in 130804 never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check fail in 130804 never pass
 test-xtf-amd64-amd64-4  50 xtf/test-hvm64-lbr-tsx-vmentry fail like 129570
 test-xtf-amd64-amd64-3  50 xtf/test-hvm64-lbr-tsx-vmentry fail like 129729
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 129810
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 129810
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 129810
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 129810
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 129810
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 129810
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 129810
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 129810
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 129810
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 129810
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

[Xen-devel] [PATCHv2 1/2] x86/mm: Fix guard hole handling

2018-11-30 Thread Kirill A. Shutemov
There is a guard hole at the beginning of the kernel address space, which
is also used by hypervisors. It occupies 16 PGD entries.

We do not state the reserved range directly, but calculate it relative
to other entities: the direct mapping and the user space ranges.

The calculation got broken by a recent change in the kernel memory layout:
the LDT remap range is now mapped before the direct mapping, which makes
the calculation invalid.

The breakage leads to a crash on Xen dom0 boot [1].

State the reserved range directly. It's part of the kernel ABI (hypervisors
expect it to be stable) and must not depend on changes in the rest of the
kernel memory layout.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03313.html

Signed-off-by: Kirill A. Shutemov 
Reported-by: Hans van Kranenburg 
Tested-by: Hans van Kranenburg 
Reviewed-by: Juergen Gross 
Fixes: d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging")
Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/include/asm/pgtable_64_types.h |  5 +++++
 arch/x86/mm/dump_pagetables.c           |  8 ++++----
 arch/x86/xen/mmu_pv.c                   | 11 ++++++-----
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 84bd9bdc1987..88bca456da99 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -111,6 +111,11 @@ extern unsigned int ptrs_per_p4d;
  */
 #define MAXMEM (1UL << MAX_PHYSMEM_BITS)
 
+#define GUARD_HOLE_PGD_ENTRY   -256UL
+#define GUARD_HOLE_SIZE(16UL << PGDIR_SHIFT)
+#define GUARD_HOLE_BASE_ADDR   (GUARD_HOLE_PGD_ENTRY << PGDIR_SHIFT)
+#define GUARD_HOLE_END_ADDR(GUARD_HOLE_BASE_ADDR + GUARD_HOLE_SIZE)
+
 #define LDT_PGD_ENTRY  -240UL
 #define LDT_BASE_ADDR  (LDT_PGD_ENTRY << PGDIR_SHIFT)
 #define LDT_END_ADDR   (LDT_BASE_ADDR + PGDIR_SIZE)
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index fc37bbd23eb8..dad153e5a427 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -512,11 +512,11 @@ static inline bool is_hypervisor_range(int idx)
 {
 #ifdef CONFIG_X86_64
/*
-* 8000 - 87ff is reserved for
-* the hypervisor.
+* A hole in the beginning of kernel address space reserved
+* for a hypervisor.
 */
-   return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
-   (idx <  pgd_index(__PAGE_OFFSET));
+   return  (idx >= pgd_index(GUARD_HOLE_BASE_ADDR)) &&
+   (idx <  pgd_index(GUARD_HOLE_END_ADDR));
 #else
return false;
 #endif
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index a5d7ed125337..0f4fe206dcc2 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -648,19 +648,20 @@ static int __xen_pgd_walk(struct mm_struct *mm, pgd_t *pgd,
  unsigned long limit)
 {
int i, nr, flush = 0;
-   unsigned hole_low, hole_high;
+   unsigned hole_low = 0, hole_high = 0;
 
/* The limit is the last byte to be touched */
limit--;
BUG_ON(limit >= FIXADDR_TOP);
 
+#ifdef CONFIG_X86_64
/*
 * 64-bit has a great big hole in the middle of the address
-* space, which contains the Xen mappings.  On 32-bit these
-* will end up making a zero-sized hole and so is a no-op.
+* space, which contains the Xen mappings.
 */
-   hole_low = pgd_index(USER_LIMIT);
-   hole_high = pgd_index(PAGE_OFFSET);
+   hole_low = pgd_index(GUARD_HOLE_BASE_ADDR);
+   hole_high = pgd_index(GUARD_HOLE_END_ADDR);
+#endif
 
nr = pgd_index(limit) + 1;
for (i = 0; i < nr; i++) {
-- 
2.19.2



[Xen-devel] [PATCHv2 0/2] Fixups for LDT remap placement change

2018-11-30 Thread Kirill A. Shutemov
There are a couple of fixes for the recent LDT remap placement change.

The first patch fixes a crash when the kernel is booted as Xen dom0.

The second patch fixes the address space markers in the dump_pagetables
output. It's a purely cosmetic change; backporting to the stable tree is
optional.

v2:
 - Fix typo

Kirill A. Shutemov (2):
  x86/mm: Fix guard hole handling
  x86/dump_pagetables: Fix LDT remap address marker

 arch/x86/include/asm/pgtable_64_types.h |  5 +++++
 arch/x86/mm/dump_pagetables.c           | 15 ++++++---------
 arch/x86/xen/mmu_pv.c                   | 11 ++++++-----
 3 files changed, 17 insertions(+), 14 deletions(-)

-- 
2.19.2


