[Xen-devel] [libvirt test] 100404: regressions - FAIL

2016-08-11 Thread osstest service owner
flight 100404 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/100404/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               5 xen-build                 fail REGR. vs. 100381

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt      12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestore            fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt     14 guest-saverestore            fail   never pass
 test-armhf-armhf-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check        fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore            fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore            fail   never pass

version targeted for testing:
 libvirt  9aea8cd4ae76b5f62ea365dd56d4d9beb96bb024
baseline version:
 libvirt  5b8643099a99dc4ee0dac4bf543a874ffc4c314f

Last test of basis   100381  2016-08-10 04:20:25 Z    1 days
Testing same since   100404  2016-08-11 04:20:33 Z    0 days    1 attempts


People who touched revisions under test:
  Chen Hanxiao 
  Cole Robinson 
  Erik Skultety 
  Jiri Denemark 
  Laine Stump 
  Michal Privoznik 

jobs:
 build-amd64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked 
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    blocked 
 test-amd64-amd64-libvirt-xsm blocked 
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  blocked 
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pair pass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2   fail
 test-armhf-armhf-libvirt-raw fail
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 9aea8cd4ae76b5f62ea365dd56d4d9beb96bb024
Author: Michal Privoznik 
Date:   Tue Aug 9 19:25:44 2016 +0200

virNetDevMacVLanCreateWithVPortProfile: Drop @ret

Usually, this variable is used to hold the return value for a
function of ours. Well, this is not the case. Its use does not
match our pattern and therefore it is very misleading. Drop it
and define an alternative @rc variable, but only in that single
block where it is needed.

Signed-off-by: Michal Privoznik 

commit 42712002fd49e

[Xen-devel] [xen-unstable test] 100395: regressions - FAIL

2016-08-11 Thread osstest service owner
flight 100395 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/100395/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt-raw  6 xen-boot fail REGR. vs. 100377
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 15 guest-localmigrate/x10 fail REGR. vs. 100377
 test-amd64-amd64-xl-qemuu-win7-amd64  9 windows-install  fail REGR. vs. 100377

Regressions which are regarded as allowable (not blocking):
 build-amd64-rumpuserxen       6 xen-build                    fail  like 100377
 build-i386-rumpuserxen        6 xen-build                    fail  like 100377
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop             fail like 100377
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop            fail like 100377
 test-amd64-amd64-xl-rtds      9 debian-install               fail  like 100377
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop             fail like 100377

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt      12 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check        fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-xsm      13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-xsm      12 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore            fail   never pass
 test-armhf-armhf-libvirt     14 guest-saverestore            fail   never pass
 test-armhf-armhf-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check        fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore            fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check    fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check        fail  never pass
 test-armhf-armhf-xl-rtds     13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-rtds     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      11 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      12 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  9b3f9b9c30f8dc121fe1bbf915a31e46cb926e83
baseline version:
 xen  7f5c8075364776eb139bbd421ad443ae9e4465dc

Last test of basis   100377  2016-08-10 03:02:59 Z    1 days
Testing same since   100395  2016-08-10 13:47:13 Z    0 days    1 attempts


People who touched revisions under test:
  Boris Ostrovsky 
  George Dunlap 
  Jan Beulich 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern  

Re: [Xen-devel] [BUG] kernel BUG at drivers/block/xen-blkfront.c:1711

2016-08-11 Thread Evgenii Shatokhin

On 11.08.2016 05:10, Bob Liu wrote:


On 08/10/2016 10:54 PM, Evgenii Shatokhin wrote:

On 10.08.2016 15:49, Bob Liu wrote:


On 08/10/2016 08:33 PM, Evgenii Shatokhin wrote:

On 14.07.2016 15:04, Bob Liu wrote:


On 07/14/2016 07:49 PM, Evgenii Shatokhin wrote:

On 11.07.2016 15:04, Bob Liu wrote:



On 07/11/2016 04:50 PM, Evgenii Shatokhin wrote:

On 06.06.2016 11:42, Dario Faggioli wrote:

Just Cc-ing some Linux, block, and Xen on CentOS people...



Ping.

Any suggestions how to debug this or what might cause the problem?

Obviously, we cannot control Xen on Amazon's servers. But perhaps there is 
something we can do on the kernel side?


On Mon, 2016-06-06 at 11:24 +0300, Evgenii Shatokhin wrote:

(Resending this bug report because the message I sent last week did
not make it to the mailing list somehow.)

Hi,

One of our users gets kernel panics from time to time when he tries
to use his Amazon EC2 instance with CentOS7 x64 in it [1]. Kernel panic
happens within minutes from the moment the instance starts. The
problem does not show up every time, however.

The user first observed the problem with a custom kernel, but it was
found later that the stock kernel 3.10.0-327.18.2.el7.x86_64 from
CentOS7 was affected as well.


Please try this patch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7b0767502b5db11cb1f0daef2d01f6d71b1192dc

Regards,
Bob



Unfortunately, it did not help. The same BUG_ON() in blkfront_setup_indirect() 
still triggers in our kernel based on RHEL's 3.10.0-327.18.2, where I added the 
patch.

As far as I can see, the patch makes sure the indirect pages are added to the list 
only if (!info->feature_persistent) holds. I suppose it holds in our case and 
the pages are added to the list because the triggered BUG_ON() is here:

   if (!info->feature_persistent && info->max_indirect_segments) {
   <...>
   BUG_ON(!list_empty(&info->indirect_pages));
   <...>
   }



That's odd.
Could you please try to reproduce this issue with a recent upstream kernel?

Thanks,
Bob


No luck with the upstream kernel 4.7.0 so far due to unrelated issues (bad 
initrd, I suppose, so the system does not even boot).

However, the problem reproduced with the stable upstream kernel 3.14.74. After 
the system booted the second time with this kernel, that BUG_ON triggered:
   kernel BUG at drivers/block/xen-blkfront.c:1701



Could you please provide more detail on how to reproduce this bug? I'd like to 
have a test.

Thanks!
Bob


As the user says, he uses an Amazon EC2 instance. Namely: HVM CentOS7 AMI on a 
c3.large instance with EBS magnetic storage.



Oh, then it would be difficult to debug this issue.
The xen-blkfront communicates with xen-blkback (in dom0 or a driver domain), but 
that part is a black box when running on Amazon EC2.
We can't see the source code of the backend side!


Yes, and another problem is, I am still unable to reproduce the issue in 
my EC2 instance. However, the problem shows up rather often in the 
user's instance.




Can this bug be reproduced in your own environment (Xen + dom0)?


I haven't tried this yet.




At least 2 LVM partitions are needed:
* /, 20-30 Gb should be enough, ext4
* /vz, 5-10 Gb should be enough, ext4

Kernel 3.14.74 I was talking about: 
https://www.dropbox.com/s/bhus3mubza87z86/kernel-3.14.74-1.test.x86_64.rpm?dl=1

Not sure if it is relevant, but the user may have installed additional packages 
from https://download.openvz.org/virtuozzo/releases/7.0-rtm/x86_64/os/ 
repository. Namely: vzctl, vzmigrate, vzprocps, vztt-lib, vzctcalc, ploop, 
prlctl, centos-7-x86_64-ez.

After the kernel and the other mentioned packages have been installed,
the user rebooted the instance to run that kernel 3.14.74.

Then - start the instance, wait 5 minutes, stop the instance, repeat. 2-20 such 
iterations were usually enough to reproduce the problem. Can be automated with 
the help of Amazon's API.

BTW, before the BUG_ON triggered this time, there was the following in dmesg. 
Not sure if it is related but still:



Attaching the full dmesg would be better.


Well, there is not much in the part the user was able to retrieve 
besides what I have sent and the BUG_ON() splat. But here it is, anyway.


Regards,
Evgenii



Regards,
Bob


--
[2.835034] scsi0 : ata_piix
[2.840317] scsi1 : ata_piix
[2.842267] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc100 irq 14
[2.845861] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc108 irq 15
[2.853840] AVX version of gcm_enc/dec engaged.
[2.859963] xen_netfront: Initialising Xen virtual ethernet driver
[2.867156] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[2.885861] blkfront: xvda: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
[2.889046] alg: No test for crc32 (crc32-pclmul)
[2.899290]  xvda: xvda1
[2.997751] blkfront: xvdc: flush diskcache: 

Re: [Xen-devel] Livepatch, symbol resolutions between two livepatchs (new_symbol=0)

2016-08-11 Thread Ross Lagerwall

On 08/11/2016 02:28 AM, Konrad Rzeszutek Wilk wrote:

Hey Ross,

I am running into a symbol dependency issue that I am not exactly
sure how to solve.

I have a payload that introduces a new function (xen_foobar) which
will patch over xen_extra_version().


snip


As livepatch_symbols_lookup_by_name only looks for symbols that
have the ->new_symbol set. And xen_foobar does not. So the loading is
aborted.

Which makes sense - we don't want to match the symbols as they haven't
really been "finally loaded" in.

But what if the xen_foobar is applied. In that case we should
change the xen_foobar to be new_symbol=1?


I think you're confused about the purpose of new_symbol. The purpose is 
to ensure that you link against the correct symbol from the base 
hypervisor or the live patch that first introduced it. So, new_symbol=0 
is when a symbol overrides an existing symbol. new_symbol=1 is set when 
a symbol is newly introduced in a live patch.


Since all the linking happens during load and not apply, it is perfectly 
OK to link against a symbol that hasn't been applied -- the dependencies 
are there to ensure that you can't apply a patch which links against 
unapplied symbols.


The assumption is that when overriding an existing symbol, the symbol in 
the payload has the same name as the one it is overriding. You're having 
issues above because you're breaking this assumption.
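
The lookup rule described above can be pictured with a small standalone model. This is only a sketch under stated assumptions: struct sym, lookup_loaded() and the example table are hypothetical names made up for illustration, not the hypervisor's real data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol record; not the hypervisor's real layout. */
struct sym {
    const char *name;
    unsigned long addr;
    bool new_symbol;   /* true: first introduced by a live patch */
};

/*
 * Resolve a name against the symbols of already-loaded payloads.
 * Only entries with new_symbol set are candidates; entries that merely
 * override an existing symbol (new_symbol == false) are skipped.
 */
const struct sym *lookup_loaded(const struct sym *tab, size_t n,
                                const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].new_symbol && strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Example table: one override and one newly introduced symbol. */
const struct sym example_syms[] = {
    { "xen_extra_version", 0x1000, false },  /* overrides a base symbol */
    { "xen_helper",        0x2000, true  },  /* new in this payload */
};
```

In this model an override is never returned by the payload lookup, so a caller would have to resolve its name against the base hypervisor instead.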




This following patch does that, but I am wondering if there is a better
way?


The patch is misusing new_symbol for something completely different from 
how it was intended so I hope there is a better way :-P




P.S.
The reason for this is that I am trying to implement NOP patching.
And to have some regression testing of this I wrote a function
(xen_foobar) which calls two functions: foo and bar - and their output is what
the call to XENVER_extra_version will show (b/c we patch over
xen_extra_version()).

Then there is another payload - which will want to NOP the call to
the 'bar' function inside xen_foobar. And for that I need to be able to
lookup the symbol of xen_foobar.


This is quite a different use case from what currently exists. Currently 
we're only ever interested in writing over the start of the function 
pointed to by a symbol from the base hypervisor or first instance of a 
symbol in a live patch (aka new_symbol=1). Now you need to be able to 
lookup and write over an arbitrary symbol -- how do you choose between 
the n different loaded versions of the same symbol?


I must admit to not seeing the point in NOP patching. It just seems to 
be a special case of arbitrary data patching that could be more easily 
achieved using other means.


Let's have a discussion about this and the symbol issues here at the Xen 
Summit in a couple of weeks' time.


--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 01/25] arm/altp2m: Add first altp2m HVMOP stubs.

2016-08-11 Thread Julien Grall

Hello Tamas,

On 10/08/2016 16:49, Tamas K Lengyel wrote:

On Aug 10, 2016 03:52, "Julien Grall" <julien.gr...@arm.com> wrote:

On 09/08/2016 21:16, Tamas K Lengyel wrote:

On Wed, Aug 3, 2016 at 10:54 AM, Julien Grall wrote:

There is a rcu_lock_domain_by_any_id before we get to this check here,
so any other CPU looking to disable altp2m would be waiting there for
the current op to finish up, so there is no race condition AFAICT.



No, rcu_lock_domain_by_any_id only prevents the domain from being fully
destroyed by "locking" the RCU. It does not prevent multiple concurrent
accesses. You can look at the code if you are not convinced.




Ah thanks for clarifying. Then indeed there could be concurrency issues
if there are multiple tools accessing this interface. Normally that
doesn't happen though but probably a good idea to enforce it anyway.


Well, you need to think about the worst case scenario when you implement 
an interface. If you don't lock properly, the state in Xen may be 
corrupted. For instance Xen may think altp2m is active whilst it is not 
properly initialized.
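
The failure mode above can be sketched with a toy state machine. This is a minimal illustration, assuming a plain pthread mutex in place of whatever lock the hypervisor would use; struct model_altp2m, model_enable() and model_disable() are hypothetical names, not Xen code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-domain state, reduced to two flags. */
struct model_altp2m {
    pthread_mutex_t lock;
    bool initialised;   /* tables have been set up */
    bool active;        /* advertised as usable */
};

/*
 * Both flags change inside one critical section, so no other caller
 * can observe active == true while initialised == false.
 */
int model_enable(struct model_altp2m *s)
{
    int rc = 0;

    pthread_mutex_lock(&s->lock);
    if (s->active) {
        rc = -1;                /* already enabled by someone else */
    } else {
        s->initialised = true;  /* stands in for the real setup work */
        s->active = true;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}

int model_disable(struct model_altp2m *s)
{
    int rc = 0;

    pthread_mutex_lock(&s->lock);
    if (!s->active) {
        rc = -1;                /* nothing to tear down */
    } else {
        s->active = false;
        s->initialised = false; /* stands in for the real teardown */
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}
```

Without the lock, two callers racing through enable and disable could interleave the flag updates and leave the pair inconsistent, which is the corruption described above.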


Regards,

--
Julien Grall



Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Wei Liu
On Wed, Aug 10, 2016 at 04:07:30PM +0100, Wei Liu wrote:
[...]
>  
> +def run_test_logfile(opts, test):
> +""" Run a specific test via grepping log file"""
> +
> +fn = opts.logfile_dir + (opts.logfile_pattern % test)
> +local_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
> +
> +# Use time to generate unique stamps
> +start_stamp = "= XTF TEST START %s =" % local_time
> +end_stamp = "= XTF TEST END %s =" % local_time
> +
> +print "Using %s" % fn
> +
> +f = open(fn, "ab")
> +f.write(start_stamp + "\n")
> +f.close()
> +

I think it would make more sense for the micro VM itself to write
stamps?

Wei.



Re: [Xen-devel] [PATCH v2 14/25] arm/altp2m: Make get_page_from_gva ready for altp2m.

2016-08-11 Thread Julien Grall



On 06/08/2016 18:58, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 08/06/2016 03:45 PM, Julien Grall wrote:



On 06/08/2016 11:38, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 08/04/2016 01:59 PM, Julien Grall wrote:

Hello Sergej,

On 01/08/16 18:10, Sergej Proskurin wrote:

The function get_page_from_gva uses ARM's hardware support to translate
gva's to machine addresses. This function is used, among others, for
memory regulation purposes, e.g., within the context of memory
ballooning. To ensure correct behavior while altp2m is in use, we use
the host's p2m table for the associated gva to ma translation. This is
required at this point, as altp2m lazily copies pages from the host's
p2m and even might be flushed because of changes to the host's p2m (as
it is done within the context of memory ballooning).


I was expecting to see some change in
p2m_mem_access_check_and_get_page. Is there any reason to not fix it?




I did not yet encounter any issues with
p2m_mem_access_check_and_get_page. According to ARM ARM, ATS1C** (see
gva_to_ipa_par) translates VA to IPA in non-secure privilege levels (as
is the case here). Thus, the 2nd level translation represented by
the (alt)p2m is not really considered at this point, which makes an
extension unnecessary.

Or did you have anything else in mind?


The stage-1 page tables are living in the guest memory. So every time
you access an entry in the page table, you have to translate the IPA
(guest physical address) into a PA.

However, the underlying memory of those page tables may have
restricted permissions or may not exist in the altp2m at all. So the
translation will fail.



Please correct me if I am wrong but as far as I understand: the function
p2m_mem_access_check_and_get_page is called only from get_page_from_gva.
Also it is called only if the page translation within the function
get_page_from_gva was not successful. Because of the fact that we use
the hostp2m's 2nd stage translation table including the original memory
access permissions (please note the short sequence, where we temporarily
reset the VTTBR_EL2 of the hostp2m if altp2m is active), potential
faults (which would lead to the call of the function
p2m_mem_access_check_and_get_page) must have reasons beyond altp2m.


The translation in get_page_from_gva may fail if the permission in the 
hostp2m has been restricted by memaccess (for instance because 
default_access is not p2m_access_rwx).


So you will fall back to p2m_mem_access_check_and_get_page. This function 
calls gva_to_ipa, which will use the altp2m to do the translation.


Therefore I think you need to modify p2m_mem_access_check_and_get_page 
to cope with altp2m.


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH 3/3] x86/microcode: Avoid undefined behaviour from signed integer overflow

2016-08-11 Thread Tian, Kevin
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Friday, August 05, 2016 9:50 PM
> To: Xen-devel
> Cc: Andrew Cooper; Jan Beulich; Tian, Kevin; Nakajima, Jun
> Subject: [PATCH 3/3] x86/microcode: Avoid undefined behaviour from signed integer overflow
> 
> The checksum should be calculated using unsigned 32bit integers, as it is
> intended to overflow and end at 0.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Kevin Tian 
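
The point about unsigned arithmetic can be shown in a few lines. This is a generic sketch of a wrap-to-zero 32-bit checksum, not the microcode loader's actual code; sum32() and balancing_word() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum 32-bit words with unsigned arithmetic. Unsigned addition wraps
 * modulo 2^32 by definition, whereas signed overflow is undefined.
 */
uint32_t sum32(const uint32_t *words, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return sum;
}

/* Word that, once appended, makes the total wrap around to 0. */
uint32_t balancing_word(const uint32_t *words, size_t n)
{
    return (uint32_t)0 - sum32(words, n);
}
```

A blob checked this way is accepted when sum32() over all of its words, including the balancing word, returns 0.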



Re: [Xen-devel] [PATCH 1/2] x86/vmx: dump MSR load area

2016-08-11 Thread Jan Beulich
>>> On 10.08.16 at 16:25,  wrote:
> On Wed, Aug 10, 2016 at 04:44:21AM -0600, Jan Beulich wrote:
>> >>> On 10.08.16 at 08:59,  wrote:
>> > @@ -1879,6 +1893,13 @@ void vmcs_dump_vcpu(struct vcpu *v)
>> >           (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
>> >  printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
>> > vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
>> > +printk("EXIT MSR load count = 0x%04x\n",
>> > +   (uint32_t)vmr(VM_EXIT_MSR_LOAD_COUNT));
>> > +printk("EXIT MSR store count = 0x%04x\n",
>> > +   (uint32_t)vmr(VM_EXIT_MSR_STORE_COUNT));
>> > +printk("ENTRY MSR load count = 0x%04x\n",
>> > +   (uint32_t)vmr(VM_ENTRY_MSR_LOAD_COUNT));
>> 
>> First - do you really need to make three log lines out of these? And
>> then, please use vmr32(), as the neighboring vmr16() suggests.
>> Plus finally - please log all four counts consistently either in hex
>> or in dec.
> 
> With one line, output might look something like:
> (XEN) MSR load/store count ExitLoad=0x0001 ExitStore=0x0023 EntryLoad=0x0023
> 
> Spaces around = are inconsistent in the existing output and it seems
> that no space is more popular. Does this format seem better to you?

Yes.

> I see three counts here - are you talking about the msr_count above?

Yes.

> For msr_count I was thinking that this is internal Xen state, whereas
> the other values are VMCS fields where everything else is dumped in
> hex. I think printing msr_count is redundant (one could just count the
> lines of output), so I'll just remove it.

That's fine as an option of course.
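
The single-line format agreed on above could be produced along these lines. This is a userspace sketch using snprintf() rather than the hypervisor's printk(); format_msr_counts() is a hypothetical helper, and the field labels are copied from the sample output quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the three VMCS MSR counts on one line, all in hex and with
 * no spaces around '='. Returns what snprintf() returns.
 */
int format_msr_counts(char *buf, size_t len, uint32_t exit_load,
                      uint32_t exit_store, uint32_t entry_load)
{
    return snprintf(buf, len,
                    "MSR load/store count ExitLoad=0x%04x "
                    "ExitStore=0x%04x EntryLoad=0x%04x",
                    (unsigned int)exit_load,
                    (unsigned int)exit_store,
                    (unsigned int)entry_load);
}
```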

Jan




Re: [Xen-devel] [PATCH v2 20/25] arm/altp2m: Add altp2m paging mechanism.

2016-08-11 Thread Julien Grall



On 10/08/2016 11:32, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


[...]


         switch ( fsc )
         {
+        case FSC_FLT_TRANS:
+        {
+            if ( altp2m_active(d) )
+            {
+                const struct npfec npfec = {
+                    .insn_fetch = 1,
+                    .gla_valid = 1,
+                    .kind = hsr.iabt.s1ptw ? npfec_kind_in_gpt : npfec_kind_with_gla
+                };
+
+                /*
+                 * Copy the entire page of the failing instruction into the
+                 * currently active altp2m view.
+                 */
+                if ( altp2m_lazy_copy(v, gpa, gva, npfec, &p2m) )
+                    return;


I forgot to mention that I think there is a race condition here. If
multiple vCPUs (let's say A and B) use the same altp2m, they may both
fault here.

If vCPU A has already fixed the fault, this function will return false
for vCPU B and continue. This will lead to an instruction abort being
injected into the guest.



I have solved this issue as well:

In altp2m_lazy_copy, we check whether the faulting address is already
mapped in the current altp2m view. The only reason why the current
altp2m should have a valid entry for the apparently faulting address is
that it was previously (almost simultaneously) mapped by another vcpu.
That is, if the mapping for the faulting address is valid in the altp2m,
we return true and hence let the guest retry (without injecting an
instruction/data abort exception) to access the address in question.


I am afraid that your description does not match the implementation of 
altp2m_lazy_copy in this version of the patch series.


If you find a valid entry in the altp2m, you will return 0 (i.e. false). 
This will lead to an abort being injected into the guest.
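
The behaviour Sergej describes (retry when another vCPU has already mapped the page) can be modelled as follows. This is a single-threaded toy, not the patch's altp2m_lazy_copy(); struct model_view and model_lazy_copy() are hypothetical, and a real implementation would presumably have to perform the check and the copy under the appropriate lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8

/* Hypothetical stand-in for a p2m view: one "valid" bit per page. */
struct model_view {
    bool valid[NR_PAGES];
};

/*
 * Returns true when the guest should simply retry the access,
 * false when an abort should be injected.
 */
bool model_lazy_copy(const struct model_view *host,
                     struct model_view *alt, size_t gfn)
{
    if (gfn >= NR_PAGES)
        return false;
    if (alt->valid[gfn])
        return true;            /* another vCPU already fixed the fault */
    if (!host->valid[gfn])
        return false;           /* nothing to copy: genuine fault */
    alt->valid[gfn] = true;     /* lazily copy the host mapping */
    return true;
}
```

The second fault on the same page takes the early "already valid" exit and returns true, so the late vCPU retries instead of receiving an abort.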


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH 2/2] x86/vmx: conditionally disable LBR support due to TSX format quirk

2016-08-11 Thread Jan Beulich
>>> On 10.08.16 at 17:47,  wrote:
> On Wed, Aug 10, 2016 at 06:34:10AM -0600, Jan Beulich wrote:
>> >>> On 10.08.16 at 08:59,  wrote:
>> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void)
>> >  /* Haswell */
>> >  case 60: case 63: case 69: case 70:
>> >  /* Broadwell */
>> > -case 61: case 71: case 79: case 86:
>> > +case 61: case 71: case 79: case 86: {
>> > +u64 caps;
>> > +bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
>> > + boot_cpu_has(X86_FEATURE_RTM);
>> > +
>> > +rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps);
>> 
>> This is guarded by a X86_FEATURE_PDCM check in Linux - why
>> would we not need the same here?
> 
> You're right, it should be. It also seems to be missing from
> core2_vpmu_init().

Feel free to take the liberty to fix it there at once (but please
briefly mention this as an independent change in the description).

>> Also I think this RDMSR should be performed once at boot, not every
>> time we come here.
> 
> I thought you might say that. It didn't seem obviously right to put
> this in boot_cpu_data -- is that what you suggest?

Why boot_cpu_data? Just an ordinary (possibly per-CPU) static
in vmx.c.
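
The two review points (guard the read with a feature check, and read the register only once) can be combined in a small model. Everything here is a hypothetical stand-in: model_has_pdcm replaces the CPU feature test, model_rdmsr() replaces the real MSR read, and the value is cached lazily on first use rather than at boot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the feature bit and the register. */
bool model_has_pdcm = true;
uint64_t model_msr_value = 0x32c1;
unsigned int model_msr_reads;   /* counts how often the "MSR" is read */

uint64_t model_rdmsr(void)
{
    model_msr_reads++;
    return model_msr_value;
}

/*
 * Read the capabilities register at most once and keep the result in
 * a function-local static. If the guarding feature bit is absent the
 * register is not touched at all and 0 is reported.
 */
uint64_t model_perf_caps(void)
{
    static bool cached;
    static uint64_t caps;

    if (!cached) {
        caps = model_has_pdcm ? model_rdmsr() : 0;
        cached = true;
    }
    return caps;
}
```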

Jan




Re: [Xen-devel] [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2016-08-11 Thread Yu Zhang



On 8/10/2016 6:43 PM, Yu Zhang wrote:



On 8/10/2016 6:33 PM, Jan Beulich wrote:

On 10.08.16 at 10:09,  wrote:

On 8/8/2016 11:40 PM, Jan Beulich wrote:

On 12.07.16 at 11:02,  wrote:

@@ -178,8 +179,34 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        struct hvm_ioreq_server *s;
+
+        if ( is_mmio )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+            p2m_type_t p2mt;
+
+            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            if ( p2mt == p2m_ioreq_server )
+            {
+                unsigned int flags;
+
+                if ( dir != IOREQ_WRITE )
+                    s = NULL;
+                else
+                {
+                    s = p2m_get_ioreq_server(currd, &flags);
+
+                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
+                        s = NULL;
+                }
+            }
+            else
+                s = hvm_select_ioreq_server(currd, &p);
+        }
+        else
+            s = hvm_select_ioreq_server(currd, &p);

Wouldn't it both be more natural and make the logic even easier
to follow if s got set to NULL up front, all the "else"-s dropped,
and a simple

    if ( !s )
        s = hvm_select_ioreq_server(currd, &p);

be done in the end?


Sorry, Jan. I tried to simplify the above code, but found the new code
is still not very clean, because in some cases s is supposed to stay
NULL instead of being set by hvm_select_ioreq_server().
To keep the same logic, the simplified code looks like this:

     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        struct hvm_ioreq_server *s = NULL;
+        p2m_type_t p2mt = p2m_invalid;
+
+        if ( is_mmio && dir == IOREQ_WRITE )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+
+            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            if ( p2mt == p2m_ioreq_server )
+            {
+                unsigned int flags;
+
+                s = p2m_get_ioreq_server(currd, &flags);
+                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
+                    s = NULL;
+            }
+        }
+
+        if ( !s && p2mt != p2m_ioreq_server )
+            s = hvm_select_ioreq_server(currd, &p);

         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )

As you can see, the definition of p2mt is moved outside the
if ( is_mmio ) check, and p2mt is checked against p2m_ioreq_server
before we search the ioreq server's rangeset in
hvm_select_ioreq_server(). So I am not quite satisfied with this
simplification. Any suggestions?

I think it's better than the code was before, but an implicit part of
my suggestion was that I'm not really convinced the
" && p2mt != p2m_ioreq_server" part of your new conditional is
really needed: Would it indeed be wrong to hand such a request
to the "normal" ioreq server, instead of terminating it right away?
(I guess that's a question to you as much as to Paul.)
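
The control flow under discussion can be reduced to a pure function, which makes the effect of the extra " && p2mt != p2m_ioreq_server" test easy to see. This is a model only: enum model_target and model_select() are hypothetical, TARGET_DEFAULT stands in for whatever hvm_select_ioreq_server() would return, and the keep_extra_check parameter is added here just to compare the two variants.

```c
#include <assert.h>
#include <stdbool.h>

enum model_target { TARGET_NONE, TARGET_P2M_SERVER, TARGET_DEFAULT };

/*
 * Model of the simplified hunk quoted above.
 *   is_mmio, is_write:  kind of access being emulated
 *   page_is_ioreq:      the page has type p2m_ioreq_server
 *   handles_write:      the claiming server asked for write accesses
 *   keep_extra_check:   keep the "&& p2mt != p2m_ioreq_server" test
 */
enum model_target model_select(bool is_mmio, bool is_write,
                               bool page_is_ioreq, bool handles_write,
                               bool keep_extra_check)
{
    enum model_target s = TARGET_NONE;
    bool saw_ioreq_type = false;  /* the type is only queried for MMIO writes */

    if (is_mmio && is_write) {
        saw_ioreq_type = page_is_ioreq;
        if (saw_ioreq_type && handles_write)
            s = TARGET_P2M_SERVER;
    }

    if (s == TARGET_NONE && !(keep_extra_check && saw_ioreq_type))
        s = TARGET_DEFAULT;       /* fall back to the normal lookup */

    return s;
}
```

In this model the only input affected by the extra test is a write to a p2m_ioreq_server page whose server does not handle writes: with the test it is terminated (TARGET_NONE), without it the request goes to the normal lookup. A read never queries the page type here, so it always falls through to the normal lookup.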



Thanks for your reply, Jan.
The " && p2mt != p2m_ioreq_server" condition is just there to guarantee
that if a write operation is trapped and, in the meantime, the device
model has changed the status of the ioreq server, the write is
discarded.


Hi Paul & Jan, any comments?


A second thought is, I am now more worried about the
" && dir == IOREQ_WRITE" condition, which we used previously to set s
to NULL if it is not a write operation.
However, if an HVM guest uses a read-modify-write instruction to
operate on a write-protected address, it will be treated as both a read
and a write access in ept_handle_violation(). In such a situation, we
need to emulate the read access first (by just returning the value being
fetched either in the hypervisor or in the device model), instead of
discarding the read access.




Any suggestions about this guest read-modify-write instruction situation?
Is my description clear? :)

Thanks
Yu




Re: [Xen-devel] [PATCH v2 1/3] livepach: Add .livepatch.hooks functions and test-case

2016-08-11 Thread Jan Beulich
>>> On 10.08.16 at 11:46,  wrote:
> Odd. I've tried this simple example:
> 
> typedef int fn_t(void);
> 
> struct s {
>   unsigned n;
>   fn_t**fn;
>   fn_t*const*fnc;
>   const fn_t**cfn;
> };
> 
> int test1(const struct s*ps) {
>   unsigned i;
>   int rc = 0;
> 
>   for(i = 0; !rc && i < ps->n; ++i)
>   rc = ps->fn[i]();
> 
>   return rc;
> }
> 
> int test2(const struct s*ps) {
>   unsigned i;
>   int rc = 0;
> 
>   for(i = 0; !rc && i < ps->n; ++i)
>   rc = ps->fnc[i]();
> 
>   return rc;
> }
> 
> int test3(const struct s*ps) {
>   unsigned i;
>   int rc = 0;
> 
>   for(i = 0; !rc && i < ps->n; ++i)
>   rc = ps->cfn[i]();
> 
>   return rc;
> }
> 
> test1() and test2() get compiled identically. test3(), using the field
> with the misplaced const, oddly enough gets compiled slightly
> differently (and without a warning despite one would seem
> warranted), yet the call doesn't get omitted. If, however, I change
> the return type of fn_t to void, the function body of test3() ends
> up empty, which is a compiler bug afaict, but which also suggests
> that you've tried the variant with the misplaced const.

FTR: This is not a compiler bug, as this is specifically named as
undefined behaviour in the C spec.
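
For contrast, a hook table with the const in the well-defined position looks like this. The sketch reuses the shape of the quoted example but all names (hook_fn, struct hooks, run_hooks() and the three sample hooks) are hypothetical. The C standard's section on type qualifiers states that a function type whose specification includes qualifiers has undefined behaviour, which is why only the pointer is qualified here.

```c
#include <assert.h>
#include <stddef.h>

typedef int hook_fn(void);

/*
 * fn points to const pointers to functions: the table entries cannot
 * be modified through this field, and the function type itself
 * carries no qualifier.
 */
struct hooks {
    size_t n;
    hook_fn *const *fn;
};

int hook_ok(void)    { return 0; }
int hook_fail(void)  { return -1; }
int hook_never(void) { return 42; }

/* Call each hook in order, stopping at the first non-zero result. */
int run_hooks(const struct hooks *h)
{
    int rc = 0;

    for (size_t i = 0; !rc && i < h->n; i++)
        rc = h->fn[i]();
    return rc;
}

hook_fn *const example_hooks[] = { hook_ok, hook_fail, hook_never };
```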

Jan




[Xen-devel] [qemu-mainline test] 100397: tolerable FAIL - PUSHED

2016-08-11 Thread osstest service owner
flight 100397 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/100397/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop             fail like 100379
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop            fail like 100379
 test-armhf-armhf-xl-rtds     15 guest-start/debian.repeat    fail  like 100379
 test-amd64-amd64-xl-rtds      9 debian-install               fail  like 100379

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel     11 guest-start               fail  never pass
 test-amd64-amd64-xl-pvh-amd       11 guest-start               fail  never pass
 test-amd64-i386-libvirt           12 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt          12 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt-xsm       12 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt-xsm      12 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt-vhd      11 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-cubietruck    12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-cubietruck    13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2       13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2       12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-multivcpu     13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu     12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-xsm           13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-xsm           12 migrate-support-check     fail  never pass
 test-armhf-armhf-libvirt-xsm      12 migrate-support-check     fail  never pass
 test-armhf-armhf-libvirt-xsm      14 guest-saverestore         fail  never pass
 test-armhf-armhf-libvirt          14 guest-saverestore         fail  never pass
 test-armhf-armhf-libvirt          12 migrate-support-check     fail  never pass
 test-armhf-armhf-libvirt-qcow2    11 migrate-support-check     fail  never pass
 test-armhf-armhf-libvirt-qcow2    13 guest-saverestore         fail  never pass
 test-armhf-armhf-xl-rtds          13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-rtds          12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl               12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl               13 saverestore-support-check fail  never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail  never pass
 test-armhf-armhf-libvirt-raw      13 guest-saverestore         fail  never pass
 test-armhf-armhf-libvirt-raw      11 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale       12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale       13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-vhd           11 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-vhd           12 saverestore-support-check fail  never pass

version targeted for testing:
 qemuu                4b3e5c06a15298d870e81c2d3a5a16dc2a93f5cc
baseline version:
 qemuu                2bb15bddf2607110820d5ce5aa43baac27292fb3

Last test of basis   100379  2016-08-10 04:09:54 Z    1 days
Testing same since   100397  2016-08-10 17:16:06 Z    0 days    1 attempts


People who touched revisions under test:
  Cornelia Huck 
  Cédric Le Goater 
  David Gibson 
  Gonglei 
  Laurent Vivier 
  Marc-André Lureau 
  Paolo Bonzini 
  Peter Maydell 
  Pranith Kumar 
  Radim Krčmář 
  Thomas Huth 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64

Re: [Xen-devel] [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2016-08-11 Thread Jan Beulich
>>> On 11.08.16 at 10:47,  wrote:
> On 8/10/2016 6:43 PM, Yu Zhang wrote:
>> For " && p2mt != p2m_ioreq_server" condition, it is just to guarantee
>> that if a write operation is trapped, and at the same period, device
>> model changed the status of ioreq server, it should be discarded.
> 
> Hi Paul & Jan, any comments?

Didn't Paul's "should behave like p2m_ram_rw" reply clarify things
sufficiently?

>> A second thought is, I am now more worried about the
>> " && dir == IOREQ_WRITE" condition, which we used previously to set s
>> to NULL if it is not a write operation. However, if HVM uses a
>> read-modify-write instruction to operate on a write-protected address,
>> it will be treated as both read and write accesses in
>> ept_handle_violation(). In such a situation, we need to emulate the
>> read access first (by just returning the value being fetched either in
>> hypervisor or in device model), instead of discarding the read access.
> 
> Any suggestions about this guest read-modify-write instruction situation?
> Is my depiction clear? :)

Well, from your earlier reply I concluded that you'd just go ahead
and put this into patch form, which we'd then look at.

Jan




Re: [Xen-devel] [PATCH v2 08/25] arm/altp2m: Add HVMOP_altp2m_set_domain_state.

2016-08-11 Thread Julien Grall

Hello Sergej,

On 06/08/2016 11:36, Sergej Proskurin wrote:

+
+/* Initialize the new altp2m view. */
+rc = p2m_init_one(d, p2m);
+if ( rc )
+goto err;
+
+/* Allocate a root table for the altp2m view. */
+rc = p2m_alloc_table(p2m);
+if ( rc )
+goto err;
+
+p2m->p2m_class = p2m_alternate;
+p2m->access_required = 1;


Please use true here. Although, I am not sure why you want to enable
the access by default.



Will do.

p2m->access_required is true by default in the x86 implementation. Also,
there is currently no way to manually set access_required on altp2m.
Besides, I do not see a scenario, where it makes sense to run altp2m
without access_required set to true.


Please add a comment in the code to explain it.

[...]




+
+/*
+ * The altp2m_active state has been deactivated. It is now safe to
+ * flush all altp2m views -- including altp2m[0].
+ */
+if ( ostate )
+altp2m_flush(d);


The function altp2m_flush is defined afterwards (in patch #9). Please
make sure that all the patches compile one by one.



The patches compile one by one. Please note that there is an
altp2m_flush stub inside of this patch.

+/* Flush all the alternate p2m's for a domain */
+static inline void altp2m_flush(struct domain *d)
+{
+/* Not yet implemented. */
+}


I don't want to see stubs that are replaced later on within the
same series. Patch #9 does not seem to depend on patch #8, so I
don't see any reason why you can't swap the 2 patches.


Regards,

--
Julien Grall



[Xen-devel] [PATCH v3 06/19] mini-os: let memory allocation fail if no free page available

2016-08-11 Thread Juergen Gross
Instead of panicking when no page can be allocated, fail the
memory allocation by returning NULL.

Signed-off-by: Juergen Gross 
Reviewed-by: Wei Liu 
Acked-by: Samuel Thibault 
---
V2: fixed minor style issue
---
 mm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/mm.c b/mm.c
index 263a356..8cf3210 100644
--- a/mm.c
+++ b/mm.c
@@ -335,6 +335,13 @@ void *sbrk(ptrdiff_t increment)
 
 if (new_brk > heap_mapped) {
 unsigned long n = (new_brk - heap_mapped + PAGE_SIZE - 1) / PAGE_SIZE;
+
+if ( n > nr_free_pages )
+{
+printk("Memory exhausted: want %ld pages, but only %ld are left\n",
+   n, nr_free_pages);
+return NULL;
+}
 do_map_zero(heap_mapped, n);
 heap_mapped += n * PAGE_SIZE;
 }
-- 
2.6.6
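
As a rough illustration of the check added above, the following standalone
sketch mimics the page rounding and the early failure. grow_heap(), the
PAGE_SIZE value and the starting counter values are assumptions made for this
example only; in Mini-OS the mapping is done by do_map_zero() and the counter
is maintained by the page allocator. The sketch prints the counters with %lu
because they are unsigned long.

```c
#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Stand-ins for the allocator state; the values are arbitrary. */
static unsigned long nr_free_pages = 4;
static unsigned long heap_mapped = 0;

/*
 * Hypothetical helper: extend the mapped heap up to new_brk.
 * Returns 0 on success and -1 if fewer free pages are left than needed.
 */
int grow_heap(unsigned long new_brk)
{
    if (new_brk > heap_mapped) {
        /* Round the missing byte range up to whole pages. */
        unsigned long n = (new_brk - heap_mapped + PAGE_SIZE - 1) / PAGE_SIZE;

        if (n > nr_free_pages) {
            printf("Memory exhausted: want %lu pages, but only %lu are left\n",
                   n, nr_free_pages);
            return -1;
        }

        /* Assumption: mapping n pages consumes n free pages. */
        nr_free_pages -= n;
        heap_mapped += n * PAGE_SIZE;
    }

    return 0;
}
```

A caller can then propagate the failure instead of crashing, which is the
behavior change this patch is after.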




[Xen-devel] [PATCH v3 03/19] mini-os: remove MM_DEBUG code

2016-08-11 Thread Juergen Gross
mm.c contains unused code inside #ifdef MM_DEBUG areas. Its usefulness
is rather questionable and some parts are even wrong (e.g.
print_chunks() called with nr_pages > 1000 will clobber arbitrary
stack content with a 0 byte).

Remove this code.

Signed-off-by: Juergen Gross 
Reviewed-by: Wei Liu 
Acked-by: Samuel Thibault 
---
 mm.c | 60 
 1 file changed, 60 deletions(-)

diff --git a/mm.c b/mm.c
index 31aaf83..0dd4862 100644
--- a/mm.c
+++ b/mm.c
@@ -42,13 +42,6 @@
 #include 
 #include 
 
-#ifdef MM_DEBUG
-#define DEBUG(_f, _a...) \
-printk("MINI_OS(file=mm.c, line=%d) " _f "\n", __LINE__, ## _a)
-#else
-#define DEBUG(_f, _a...)((void)0)
-#endif
-
 /*
  * ALLOCATION BITMAP
  *  One bit per page of memory. Bit set => page is allocated.
@@ -140,59 +133,6 @@ static chunk_head_t  free_tail[FREELIST_SIZE];
 #define round_pgdown(_p)  ((_p)&PAGE_MASK)
 #define round_pgup(_p)(((_p)+(PAGE_SIZE-1))&PAGE_MASK)
 
-#ifdef MM_DEBUG
-/*
- * Prints allocation[0/1] for @nr_pages, starting at @start
- * address (virtual).
- */
-USED static void print_allocation(void *start, int nr_pages)
-{
-unsigned long pfn_start = virt_to_pfn(start);
-int count;
-for(count = 0; count < nr_pages; count++)
-if(allocated_in_map(pfn_start + count)) printk("1");
-else printk("0");
-
-printk("\n");
-}
-
-/*
- * Prints chunks (making them with letters) for @nr_pages starting
- * at @start (virtual).
- */
-USED static void print_chunks(void *start, int nr_pages)
-{
-char chunks[1001], current='A';
-int order, count;
-chunk_head_t *head;
-unsigned long pfn_start = virt_to_pfn(start);
-   
-memset(chunks, (int)'_', 1000);
-if(nr_pages > 1000) 
-{
-DEBUG("Can only pring 1000 pages. Increase buffer size.");
-}
-
-for(order=0; order < FREELIST_SIZE; order++)
-{
-head = free_head[order];
-while(!FREELIST_EMPTY(head))
-{
-for(count = 0; count < 1UL<< head->level; count++)
-{
-if(count + virt_to_pfn(head) - pfn_start < 1000)
-chunks[count + virt_to_pfn(head) - pfn_start] = current;
-}
-head = head->next;
-current++;
-}
-}
-chunks[nr_pages] = '\0';
-printk("%s\n", chunks);
-}
-#endif
-
-
 /*
  * Initialise allocator, placing addresses [@min,@max] in free pool.
  * @min and @max are PHYSICAL addresses.
-- 
2.6.6
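
To make the print_chunks() problem from the commit message concrete: the
removed code writes the terminator at chunks[nr_pages] even when nr_pages
exceeds the 1000-entry buffer. A hypothetical bounded variant would clamp the
count first. fill_chunk_line() and MAX_PRINT_PAGES are names invented for
this sketch and are not Mini-OS code.

```c
#include <string.h>

#define MAX_PRINT_PAGES 1000

/*
 * Fill buf with one '_' marker per page and terminate it.
 * buf must hold MAX_PRINT_PAGES + 1 bytes. Returns the number of
 * pages actually represented after clamping.
 */
int fill_chunk_line(char *buf, int nr_pages)
{
    if (nr_pages < 0)
        nr_pages = 0;
    if (nr_pages > MAX_PRINT_PAGES)
        nr_pages = MAX_PRINT_PAGES;   /* never index past the buffer */

    memset(buf, '_', (size_t)nr_pages);
    buf[nr_pages] = '\0';

    return nr_pages;
}
```

Since the debug code is unused anyway, removing it as this patch does is the
simpler option; the sketch only shows what the missing bound looks like.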




[Xen-devel] [PATCH v3 07/19] mini-os: add ballooning config item

2016-08-11 Thread Juergen Gross
Add CONFIG_BALLOON defaulting to 'n' as a config item to Mini-OS.

Add balloon.c, balloon.h and arch/*/balloon.c for future use.

Signed-off-by: Juergen Gross 
Acked-by: Samuel Thibault 
---
V2: Added dummy sources and header
---
 Makefile   |  3 +++
 arch/arm/balloon.c | 28 
 arch/x86/balloon.c | 28 
 balloon.c  | 24 
 include/balloon.h  | 32 
 5 files changed, 115 insertions(+)
 create mode 100644 arch/arm/balloon.c
 create mode 100644 arch/x86/balloon.c
 create mode 100644 balloon.c
 create mode 100644 include/balloon.h

diff --git a/Makefile b/Makefile
index 2e4bdba..f5b7011 100644
--- a/Makefile
+++ b/Makefile
@@ -33,6 +33,7 @@ CONFIG_CONSFRONT ?= y
 CONFIG_XENBUS ?= y
 CONFIG_XC ?=y
 CONFIG_LWIP ?= $(lwip)
+CONFIG_BALLOON ?= n
 
 # Export config items as compiler directives
 flags-$(CONFIG_START_NETWORK) += -DCONFIG_START_NETWORK
@@ -48,6 +49,7 @@ flags-$(CONFIG_KBDFRONT) += -DCONFIG_KBDFRONT
 flags-$(CONFIG_FBFRONT) += -DCONFIG_FBFRONT
 flags-$(CONFIG_CONSFRONT) += -DCONFIG_CONSFRONT
 flags-$(CONFIG_XENBUS) += -DCONFIG_XENBUS
+flags-$(CONFIG_BALLOON) += -DCONFIG_BALLOON
 
 DEF_CFLAGS += $(flags-y)
 
@@ -96,6 +98,7 @@ src-$(CONFIG_NETFRONT) += netfront.c
 src-$(CONFIG_PCIFRONT) += pcifront.c
 src-y += sched.c
 src-$(CONFIG_TEST) += test.c
+src-$(CONFIG_BALLOON) += balloon.c
 
 src-y += lib/ctype.c
 src-y += lib/math.c
diff --git a/arch/arm/balloon.c b/arch/arm/balloon.c
new file mode 100644
index 0000000..28021d6
--- /dev/null
+++ b/arch/arm/balloon.c
@@ -0,0 +1,28 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ *
+ * (C) 2016 - Juergen Gross, SUSE Linux GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+
+#ifdef CONFIG_BALLOON
+
+#endif
diff --git a/arch/x86/balloon.c b/arch/x86/balloon.c
new file mode 100644
index 0000000..28021d6
--- /dev/null
+++ b/arch/x86/balloon.c
@@ -0,0 +1,28 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ *
+ * (C) 2016 - Juergen Gross, SUSE Linux GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+
+#ifdef CONFIG_BALLOON
+
+#endif
diff --git a/balloon.c b/balloon.c
new file mode 100644
index 0000000..f9cf23b
--- /dev/null
+++ b/balloon.c
@@ -0,0 +1,24 @@
+/* -*-  Mode:C; c-basic-offset:4; tab-width:4 -*-
+ *
+ * (C) 2016 - Juergen Gross, SUSE Linux GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice an

[Xen-devel] [PATCH v3 04/19] mini-os: add description of x86 memory usage

2016-08-11 Thread Juergen Gross
Add a brief description of what the physical and virtual address usage
looks like on x86 to include/x86/arch_mm.h.

Signed-off-by: Juergen Gross 
Reviewed-by: Wei Liu 
Acked-by: Samuel Thibault 
---
 include/x86/arch_mm.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
index 58f29fc..f756dab 100644
--- a/include/x86/arch_mm.h
+++ b/include/x86/arch_mm.h
@@ -36,6 +36,26 @@
 #endif
 #endif
 
+/*
+ * Physical address space usage:
+ *
+ * 0..._edata: kernel text/data
+ * *stack: kernel stack (thread 0)
+ * hypervisor allocated data: p2m_list, start_info page, xenstore page,
+ *     console page, initial page tables
+ * bitmap of allocated pages
+ * pages controlled by the page allocator
+ *
+ *
+ * Virtual address space usage:
+ *
+ * 1:1 mapping of physical memory starting at VA(0)
+ * 1 unallocated page
+ * demand map area (32 bits: 2 GB, 64 bits: 128 GB) for virtual allocations
+ * 1 unallocated page
+ * with libc: heap area (32 bits: 1 GB, 64 bits: 128 GB)
+ */
+
 #define L1_FRAME                1
 #define L2_FRAME                2
 #define L3_FRAME                3
-- 
2.6.6




[Xen-devel] [PATCH v3 00/19] mini-os: support of auto-ballooning

2016-08-11 Thread Juergen Gross
Support ballooning Mini-OS automatically up in case of memory shortage.

Do some cleanups, a small correction and add some basic features to
lay groundwork for support of ballooning in Mini-OS (patches 1-14).

The main visible change is the virtual memory layout: to be able to
add memory to the running Mini-OS we need to have some spare areas
especially after the 1:1 mapping of physical memory.

Then add the ballooning functionality: the p2m map must be expanded,
the page allocator's bitmap must be expanded and we must get new
memory from the hypervisor.

In case of a detected memory shortage the domain will balloon up until
either enough memory is available or the upper limit has been reached.
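
The balloon-up policy described above can be sketched roughly as follows.
This is only an illustration under assumptions: nr_mem_pages, fake_populate()
and balloon_up_until() are hypothetical names, the starting values are
arbitrary, and the real series asks the hypervisor for memory rather than a
stand-in function.

```c
/* Hypothetical state; the values are arbitrary. */
static unsigned long nr_mem_pages = 8;    /* pages the domain currently owns */
static unsigned long nr_max_pages = 12;   /* upper limit for the domain */
static unsigned long nr_free_pages = 1;   /* pages the allocator can hand out */

/* Stand-in for requesting pages from the hypervisor; grants everything. */
static unsigned long fake_populate(unsigned long want)
{
    return want;
}

/*
 * Balloon up until at least `needed` pages are free.
 * Returns 1 on success, 0 if the upper limit was reached first.
 */
int balloon_up_until(unsigned long needed)
{
    while (nr_free_pages < needed) {
        unsigned long room = nr_max_pages - nr_mem_pages;
        unsigned long want = needed - nr_free_pages;
        unsigned long got;

        if (room == 0)
            return 0;             /* upper limit reached */
        if (want > room)
            want = room;          /* never ask for more than is allowed */

        got = fake_populate(want);
        if (got == 0)
            return 0;             /* no more memory available */

        nr_mem_pages += got;
        nr_free_pages += got;
    }

    return 1;
}
```

The loop stops on either of the two conditions named in the text: enough
memory is available, or the maximum reservation has been reached.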

Ballooning has been tested with a xenstore stubdom.
Regression tests have been done with:
- pure mini-os
- ioemu stubdom
- pvgrub 64 bit

pvgrub 32 bit didn't work before applying the series, it just entered
the grub shell. With the series applied the behavior was exactly the
same. The grub shell however was working (I tried "help" and "reboot").

I tried to modify arm specific files in order not to break the
non-ballooning case, but I haven't tested it to either work or to
compile.

V1 of this series consisted of patches 1-9 only.

Changes in V3:
- some minor adjustments as requested by Samuel Thibault
- added patch 19

Changes in V2:
- added patches 10-18
- some coding style corrections
- patch 7: introduced balloon specific source files
- moved ballooning specific functions/definitions to balloon specific
  files
- patch 9: avoid conflict with hypervisor mapped area on 32 bits

Juergen Gross (19):
  mini-os: correct first free pfn
  mini-os: remove unused alloc_contig_pages() function
  mini-os: remove MM_DEBUG code
  mini-os: add description of x86 memory usage
  mini-os: add nr_free_pages counter
  mini-os: let memory allocation fail if no free page available
  mini-os: add ballooning config item
  mini-os: get maximum memory size from hypervisor
  mini-os: modify virtual memory layout for support of ballooning
  mini-os: remove unused mem_test() function
  mini-os: add checks for out of memory
  mini-os: don't allocate new pages for level 1 p2m tree
  mini-os: add function to map one frame
  mini-os: move p2m related macros to header file
  mini-os: remap p2m list in case of ballooning
  mini-os: map page allocator's bitmap to virtual kernel area for
    ballooning
  mini-os: add support for ballooning up
  mini-os: balloon up in case of oom
  mini-os: repair build system

 Config.mk |  93 +++
 Makefile  |  43 +--
 arch/arm/balloon.c|  39 +++
 arch/arm/mm.c |  10 +-
 arch/x86/Makefile |   3 -
 arch/x86/balloon.c| 147 +++
 arch/x86/mm.c | 314 +++---
 balloon.c | 160 +
 config/MiniOS.mk  |  10 --
 config/StdGNU.mk  |  47 
 config/arm32.mk   |  22 
 config/arm64.mk   |  19 ---
 config/x86_32.mk  |  20 
 config/x86_64.mk  |  33 --
 include/arm/arch_mm.h |   2 +
 include/balloon.h |  59 ++
 include/mm.h  |  13 ++-
 include/x86/arch_mm.h |  70 +++
 minios.mk |   4 +-
 mm.c  | 131 -
 20 files changed, 682 insertions(+), 557 deletions(-)
 create mode 100644 arch/arm/balloon.c
 create mode 100644 arch/x86/balloon.c
 create mode 100644 balloon.c
 delete mode 100644 config/MiniOS.mk
 delete mode 100644 config/StdGNU.mk
 delete mode 100644 config/arm32.mk
 delete mode 100644 config/arm64.mk
 delete mode 100644 config/x86_32.mk
 delete mode 100644 config/x86_64.mk
 create mode 100644 include/balloon.h

-- 
2.6.6




[Xen-devel] [PATCH v3 02/19] mini-os: remove unused alloc_contig_pages() function

2016-08-11 Thread Juergen Gross
alloc_contig_pages() is never used anywhere in mini-os. Remove it.

Signed-off-by: Juergen Gross 
Reviewed-by: Wei Liu 
Acked-by: Samuel Thibault 
---
 arch/x86/mm.c | 142 --
 include/mm.h  |   1 -
 2 files changed, 143 deletions(-)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index ae1036e..c59a5d3 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -652,148 +652,6 @@ int unmap_frames(unsigned long va, unsigned long num_frames)
 }
 
 /*
- * Allocate pages which are contiguous in machine memory.
- * Returns a VA to where they are mapped or 0 on failure.
- * 
- * addr_bits indicates if the region has restrictions on where it is
- * located. Typical values are 32 (if for example PCI devices can't access
- * 64bit memory) or 0 for no restrictions.
- *
- * Allocated pages can be freed using the page allocators free_pages() 
- * function.
- *
- * based on Linux function xen_create_contiguous_region()
- */
-#define MAX_CONTIG_ORDER 9 /* 2MB */
-unsigned long alloc_contig_pages(int order, unsigned int addr_bits)
-{
-unsigned long in_va, va;
-unsigned long in_frames[1UL << order], out_frames, mfn;
-multicall_entry_t call[1UL << order];
-unsigned int i, num_pages = 1UL << order;
-int ret, exch_success;
-
-/* pass in num_pages 'extends' of size 1 and
- * request 1 extend of size 'order */
-struct xen_memory_exchange exchange = {
-.in = {
-.nr_extents   = num_pages,
-.extent_order = 0,
-.domid= DOMID_SELF
-},
-.out = {
-.nr_extents   = 1,
-.extent_order = order,
-.address_bits = addr_bits,
-.domid= DOMID_SELF
-},
-.nr_exchanged = 0
-};
-
-if ( order > MAX_CONTIG_ORDER )
-{
-printk("alloc_contig_pages: order too large 0x%x > 0x%x\n",
-   order, MAX_CONTIG_ORDER);
-return 0;
-}
-
-/* Allocate some potentially discontiguous pages */
-in_va = alloc_pages(order);
-if ( !in_va )
-{
-printk("alloc_contig_pages: could not get enough pages (order=0x%x\n",
-   order);
-return 0;
-}
-
-/* set up arguments for exchange hyper call */
-set_xen_guest_handle(exchange.in.extent_start, in_frames);
-set_xen_guest_handle(exchange.out.extent_start, &out_frames);
-
-/* unmap current frames, keep a list of MFNs */
-for ( i = 0; i < num_pages; i++ )
-{
-int arg = 0;
-
-va = in_va + (PAGE_SIZE * i);
-in_frames[i] = virt_to_mfn(va);
-
-/* update P2M mapping */
-phys_to_machine_mapping[virt_to_pfn(va)] = INVALID_P2M_ENTRY;
-
-/* build multi call */
-call[i].op = __HYPERVISOR_update_va_mapping;
-call[i].args[arg++] = va;
-call[i].args[arg++] = 0;
-#ifdef __i386__
-call[i].args[arg++] = 0;
-#endif  
-call[i].args[arg++] = UVMF_INVLPG;
-}
-
-ret = HYPERVISOR_multicall(call, i);
-if ( ret )
-{
-printk("Odd, update_va_mapping hypercall failed with rc=%d.\n", ret);
-return 0;
-}
-
-/* try getting a contig range of MFNs */
-out_frames = virt_to_pfn(in_va); /* PFNs to populate */
-ret = HYPERVISOR_memory_op(XENMEM_exchange, &exchange);
-if ( ret ) {
-printk("mem exchanged order=0x%x failed with rc=%d, nr_exchanged=%lu\n",
-   order, ret, exchange.nr_exchanged);
-/* we still need to return the allocated pages above to the pool
- * ie. map them back into the 1:1 mapping etc. so we continue but 
- * in the end return the pages to the page allocator and return 0. */
-exch_success = 0;
-}
-else
-exch_success = 1;
-
-/* map frames into 1:1 and update p2m */
-for ( i = 0; i < num_pages; i++ )
-{
-int arg = 0;
-pte_t pte;
-
-va = in_va + (PAGE_SIZE * i);
-mfn = i < exchange.nr_exchanged ? (out_frames + i) : in_frames[i];
-pte = __pte(mfn << PAGE_SHIFT | L1_PROT);
-
-/* update P2M mapping */
-phys_to_machine_mapping[virt_to_pfn(va)] = mfn;
-
-/* build multi call */
-call[i].op = __HYPERVISOR_update_va_mapping;
-call[i].args[arg++] = va;
-#ifdef __x86_64__
-call[i].args[arg++] = (pgentry_t)pte.pte;
-#else
-call[i].args[arg++] = pte.pte_low;
-call[i].args[arg++] = pte.pte_high;
-#endif  
-call[i].args[arg++] = UVMF_INVLPG;
-}
-ret = HYPERVISOR_multicall(call, i);
-if ( ret )
-{
-printk("update_va_mapping hypercall no. 2 failed with rc=%d.\n", ret);
-return 0;
-}
-
-if ( !exch_success )
-{
-/* since the exchanged failed we just free the pages as well */
-free_pages((void *) in_va, order);
-return 0;
-}
-
-return in_va;
-}
-
-/*
  * Clear some of the bootstrap memory
  */
 static void clear_bootstrap(void)
diff --git a/incl

[Xen-devel] [PATCH v3 05/19] mini-os: add nr_free_pages counter

2016-08-11 Thread Juergen Gross
Add a variable holding the number of available memory pages. This will
aid auto-ballooning later.

Signed-off-by: Juergen Gross 
Reviewed-by: Wei Liu 
Acked-by: Samuel Thibault 
---
 include/mm.h | 1 +
 mm.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/include/mm.h b/include/mm.h
index a48f485..b97b43e 100644
--- a/include/mm.h
+++ b/include/mm.h
@@ -42,6 +42,7 @@
 #define STACK_SIZE_PAGE_ORDER __STACK_SIZE_PAGE_ORDER
 #define STACK_SIZE __STACK_SIZE
 
+extern unsigned long nr_free_pages;
 
 void init_mm(void);
 unsigned long alloc_pages(int order);
diff --git a/mm.c b/mm.c
index 0dd4862..263a356 100644
--- a/mm.c
+++ b/mm.c
@@ -53,6 +53,8 @@ static unsigned long *alloc_bitmap;
 #define allocated_in_map(_pn) \
 (alloc_bitmap[(_pn)/PAGES_PER_MAPWORD] & (1UL<<((_pn)&(PAGES_PER_MAPWORD-1))))
 
+unsigned long nr_free_pages;
+
 /*
  * Hint regarding bitwise arithmetic in map_{alloc,free}:
  *  -(1<<n)  sets all bits >= n. 
@@ -81,6 +83,8 @@ static void map_alloc(unsigned long first_page, unsigned long nr_pages)
 while ( ++curr_idx < end_idx ) alloc_bitmap[curr_idx] = ~0UL;
 alloc_bitmap[curr_idx] |= (1UL

[Xen-devel] [PATCH v3 08/19] mini-os: get maximum memory size from hypervisor

2016-08-11 Thread Juergen Gross
Add support for obtaining the maximum memory size from the hypervisor.
This will make it possible to support ballooning.

Signed-off-by: Juergen Gross 
Acked-by: Samuel Thibault 
---
V2: Moved new stuff to balloon.c
---
 balloon.c | 22 ++
 include/balloon.h |  6 ++
 mm.c  |  2 ++
 3 files changed, 30 insertions(+)

diff --git a/balloon.c b/balloon.c
index f9cf23b..1ec113d 100644
--- a/balloon.c
+++ b/balloon.c
@@ -21,4 +21,26 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#include 
 #include 
+#include 
+#include 
+#include 
+
+unsigned long nr_max_pages;
+
+void get_max_pages(void)
+{
+long ret;
+domid_t domid = DOMID_SELF;
+
+ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);
+if ( ret < 0 )
+{
+printk("Could not get maximum pfn\n");
+return;
+}
+
+nr_max_pages = ret;
+printk("Maximum memory size: %ld pages\n", nr_max_pages);
+}
diff --git a/include/balloon.h b/include/balloon.h
index 9756a3f..cd79017 100644
--- a/include/balloon.h
+++ b/include/balloon.h
@@ -26,7 +26,13 @@
 
 #ifdef CONFIG_BALLOON
 
+extern unsigned long nr_max_pages;
+
+void get_max_pages(void);
+
 #else /* CONFIG_BALLOON */
 
+static inline void get_max_pages(void) { }
+
 #endif /* CONFIG_BALLOON */
 #endif /* _BALLOON_H_ */
diff --git a/mm.c b/mm.c
index 8cf3210..6d82f2a 100644
--- a/mm.c
+++ b/mm.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -361,6 +362,7 @@ void init_mm(void)
 
 printk("MM: Init\n");
 
+get_max_pages();
 arch_init_mm(&start_pfn, &max_pfn);
 /*
  * now we can initialise the page allocator
-- 
2.6.6




[Xen-devel] [PATCH v3 10/19] mini-os: remove unused mem_test() function

2016-08-11 Thread Juergen Gross
mem_test() isn't used anywhere and its value is rather questionable
with mini-os being in a mature state. Remove the function.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 arch/x86/mm.c | 55 ---
 1 file changed, 55 deletions(-)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 6aa4468..e2f026b 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -302,61 +302,6 @@ static void set_readonly(void *text, void *etext)
 }
 
 /*
- * A useful mem testing function. Write the address to every address in the
- * range provided and read back the value. If verbose, print page walk to
- * some VA
- * 
- * If we get MEM_TEST_MAX_ERRORS we might as well stop
- */
-#define MEM_TEST_MAX_ERRORS 10 
-int mem_test(unsigned long *start_va, unsigned long *end_va, int verbose)
-{
-unsigned long mask = 0x1;
-unsigned long *pointer;
-int error_count = 0;
- 
-/* write values and print page walks */
-if ( verbose && (((unsigned long)start_va) & 0xf) )
-{
-printk("MemTest Start: 0x%p\n", start_va);
-page_walk((unsigned long)start_va);
-}
-for ( pointer = start_va; pointer < end_va; pointer++ )
-{
-if ( verbose && !(((unsigned long)pointer) & 0xf) )
-{
-printk("Writing to %p\n", pointer);
-page_walk((unsigned long)pointer);
-}
-*pointer = (unsigned long)pointer & ~mask;
-}
-if ( verbose && (((unsigned long)end_va) & 0xf) )
-{
-printk("MemTest End: %p\n", end_va-1);
-page_walk((unsigned long)end_va-1);
-}
- 
-/* verify values */
-for ( pointer = start_va; pointer < end_va; pointer++ )
-{
-if ( ((unsigned long)pointer & ~mask) != *pointer )
-{
-printk("Read error at 0x%lx. Read: 0x%lx, should read 0x%lx\n",
-   (unsigned long)pointer, *pointer, 
-   ((unsigned long)pointer & ~mask));
-error_count++;
-if ( error_count >= MEM_TEST_MAX_ERRORS )
-{
-printk("mem_test: too many errors\n");
-return -1;
-}
-}
-}
-return 0;
-}
-
-
-/*
  * get the PTE for virtual address va if it exists. Otherwise NULL.
  */
 static pgentry_t *get_pgt(unsigned long va)
-- 
2.6.6




[Xen-devel] [PATCH v3 09/19] mini-os: modify virtual memory layout for support of ballooning

2016-08-11 Thread Juergen Gross
In order to be able to support ballooning, the virtual memory layout
of Mini-OS has to be modified: instead of a (nearly) consecutive
area used for physical memory mapping, on-demand mappings, and heap,
we need enough spare space for adding new memory.

So instead of dynamically placing the different regions based on the
memory size found, locate them statically at fixed virtual addresses:

area   x86-64   x86-32

mapped physical memory    
kernel virtual mappings80 3f00
demand mappings  1000 4000
heap 2000 b000

This will enable Mini-OS to support up to 512GB of domain memory with
a 64 bit kernel and nearly 1GB with a 32 bit kernel.

For a 32 bit Mini-OS we have to avoid a conflict between heap and
m2p table which the hypervisor maps at f560. So the demand mapping
size is reduced by 256MB in order to keep the heap at about 1GB.

The kernel virtual mappings are a new area needed for being able to
grow the p2m list without having to relocate it in physical memory.

Modify the placement of the demand mappings and heap and adjust the
memory layout description.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2: avoid conflict with hypervisor mapped area on 32 bits
---
 arch/arm/mm.c |  2 +-
 arch/x86/mm.c | 44 +++-
 include/mm.h  |  2 +-
 include/x86/arch_mm.h | 35 ++-
 mm.c  |  2 +-
 5 files changed, 44 insertions(+), 41 deletions(-)

diff --git a/arch/arm/mm.c b/arch/arm/mm.c
index efecc51..f75888d 100644
--- a/arch/arm/mm.c
+++ b/arch/arm/mm.c
@@ -75,7 +75,7 @@ void arch_init_p2m(unsigned long max_pfn)
 {
 }
 
-void arch_init_demand_mapping_area(unsigned long cur_pfn)
+void arch_init_demand_mapping_area(void)
 {
 }
 
diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index c59a5d3..6aa4468 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -442,37 +442,21 @@ pgentry_t *need_pgt(unsigned long va)
  * Reserve an area of virtual address space for mappings and Heap
  */
 static unsigned long demand_map_area_start;
-#ifdef __x86_64__
-#define DEMAND_MAP_PAGES ((128ULL << 30) / PAGE_SIZE)
-#else
-#define DEMAND_MAP_PAGES ((2ULL << 30) / PAGE_SIZE)
-#endif
-
-#ifndef HAVE_LIBC
-#define HEAP_PAGES 0
-#else
+static unsigned long demand_map_area_end;
+#ifdef HAVE_LIBC
 unsigned long heap, brk, heap_mapped, heap_end;
-#ifdef __x86_64__
-#define HEAP_PAGES ((128ULL << 30) / PAGE_SIZE)
-#else
-#define HEAP_PAGES ((1ULL << 30) / PAGE_SIZE)
-#endif
 #endif
 
-void arch_init_demand_mapping_area(unsigned long cur_pfn)
+void arch_init_demand_mapping_area(void)
 {
-cur_pfn++;
-
-demand_map_area_start = (unsigned long) pfn_to_virt(cur_pfn);
-cur_pfn += DEMAND_MAP_PAGES;
-printk("Demand map pfns at %lx-%p.\n", 
-   demand_map_area_start, pfn_to_virt(cur_pfn));
+demand_map_area_start = VIRT_DEMAND_AREA;
+demand_map_area_end = demand_map_area_start + DEMAND_MAP_PAGES * PAGE_SIZE;
+printk("Demand map pfns at %lx-%lx.\n", demand_map_area_start,
+   demand_map_area_end);
 
 #ifdef HAVE_LIBC
-cur_pfn++;
-heap_mapped = brk = heap = (unsigned long) pfn_to_virt(cur_pfn);
-cur_pfn += HEAP_PAGES;
-heap_end = (unsigned long) pfn_to_virt(cur_pfn);
+heap_mapped = brk = heap = VIRT_HEAP_AREA;
+heap_end = heap_mapped + HEAP_PAGES * PAGE_SIZE;
 printk("Heap resides at %lx-%lx.\n", brk, heap_end);
 #endif
 }
@@ -729,14 +713,8 @@ void arch_init_mm(unsigned long* start_pfn_p, unsigned long* max_pfn_p)
 start_pfn = PFN_UP(to_phys(start_info.pt_base)) + start_info.nr_pt_frames;
 max_pfn = start_info.nr_pages;
 
-/* We need room for demand mapping and heap, clip available memory */
-#if defined(__i386__)
-{
-unsigned long virt_pfns = 1 + DEMAND_MAP_PAGES + 1 + HEAP_PAGES;
-if (max_pfn + virt_pfns >= 0x10)
-max_pfn = 0x10 - virt_pfns - 1;
-}
-#endif
+if ( max_pfn >= MAX_MEM_SIZE / PAGE_SIZE )
+max_pfn = MAX_MEM_SIZE / PAGE_SIZE - 1;
 
 printk("  start_pfn: %lx\n", start_pfn);
 printk("max_pfn: %lx\n", max_pfn);
diff --git a/include/mm.h b/include/mm.h
index b97b43e..a22dcd1 100644
--- a/include/mm.h
+++ b/include/mm.h
@@ -59,7 +59,7 @@ static __inline__ int get_order(unsigned long size)
 return order;
 }
 
-void arch_init_demand_mapping_area(unsigned long max_pfn);
+void arch_init_demand_mapping_area(void);
 void arch_init_mm(unsigned long* start_pfn_p, unsigned long* max_pfn_p);
 void arch_init_p2m(unsigned long max_pfn_p);
 
diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
index f756dab..d87fe55 100644
--- a/include/x86/arch_mm.h
+++ b/include/x86/arch_mm.h
@@ -49,11 +49,13 @@
  *
  * Virtual address space usage:
  *
- * 1:1 mapping of physical

[Xen-devel] [PATCH v3 15/19] mini-os: remap p2m list in case of ballooning

2016-08-11 Thread Juergen Gross
When ballooning is enabled we must be prepared for a growing p2m
list. If the maximum memory size of the domain can't be covered by the
current p2m list, remap the list to the kernel virtual mapping area and
leave enough space at its end for it to grow.
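
The size check behind that decision can be sketched as follows. This is a minimal illustration, assuming the x86-64 value of 512 p2m entries per page; p2m_needs_remap() is a hypothetical helper that is not part of the patch, while p2m_pages() follows the rounding used by the helper added in include/x86/arch_mm.h.

```c
#include <stdbool.h>

/* Assumed x86-64 value: 512 p2m entries fit into one 4 KiB page. */
#define P2M_SHIFT   9
#define P2M_ENTRIES (1UL << P2M_SHIFT)

/* Number of p2m pages needed to cover 'pages' pfns, rounded up. */
static unsigned long p2m_pages(unsigned long pages)
{
    return (pages + P2M_ENTRIES - 1) >> P2M_SHIFT;
}

/*
 * Hypothetical helper: the list has to be remapped only if the maximum
 * domain size needs more p2m pages than the current size does.
 */
static bool p2m_needs_remap(unsigned long max_pfn, unsigned long nr_max_pages)
{
    return p2m_pages(nr_max_pages) > p2m_pages(max_pfn);
}
```

With these assumed values a domain of 1000 pages that may grow to 1024 pages needs no remap, as both sizes fit into two p2m pages.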

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V3: add assertion as requested by Samuel Thibault
---
 arch/arm/balloon.c|  2 ++
 arch/x86/balloon.c| 25 +
 arch/x86/mm.c |  3 +++
 include/balloon.h |  3 +++
 include/x86/arch_mm.h |  4 
 5 files changed, 37 insertions(+)

diff --git a/arch/arm/balloon.c b/arch/arm/balloon.c
index 28021d6..549e51b 100644
--- a/arch/arm/balloon.c
+++ b/arch/arm/balloon.c
@@ -25,4 +25,6 @@
 
 #ifdef CONFIG_BALLOON
 
+unsigned long virt_kernel_area_end;   /* TODO: find a virtual area */
+
 #endif
diff --git a/arch/x86/balloon.c b/arch/x86/balloon.c
index 28021d6..a7f20e4 100644
--- a/arch/x86/balloon.c
+++ b/arch/x86/balloon.c
@@ -21,8 +21,33 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#include 
 #include 
+#include 
+#include 
 
 #ifdef CONFIG_BALLOON
 
+unsigned long virt_kernel_area_end = VIRT_KERNEL_AREA;
+
+void arch_remap_p2m(unsigned long max_pfn)
+{
+unsigned long pfn;
+
+if ( p2m_pages(nr_max_pages) <= p2m_pages(max_pfn) )
+return;
+
+for ( pfn = 0; pfn < max_pfn; pfn += P2M_ENTRIES )
+{
+map_frame_rw(virt_kernel_area_end + PAGE_SIZE * (pfn / P2M_ENTRIES),
+ virt_to_mfn(phys_to_machine_mapping + pfn));
+}
+
+phys_to_machine_mapping = (unsigned long *)virt_kernel_area_end;
+printk("remapped p2m list to %p\n", phys_to_machine_mapping);
+
+virt_kernel_area_end += PAGE_SIZE * p2m_pages(nr_max_pages);
+ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
+}
+
 #endif
diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index a5c8959..8fa3b4c 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -626,6 +627,8 @@ void arch_init_p2m(unsigned long max_pfn)
 HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list = 
 virt_to_mfn(l3_list);
 HYPERVISOR_shared_info->arch.max_pfn = max_pfn;
+
+arch_remap_p2m(max_pfn);
 }
 
 void arch_init_mm(unsigned long* start_pfn_p, unsigned long* max_pfn_p)
diff --git a/include/balloon.h b/include/balloon.h
index cd79017..b0d0ebf 100644
--- a/include/balloon.h
+++ b/include/balloon.h
@@ -27,12 +27,15 @@
 #ifdef CONFIG_BALLOON
 
 extern unsigned long nr_max_pages;
+extern unsigned long virt_kernel_area_end;
 
 void get_max_pages(void);
+void arch_remap_p2m(unsigned long max_pfn);
 
 #else /* CONFIG_BALLOON */
 
 static inline void get_max_pages(void) { }
+static inline void arch_remap_p2m(unsigned long max_pfn) { }
 
 #endif /* CONFIG_BALLOON */
 #endif /* _BALLOON_H_ */
diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
index 7283f64..e5d9c57 100644
--- a/include/x86/arch_mm.h
+++ b/include/x86/arch_mm.h
@@ -198,6 +198,10 @@ static inline void p2m_chk_pfn(unsigned long pfn)
 do_exit();
 }
 }
+static inline unsigned long p2m_pages(unsigned long pages)
+{
+return (pages + P2M_ENTRIES - 1) >> L1_P2M_SHIFT;
+}
 
 #include "arch_limits.h"
 #define PAGE_SIZE   __PAGE_SIZE
-- 
2.6.6


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 11/19] mini-os: add checks for out of memory

2016-08-11 Thread Juergen Gross
There are several core functions in Mini-OS not checking for failed
memory allocations. Add such checks.

Add a dummy do_map_frames() function to the arm architecture, as it
will be needed in future for compilation to succeed.
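
The error propagation pattern can be sketched in isolation as below. This is only an illustration under assumptions: the page pool, pool_reset(), fake_alloc_page(), map_range() and map_or_null() are hypothetical stand-ins, not Mini-OS functions. The lowest level reports a failed allocation, the middle level turns it into -ENOMEM, and the outer wrapper returns NULL.

```c
#include <errno.h>
#include <stddef.h>

#define POOL_PAGES 4

static unsigned long pool[POOL_PAGES][8];   /* hypothetical page pool */
static int pages_left;

/* Hypothetical: make 'n' pages (at most POOL_PAGES) available. */
static void pool_reset(int n)
{
    pages_left = n > POOL_PAGES ? POOL_PAGES : n;
}

/* Hypothetical allocator: returns NULL once the pool is exhausted. */
static void *fake_alloc_page(void)
{
    if ( pages_left <= 0 )
        return NULL;
    return pool[--pages_left];
}

/* Middle level: needs one page per step, reports failure as -ENOMEM. */
static int map_range(unsigned long n)
{
    unsigned long i;

    for ( i = 0; i < n; i++ )
        if ( !fake_alloc_page() )
            return -ENOMEM;

    return 0;
}

/* Outer wrapper: hands back the address only if the mapping succeeded. */
static void *map_or_null(void *va, unsigned long n)
{
    if ( map_range(n) )
        return NULL;

    return va;
}
```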

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 arch/arm/mm.c |  8 
 arch/x86/mm.c | 26 +++---
 include/mm.h  |  2 +-
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mm.c b/arch/arm/mm.c
index f75888d..fc8d4bc 100644
--- a/arch/arm/mm.c
+++ b/arch/arm/mm.c
@@ -1,6 +1,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -79,6 +80,13 @@ void arch_init_demand_mapping_area(void)
 {
 }
 
+int do_map_frames(unsigned long addr,
+const unsigned long *f, unsigned long n, unsigned long stride,
+unsigned long increment, domid_t id, int *err, unsigned long prot)
+{
+return -ENOSYS;
+}
+
 /* Get Xen's suggested physical page assignments for the grant table. */
 static paddr_t get_gnttab_base(void)
 {
diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index e2f026b..12f7fe4 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -34,6 +34,7 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -354,6 +355,8 @@ pgentry_t *need_pgt(unsigned long va)
 if ( !(tab[offset] & _PAGE_PRESENT) )
 {
 pt_pfn = virt_to_pfn(alloc_page());
+if ( !pt_pfn )
+return NULL;
 new_pt_frame(&pt_pfn, pt_mfn, offset, L3_FRAME);
 }
 ASSERT(tab[offset] & _PAGE_PRESENT);
@@ -364,6 +367,8 @@ pgentry_t *need_pgt(unsigned long va)
 if ( !(tab[offset] & _PAGE_PRESENT) ) 
 {
 pt_pfn = virt_to_pfn(alloc_page());
+if ( !pt_pfn )
+return NULL;
 new_pt_frame(&pt_pfn, pt_mfn, offset, L2_FRAME);
 }
 ASSERT(tab[offset] & _PAGE_PRESENT);
@@ -373,6 +378,8 @@ pgentry_t *need_pgt(unsigned long va)
 if ( !(tab[offset] & _PAGE_PRESENT) )
 {
 pt_pfn = virt_to_pfn(alloc_page());
+if ( !pt_pfn )
+return NULL;
 new_pt_frame(&pt_pfn, pt_mfn, offset, L1_FRAME);
 }
 ASSERT(tab[offset] & _PAGE_PRESENT);
@@ -445,10 +452,10 @@ unsigned long allocate_ondemand(unsigned long n, unsigned long alignment)
  * va. map f[i*stride]+i*increment for i in 0..n-1.
  */
 #define MAP_BATCH ((STACK_SIZE / 2) / sizeof(mmu_update_t))
-void do_map_frames(unsigned long va,
-   const unsigned long *mfns, unsigned long n, 
-   unsigned long stride, unsigned long incr, 
-   domid_t id, int *err, unsigned long prot)
+int do_map_frames(unsigned long va,
+  const unsigned long *mfns, unsigned long n,
+  unsigned long stride, unsigned long incr,
+  domid_t id, int *err, unsigned long prot)
 {
 pgentry_t *pgt = NULL;
 unsigned long done = 0;
@@ -458,7 +465,7 @@ void do_map_frames(unsigned long va,
 if ( !mfns ) 
 {
 printk("do_map_frames: no mfns supplied\n");
-return;
+return -EINVAL;
 }
 DEBUG("va=%p n=0x%lx, mfns[0]=0x%lx stride=0x%lx incr=0x%lx prot=0x%lx\n",
   va, n, mfns[0], stride, incr, prot);
@@ -484,7 +491,9 @@ void do_map_frames(unsigned long va,
 {
 if ( !pgt || !(va & L1_MASK) )
 pgt = need_pgt(va);
-
+if ( !pgt )
+return -ENOMEM;
+
 mmu_updates[i].ptr = virt_to_mach(pgt) | MMU_NORMAL_PT_UPDATE;
 mmu_updates[i].val = ((pgentry_t)(mfns[(done + i) * stride] +
   (done + i) * incr)
@@ -505,6 +514,8 @@ void do_map_frames(unsigned long va,
 }
 done += todo;
 }
+
+return 0;
 }
 
 /*
@@ -521,7 +532,8 @@ void *map_frames_ex(const unsigned long *mfns, unsigned long n,
 if ( !va )
 return NULL;
 
-do_map_frames(va, mfns, n, stride, incr, id, err, prot);
+if ( do_map_frames(va, mfns, n, stride, incr, id, err, prot) )
+return NULL;
 
 return (void *)va;
 }
diff --git a/include/mm.h b/include/mm.h
index a22dcd1..9244e26 100644
--- a/include/mm.h
+++ b/include/mm.h
@@ -68,7 +68,7 @@ unsigned long allocate_ondemand(unsigned long n, unsigned long alignment);
 void *map_frames_ex(const unsigned long *f, unsigned long n, unsigned long stride,
unsigned long increment, unsigned long alignment, domid_t id,
int *err, unsigned long prot);
-void do_map_frames(unsigned long addr,
+int do_map_frames(unsigned long addr,
 const unsigned long *f, unsigned long n, unsigned long stride,
unsigned long increment, domid_t id, int *err, unsigned long prot);
 int unmap_frames(unsigned long va, unsigned long num_frames);
-- 
2.6.6




[Xen-devel] [PATCH v3 18/19] mini-os: balloon up in case of oom

2016-08-11 Thread Juergen Gross
If a memory shortage is detected, balloon up.

Be careful to always leave some pages free as ballooning up might need
some memory, too:

- new p2m frames
- page tables for addressing new p2m frame
- new frame for page allocation bitmap
- page table for addressing new page allocation bitmap frame
- page tables for addressing new 1:1 mapped frames

For the moment we only balloon up synchronously when a memory shortage
is detected in allocation routines with interrupts enabled.
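
The reserve logic can be modelled outside of Mini-OS roughly as follows. This is a simplified sketch: free_pages, host_pages, fake_balloon_up() and check_free() are hypothetical stand-ins, the interrupt check is left out, and it is assumed that the balloon function returns the number of pages it could add (0 meaning failure).

```c
#define EMERGENCY_PAGES 64UL    /* mirrors BALLOON_EMERGENCY_PAGES */

static unsigned long free_pages;    /* stand-in for nr_free_pages */
static unsigned long host_pages;    /* hypothetical: pages the host can give */
static int ballooning;              /* guard against recursive ballooning */

/* Hypothetical balloon function: returns the number of pages added. */
static unsigned long fake_balloon_up(unsigned long n)
{
    unsigned long got = n < host_pages ? n : host_pages;

    host_pages -= got;
    free_pages += got;

    return got;
}

/* Returns nonzero if 'needed' pages can be allocated. */
static int check_free(unsigned long needed)
{
    /* Plenty of space: the emergency reserve stays untouched. */
    if ( needed + EMERGENCY_PAGES <= free_pages )
        return 1;

    /* Already ballooning: let the allocation use the reserve. */
    if ( ballooning )
        return 1;

    ballooning = 1;
    while ( needed + EMERGENCY_PAGES > free_pages )
    {
        if ( !fake_balloon_up(needed + EMERGENCY_PAGES - free_pages) )
            break;
    }
    ballooning = 0;

    /* If ballooning failed the reserve may still satisfy the request. */
    return needed <= free_pages;
}
```

In this model a request that cannot be covered even by the reserve fails, while a request that fits only by using the reserve still succeeds.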

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V3: Reverse chk_free_pages() return value as requested by Samuel Thibault
---
 balloon.c | 32 
 include/balloon.h | 11 +++
 mm.c  |  4 +++-
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/balloon.c b/balloon.c
index 07ef532..afec757 100644
--- a/balloon.c
+++ b/balloon.c
@@ -126,3 +126,35 @@ int balloon_up(unsigned long n_pages)
 
 return rc;
 }
+
+static int in_balloon;
+
+int chk_free_pages(unsigned long needed)
+{
+unsigned long n_pages;
+
+/* No need for ballooning if plenty of space available. */
+if ( needed + BALLOON_EMERGENCY_PAGES <= nr_free_pages )
+return 1;
+
+/* If we are already ballooning up just hope for the best. */
+if ( in_balloon )
+return 1;
+
+/* Interrupts disabled can't be handled right now. */
+if ( irqs_disabled() )
+return 1;
+
+in_balloon = 1;
+
+while ( needed + BALLOON_EMERGENCY_PAGES > nr_free_pages )
+{
+n_pages = needed + BALLOON_EMERGENCY_PAGES - nr_free_pages;
+if ( !balloon_up(n_pages) )
+break;
+}
+
+in_balloon = 0;
+
+return needed <= nr_free_pages;
+}
diff --git a/include/balloon.h b/include/balloon.h
index 5ec1bbb..d8710ad 100644
--- a/include/balloon.h
+++ b/include/balloon.h
@@ -26,6 +26,12 @@
 
 #ifdef CONFIG_BALLOON
 
+/*
+ * Always keep some pages free for allocations while ballooning or
+ * interrupts disabled.
+ */
+#define BALLOON_EMERGENCY_PAGES   64
+
 extern unsigned long nr_max_pages;
 extern unsigned long virt_kernel_area_end;
 extern unsigned long nr_mem_pages;
@@ -37,12 +43,17 @@ void arch_remap_p2m(unsigned long max_pfn);
 void mm_alloc_bitmap_remap(void);
 int arch_expand_p2m(unsigned long max_pfn);
 void arch_pfn_add(unsigned long pfn, unsigned long mfn);
+int chk_free_pages(unsigned long needed);
 
 #else /* CONFIG_BALLOON */
 
 static inline void get_max_pages(void) { }
 static inline void arch_remap_p2m(unsigned long max_pfn) { }
 static inline void mm_alloc_bitmap_remap(void) { }
+static inline int chk_free_pages(unsigned long needed)
+{
+return needed <= nr_free_pages;
+}
 
 #endif /* CONFIG_BALLOON */
 #endif /* _BALLOON_H_ */
diff --git a/mm.c b/mm.c
index 5364079..b1f8f34 100644
--- a/mm.c
+++ b/mm.c
@@ -209,6 +209,8 @@ unsigned long alloc_pages(int order)
 chunk_head_t *alloc_ch, *spare_ch;
 chunk_tail_t*spare_ct;
 
+if ( !chk_free_pages(1UL << order) )
+goto no_memory;
 
 /* Find smallest order which can satisfy the request. */
 for ( i = order; i < FREELIST_SIZE; i++ ) {
@@ -343,7 +345,7 @@ void *sbrk(ptrdiff_t increment)
 if (new_brk > heap_mapped) {
 unsigned long n = (new_brk - heap_mapped + PAGE_SIZE - 1) / PAGE_SIZE;
 
-if ( n > nr_free_pages )
+if ( !chk_free_pages(n) )
 {
 printk("Memory exhausted: want %ld pages, but only %ld are left\n",
n, nr_free_pages);
-- 
2.6.6




[Xen-devel] [PATCH v3 14/19] mini-os: move p2m related macros to header file

2016-08-11 Thread Juergen Gross
In order to be able to use p2m related macros for ballooning, move
their definitions to include/x86/arch_mm.h.

There is no need to define different macros regarding index masks and
number of entries for the different levels, as all levels share the
same entry format (a plain mfn). So reduce the number of macros
accordingly.

Add some macros to get the indices into p2m pages from a pfn and make
use of them in current p2m code.
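
How the index macros split a pfn can be shown with a small standalone sketch. It assumes the x86-64 shift of 9 (the patch uses 10 for 32 bit); p2m_pfn_from_idx() and p2m_pfn_fits() are hypothetical helpers added only for illustration.

```c
/* Assumed x86-64 value; the 32 bit build uses a shift of 10. */
#define P2M_SHIFT       9
#define P2M_ENTRIES     (1UL << P2M_SHIFT)
#define P2M_MASK        (P2M_ENTRIES - 1)
#define L1_P2M_SHIFT    P2M_SHIFT
#define L2_P2M_SHIFT    (2 * P2M_SHIFT)
#define L3_P2M_SHIFT    (3 * P2M_SHIFT)
#define L1_P2M_IDX(pfn) ((pfn) & P2M_MASK)
#define L2_P2M_IDX(pfn) (((pfn) >> L1_P2M_SHIFT) & P2M_MASK)
#define L3_P2M_IDX(pfn) (((pfn) >> L2_P2M_SHIFT) & P2M_MASK)

/* Hypothetical helper: rebuild a pfn from its three p2m indices. */
static unsigned long p2m_pfn_from_idx(unsigned long l3, unsigned long l2,
                                      unsigned long l1)
{
    return (l3 << L2_P2M_SHIFT) | (l2 << L1_P2M_SHIFT) | l1;
}

/* Hypothetical helper: same condition as p2m_chk_pfn(), without exiting. */
static int p2m_pfn_fits(unsigned long pfn)
{
    return (pfn >> L3_P2M_SHIFT) == 0;
}
```

With three levels of 512 entries each, the largest pfn that fits is 2^27 - 1.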

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 arch/x86/mm.c | 31 +--
 include/x86/arch_mm.h | 21 +
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index e10c2c5..a5c8959 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -609,40 +609,19 @@ static void clear_bootstrap(void)
 
 void arch_init_p2m(unsigned long max_pfn)
 {
-#ifdef __x86_64__
-#define L1_P2M_SHIFT9
-#define L2_P2M_SHIFT18
-#define L3_P2M_SHIFT27
-#else
-#define L1_P2M_SHIFT10
-#define L2_P2M_SHIFT20
-#define L3_P2M_SHIFT30
-#endif
-#define L1_P2M_ENTRIES  (1 << L1_P2M_SHIFT)
-#define L2_P2M_ENTRIES  (1 << (L2_P2M_SHIFT - L1_P2M_SHIFT))
-#define L3_P2M_ENTRIES  (1 << (L3_P2M_SHIFT - L2_P2M_SHIFT))
-#define L1_P2M_MASK (L1_P2M_ENTRIES - 1)
-#define L2_P2M_MASK (L2_P2M_ENTRIES - 1)
-#define L3_P2M_MASK (L3_P2M_ENTRIES - 1)
-
 unsigned long *l2_list = NULL, *l3_list;
 unsigned long pfn;
 
+p2m_chk_pfn(max_pfn - 1);
 l3_list = (unsigned long *)alloc_page(); 
-for ( pfn = 0; pfn < max_pfn; pfn += L1_P2M_ENTRIES )
+for ( pfn = 0; pfn < max_pfn; pfn += P2M_ENTRIES )
 {
-if ( !(pfn % (L1_P2M_ENTRIES * L2_P2M_ENTRIES)) )
+if ( !(pfn % (P2M_ENTRIES * P2M_ENTRIES)) )
 {
 l2_list = (unsigned long*)alloc_page();
-if ( (pfn >> L3_P2M_SHIFT) > 0 )
-{
-printk("Error: Too many pfns.\n");
-do_exit();
-}
-l3_list[(pfn >> L2_P2M_SHIFT)] = virt_to_mfn(l2_list);  
+l3_list[L3_P2M_IDX(pfn)] = virt_to_mfn(l2_list);
 }
-l2_list[(pfn >> L1_P2M_SHIFT) & L2_P2M_MASK] =
-virt_to_mfn(phys_to_machine_mapping + pfn);
+l2_list[L2_P2M_IDX(pfn)] = virt_to_mfn(phys_to_machine_mapping + pfn);
 }
 HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list = 
 virt_to_mfn(l3_list);
diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
index d87fe55..7283f64 100644
--- a/include/x86/arch_mm.h
+++ b/include/x86/arch_mm.h
@@ -176,7 +176,28 @@ typedef unsigned long pgentry_t;
 #define IO_PROT_NOCACHE (L1_PROT | _PAGE_PCD)
 
 /* for P2M */
+#ifdef __x86_64__
+#define P2M_SHIFT   9
+#else
+#define P2M_SHIFT   10
+#endif
+#define P2M_ENTRIES (1UL << P2M_SHIFT)
+#define P2M_MASK(P2M_ENTRIES - 1)
+#define L1_P2M_SHIFTP2M_SHIFT
+#define L2_P2M_SHIFT(2 * P2M_SHIFT)
+#define L3_P2M_SHIFT(3 * P2M_SHIFT)
+#define L1_P2M_IDX(pfn) ((pfn) & P2M_MASK)
+#define L2_P2M_IDX(pfn) (((pfn) >> L1_P2M_SHIFT) & P2M_MASK)
+#define L3_P2M_IDX(pfn) (((pfn) >> L2_P2M_SHIFT) & P2M_MASK)
 #define INVALID_P2M_ENTRY (~0UL)
+static inline void p2m_chk_pfn(unsigned long pfn)
+{
+if ( (pfn >> L3_P2M_SHIFT) > 0 )
+{
+printk("Error: Too many pfns.\n");
+do_exit();
+}
+}
 
 #include "arch_limits.h"
 #define PAGE_SIZE   __PAGE_SIZE
-- 
2.6.6




[Xen-devel] [PATCH v3 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Juergen Gross
With CONFIG_BALLOON the page allocator's bitmap needs some space
to be able to grow. Remap it to the kernel virtual area if the
preallocated area isn't large enough.
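
The space requirement can be estimated with a sketch like the one below, assuming one bit per page and 4 KiB pages. bitmap_bytes() and bitmap_needs_remap() are hypothetical helpers that take a page count; init_page_allocator() in the patch starts from a byte address instead and therefore shifts by PAGE_SHIFT + 3.

```c
#define PAGE_SHIFT      12
#define PAGE_SIZE       (1UL << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE - 1))
#define round_pgup(_p)  (((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)

/*
 * Hypothetical helper: bytes of bitmap needed for 'nr_pages' pages at one
 * bit per page, rounded up to whole pages.
 */
static unsigned long bitmap_bytes(unsigned long nr_pages)
{
    return round_pgup((nr_pages + 7) >> 3);
}

/* Hypothetical helper: is the preallocated bitmap too small to grow into? */
static int bitmap_needs_remap(unsigned long cur_bytes,
                              unsigned long nr_max_pages)
{
    return cur_bytes < bitmap_bytes(nr_max_pages);
}
```

One 4 KiB bitmap page covers 32768 pages (128 MiB of memory), so a remap is only needed if the domain may grow past the coverage of the bitmap pages allocated at boot.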

Signed-off-by: Juergen Gross 
---
V3: - add assertion as requested by Samuel Thibault
- rename functions to have mm_ prefix as requested by Samuel Thibault
---
 balloon.c | 18 ++
 include/balloon.h |  2 ++
 include/mm.h  |  6 ++
 mm.c  | 43 ++-
 4 files changed, 48 insertions(+), 21 deletions(-)

diff --git a/balloon.c b/balloon.c
index 1ec113d..78b30af 100644
--- a/balloon.c
+++ b/balloon.c
@@ -44,3 +44,21 @@ void get_max_pages(void)
 nr_max_pages = ret;
 printk("Maximum memory size: %ld pages\n", nr_max_pages);
 }
+
+void mm_alloc_bitmap_remap(void)
+{
+unsigned long i;
+
+if ( mm_bitmap_size >= ((nr_max_pages + 1) >> (PAGE_SHIFT + 3)) )
+return;
+
+for ( i = 0; i < mm_bitmap_size; i += PAGE_SIZE )
+{
+map_frame_rw(virt_kernel_area_end + i,
+ virt_to_mfn((unsigned long)(mm_bitmap) + i));
+}
+
+mm_bitmap = (unsigned long *)virt_kernel_area_end;
+virt_kernel_area_end += round_pgup((nr_max_pages + 1) >> (PAGE_SHIFT + 3));
+ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
+}
diff --git a/include/balloon.h b/include/balloon.h
index b0d0ebf..9154f44 100644
--- a/include/balloon.h
+++ b/include/balloon.h
@@ -31,11 +31,13 @@ extern unsigned long virt_kernel_area_end;
 
 void get_max_pages(void);
 void arch_remap_p2m(unsigned long max_pfn);
+void mm_alloc_bitmap_remap(void);
 
 #else /* CONFIG_BALLOON */
 
 static inline void get_max_pages(void) { }
 static inline void arch_remap_p2m(unsigned long max_pfn) { }
+static inline void mm_alloc_bitmap_remap(void) { }
 
 #endif /* CONFIG_BALLOON */
 #endif /* _BALLOON_H_ */
diff --git a/include/mm.h b/include/mm.h
index 6add683..ab56445 100644
--- a/include/mm.h
+++ b/include/mm.h
@@ -42,8 +42,14 @@
 #define STACK_SIZE_PAGE_ORDER __STACK_SIZE_PAGE_ORDER
 #define STACK_SIZE __STACK_SIZE
 
+#define round_pgdown(_p)  ((_p) & PAGE_MASK)
+#define round_pgup(_p)(((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)
+
 extern unsigned long nr_free_pages;
 
+extern unsigned long *mm_bitmap;
+extern unsigned long mm_bitmap_size;
+
 void init_mm(void);
 unsigned long alloc_pages(int order);
 #define alloc_page()alloc_pages(0)
diff --git a/mm.c b/mm.c
index 707a3e0..e2f55af 100644
--- a/mm.c
+++ b/mm.c
@@ -48,11 +48,13 @@
  *  One bit per page of memory. Bit set => page is allocated.
  */
 
-static unsigned long *alloc_bitmap;
+unsigned long *mm_bitmap;
+unsigned long mm_bitmap_size;
+
 #define PAGES_PER_MAPWORD (sizeof(unsigned long) * 8)
 
 #define allocated_in_map(_pn) \
-(alloc_bitmap[(_pn)/PAGES_PER_MAPWORD] & (1UL<<((_pn)&(PAGES_PER_MAPWORD-1))))
+(mm_bitmap[(_pn)/PAGES_PER_MAPWORD] & (1UL<<((_pn)&(PAGES_PER_MAPWORD-1))))
 
 unsigned long nr_free_pages;
 
@@ -61,8 +63,8 @@ unsigned long nr_free_pages;
  *  -(1<= n. 
  *  (1> (PAGE_SHIFT+3);
-bitmap_size  = round_pgup(bitmap_size);
-alloc_bitmap = (unsigned long *)to_virt(min);
-min += bitmap_size;
+mm_bitmap_size  = (max + 1) >> (PAGE_SHIFT + 3);
+mm_bitmap_size  = round_pgup(mm_bitmap_size);
+mm_bitmap = (unsigned long *)to_virt(min);
+min += mm_bitmap_size;
 range= max - min;
 
 /* All allocated by default. */
-memset(alloc_bitmap, ~0, bitmap_size);
+memset(mm_bitmap, ~0, mm_bitmap_size);
 /* Free up the memory we've been given to play with. */
 map_free(PHYS_PFN(min), range>>PAGE_SHIFT);
 
@@ -198,6 +197,8 @@ static void init_page_allocator(unsigned long min, unsigned long max)
 free_head[i]= ch;
 ct->level   = i;
 }
+
+mm_alloc_bitmap_remap();
 }
 
 
-- 
2.6.6




[Xen-devel] [PATCH v3 19/19] mini-os: repair build system

2016-08-11 Thread Juergen Gross
The build system of Mini-OS is using different settings for arch/*/*
than for the rest of the tree. The main reasons are that Config.mk is
included only conditionally in the top level Makefile, while minios.mk
isn't included by the arch Makefiles.

Repairing this mess enables us to move the CONFIG_* handling to
Config.mk, so the arch sources can make use of those settings even if
no MINIOS_CONFIG was specified by the caller.

Most of the files under config were not used. Integrate the used ones
into Config.mk and delete the rest.

The CONFIG_* defines should be set for assembler sources, too.

Signed-off-by: Juergen Gross 
---
 Config.mk | 93 +++
 Makefile  | 44 --
 arch/x86/Makefile |  3 --
 config/MiniOS.mk  | 10 --
 config/StdGNU.mk  | 47 
 config/arm32.mk   | 22 -
 config/arm64.mk   | 19 
 config/x86_32.mk  | 20 
 config/x86_64.mk  | 33 
 minios.mk |  4 +--
 10 files changed, 95 insertions(+), 200 deletions(-)
 delete mode 100644 config/MiniOS.mk
 delete mode 100644 config/StdGNU.mk
 delete mode 100644 config/arm32.mk
 delete mode 100644 config/arm64.mk
 delete mode 100644 config/x86_32.mk
 delete mode 100644 config/x86_64.mk

diff --git a/Config.mk b/Config.mk
index 9d19cd7..8ab1a7e 100644
--- a/Config.mk
+++ b/Config.mk
@@ -23,6 +23,11 @@ cc-option = $(shell if test -z "`echo 'void*p=1;' | \
   $(1) $(2) -S -o /dev/null -x c - 2>&1 | grep -- $(2) -`"; \
   then echo "$(2)"; else echo "$(3)"; fi ;)
 
+ifneq ($(MINIOS_CONFIG),)
+EXTRA_DEPS += $(MINIOS_CONFIG)
+include $(MINIOS_CONFIG)
+endif
+
 # Compatibility with Xen's stubdom build environment.  If we are building
 # stubdom, some XEN_ variables are set, set MINIOS_ variables accordingly.
 #
@@ -97,3 +102,91 @@ DEF_CPPFLAGS += -DHAVE_LWIP
 DEF_CPPFLAGS += -isystem $(LWIPDIR)/src/include
 DEF_CPPFLAGS += -isystem $(LWIPDIR)/src/include/ipv4
 endif
+
+# Set tools
+AS = $(CROSS_COMPILE)as
+LD = $(CROSS_COMPILE)ld
+ifeq ($(clang),y)
+CC = $(CROSS_COMPILE)clang
+LD_LTO = $(CROSS_COMPILE)llvm-ld
+else
+CC = $(CROSS_COMPILE)gcc
+LD_LTO = $(CROSS_COMPILE)ld
+endif
+CPP= $(CC) -E
+AR = $(CROSS_COMPILE)ar
+RANLIB = $(CROSS_COMPILE)ranlib
+NM = $(CROSS_COMPILE)nm
+STRIP  = $(CROSS_COMPILE)strip
+OBJCOPY= $(CROSS_COMPILE)objcopy
+OBJDUMP= $(CROSS_COMPILE)objdump
+SIZEUTIL   = $(CROSS_COMPILE)size
+
+# Allow git to be wrappered in the environment
+GIT?= git
+
+INSTALL  = install
+INSTALL_DIR  = $(INSTALL) -d -m0755 -p
+INSTALL_DATA = $(INSTALL) -m0644 -p
+INSTALL_PROG = $(INSTALL) -m0755 -p
+
+BOOT_DIR ?= /boot
+
+SOCKET_LIBS =
+UTIL_LIBS = -lutil
+DLOPEN_LIBS = -ldl
+
+SONAME_LDFLAG = -soname
+SHLIB_LDFLAGS = -shared
+
+ifneq ($(debug),y)
+CFLAGS += -O2 -fomit-frame-pointer
+else
+# Less than -O1 produces bad code and large stack frames
+CFLAGS += -O1 -fno-omit-frame-pointer
+CFLAGS-$(gcc) += -fno-optimize-sibling-calls
+endif
+
+ifeq ($(lto),y)
+CFLAGS += -flto
+LDFLAGS-$(clang) += -plugin LLVMgold.so
+endif
+
+# Configuration defaults
+CONFIG_START_NETWORK ?= y
+CONFIG_SPARSE_BSS ?= y
+CONFIG_QEMU_XS_ARGS ?= n
+CONFIG_TEST ?= n
+CONFIG_PCIFRONT ?= n
+CONFIG_BLKFRONT ?= y
+CONFIG_TPMFRONT ?= n
+CONFIG_TPM_TIS ?= n
+CONFIG_TPMBACK ?= n
+CONFIG_NETFRONT ?= y
+CONFIG_FBFRONT ?= y
+CONFIG_KBDFRONT ?= y
+CONFIG_CONSFRONT ?= y
+CONFIG_XENBUS ?= y
+CONFIG_XC ?=y
+CONFIG_LWIP ?= $(lwip)
+CONFIG_BALLOON ?= n
+
+# Export config items as compiler directives
+DEFINES-$(CONFIG_START_NETWORK) += -DCONFIG_START_NETWORK
+DEFINES-$(CONFIG_SPARSE_BSS) += -DCONFIG_SPARSE_BSS
+DEFINES-$(CONFIG_QEMU_XS_ARGS) += -DCONFIG_QEMU_XS_ARGS
+DEFINES-$(CONFIG_PCIFRONT) += -DCONFIG_PCIFRONT
+DEFINES-$(CONFIG_BLKFRONT) += -DCONFIG_BLKFRONT
+DEFINES-$(CONFIG_TPMFRONT) += -DCONFIG_TPMFRONT
+DEFINES-$(CONFIG_TPM_TIS) += -DCONFIG_TPM_TIS
+DEFINES-$(CONFIG_TPMBACK) += -DCONFIG_TPMBACK
+DEFINES-$(CONFIG_NETFRONT) += -DCONFIG_NETFRONT
+DEFINES-$(CONFIG_KBDFRONT) += -DCONFIG_KBDFRONT
+DEFINES-$(CONFIG_FBFRONT) += -DCONFIG_FBFRONT
+DEFINES-$(CONFIG_CONSFRONT) += -DCONFIG_CONSFRONT
+DEFINES-$(CONFIG_XENBUS) += -DCONFIG_XENBUS
+DEFINES-$(CONFIG_BALLOON) += -DCONFIG_BALLOON
+
+# Override settings for this OS
+PTHREAD_LIBS =
+nosharedlibs=y
diff --git a/Makefile b/Makefile
index f5b7011..5464e89 100644
--- a/Makefile
+++ b/Makefile
@@ -7,51 +7,7 @@
 OBJ_DIR=$(CURDIR)
 TOPLEVEL_DIR=$(CURDIR)
 
-ifeq ($(MINIOS_CONFIG),)
 include Config.mk
-else
-EXTRA_DEPS += $(MINIOS_CONFIG)
-include $(MINIOS_CONFIG)
-endif
-
-include $(MINIOS_ROOT)/config/MiniOS.mk
-
-# Configuration defaults
-CONFIG_START_NETWORK ?= y
-CONFIG_SPARSE_BSS ?= y
-CONFIG_QEMU_XS_ARGS ?= n
-CONFIG_TEST ?= n
-CONFIG_PCIFRONT ?= n
-CONFIG_BLKFRONT ?= y
-CONFIG_TPMFRONT ?= n
-CONFIG_TPM_TIS ?= n
-CONFIG_TPMBACK ?= n
-CONFIG_NETFRONT ?= y
-CONFIG_FBFRON

[Xen-devel] [PATCH v3 01/19] mini-os: correct first free pfn

2016-08-11 Thread Juergen Gross
The first free pfn available for allocation is calculated by adding the
number of page table frames to the pfn of the first page table and
then the magic number 3 to account for start info page et al.

As the start info page, xenstore page and console page are allocated
_before_ the page tables, leaving room for these pages behind the page
tables makes no sense.
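
The corrected calculation amounts to the following sketch, assuming 4 KiB pages. first_free_pfn() is a hypothetical helper taking the physical address of the first page table and the number of page table frames as plain arguments instead of reading them from start_info.

```c
#define PAGE_SHIFT  12
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define PFN_UP(x)   (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)

/*
 * Hypothetical helper: the first free pfn directly follows the page
 * table frames, without the former extra offset of 3 pages.
 */
static unsigned long first_free_pfn(unsigned long pt_base_phys,
                                    unsigned long nr_pt_frames)
{
    return PFN_UP(pt_base_phys) + nr_pt_frames;
}
```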

Signed-off-by: Juergen Gross 
Reviewed-by: Wei Liu 
Acked-by: Samuel Thibault 
---
 arch/x86/mm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 51aa966..ae1036e 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -867,9 +867,8 @@ void arch_init_mm(unsigned long* start_pfn_p, unsigned long* max_pfn_p)
 printk("stack start: %p(VA)\n", stack);
 printk("   _end: %p(VA)\n", &_end);
 
-/* First page follows page table pages and 3 more pages (store page etc) */
-start_pfn = PFN_UP(to_phys(start_info.pt_base)) + 
-start_info.nr_pt_frames + 3;
+/* First page follows page table pages. */
+start_pfn = PFN_UP(to_phys(start_info.pt_base)) + start_info.nr_pt_frames;
 max_pfn = start_info.nr_pages;
 
 /* We need room for demand mapping and heap, clip available memory */
-- 
2.6.6




[Xen-devel] [PATCH v3 13/19] mini-os: add function to map one frame

2016-08-11 Thread Juergen Gross
Add a function to map one physical frame to a specified virtual
address as read/write. This will be used later multiple times.
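
The relation between the single frame wrapper and the batch function can be sketched with a toy model. Everything here is an assumption for illustration: slots[] stands in for page table entries, and fake_map_frames() and fake_map_frame_rw() are hypothetical names that only mimic the argument pattern of do_map_frames() and map_frame_rw() (n, stride and increment all set to 1).

```c
#include <errno.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define NR_SLOTS   16

/* Toy "page table": slot i holds the frame mapped at va = i << PAGE_SHIFT. */
static unsigned long slots[NR_SLOTS];

/* Hypothetical batch mapper: maps mfns[i * stride] + i * incr for i < n. */
static int fake_map_frames(unsigned long va, const unsigned long *mfns,
                           unsigned long n, unsigned long stride,
                           unsigned long incr)
{
    unsigned long i, idx;

    if ( !mfns )
        return -EINVAL;

    for ( i = 0; i < n; i++ )
    {
        idx = (va >> PAGE_SHIFT) + i;
        if ( idx >= NR_SLOTS )
            return -ENOMEM;
        slots[idx] = mfns[i * stride] + i * incr;
    }

    return 0;
}

/* Single frame wrapper: one frame, so stride and increment don't matter. */
static int fake_map_frame_rw(unsigned long va, unsigned long mfn)
{
    return fake_map_frames(va, &mfn, 1, 1, 1);
}
```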

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 include/arm/arch_mm.h | 2 ++
 include/mm.h  | 1 +
 mm.c  | 5 +
 3 files changed, 8 insertions(+)

diff --git a/include/arm/arch_mm.h b/include/arm/arch_mm.h
index 085d4e5..f4685d8 100644
--- a/include/arm/arch_mm.h
+++ b/include/arm/arch_mm.h
@@ -14,6 +14,8 @@ extern uint32_t physical_address_offset;  /* Add this to a virtual address to get
 
 #define L1_PAGETABLE_SHIFT  12
 
+#define L1_PROT  0
+
 #define to_phys(x) (((paddr_t)(x)+physical_address_offset) & 
0x)
 #define to_virt(x) ((void *)(((x)-physical_address_offset) & 
0x))
 
diff --git a/include/mm.h b/include/mm.h
index 9244e26..6add683 100644
--- a/include/mm.h
+++ b/include/mm.h
@@ -72,6 +72,7 @@ int do_map_frames(unsigned long addr,
 const unsigned long *f, unsigned long n, unsigned long stride,
unsigned long increment, domid_t id, int *err, unsigned long prot);
 int unmap_frames(unsigned long va, unsigned long num_frames);
+int map_frame_rw(unsigned long addr, unsigned long mfn);
 #ifdef HAVE_LIBC
 extern unsigned long heap, brk, heap_mapped, heap_end;
 #endif
diff --git a/mm.c b/mm.c
index c53b0ca..707a3e0 100644
--- a/mm.c
+++ b/mm.c
@@ -319,6 +319,11 @@ int free_physical_pages(xen_pfn_t *mfns, int n)
 return HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
 }
 
+int map_frame_rw(unsigned long addr, unsigned long mfn)
+{
+return do_map_frames(addr, &mfn, 1, 1, 1, DOMID_SELF, NULL, L1_PROT);
+}
+
 #ifdef HAVE_LIBC
 void *sbrk(ptrdiff_t increment)
 {
-- 
2.6.6




[Xen-devel] [PATCH v3 17/19] mini-os: add support for ballooning up

2016-08-11 Thread Juergen Gross
Add support for ballooning the domain up by a specified number of
pages. The following steps are performed:

- extending the p2m map
- extending the page allocator's bitmap
- getting new memory pages from the hypervisor
- adding the memory at the current end of guest memory
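
The order of these steps can be modelled with a small simulation. This is not the balloon_up() of the patch: struct guest, its fields and sketch_balloon_up() are hypothetical, the p2m map and the bitmap are reduced to capacity counters, and 512 p2m entries and 32768 bitmap bits per page are assumed.

```c
#define P2M_PER_PAGE    512UL       /* assumed p2m entries per page */
#define BITS_PER_PAGE   32768UL     /* bitmap bits per 4 KiB page */

/* Hypothetical model of the state touched when ballooning up. */
struct guest {
    unsigned long nr_mem_pages;     /* current number of memory pages */
    unsigned long nr_max_pages;     /* upper limit set for the domain */
    unsigned long p2m_entries;      /* pfns the p2m map can describe */
    unsigned long bitmap_bits;      /* pages the allocator bitmap covers */
    unsigned long host_free;        /* pages the hypervisor can hand out */
};

static unsigned long round_up_to(unsigned long x, unsigned long m)
{
    return ((x + m - 1) / m) * m;
}

/* Returns the number of pages actually added. */
static unsigned long sketch_balloon_up(struct guest *g, unsigned long n)
{
    unsigned long target, got;

    /* Never grow beyond the domain's maximum size. */
    if ( n > g->nr_max_pages - g->nr_mem_pages )
        n = g->nr_max_pages - g->nr_mem_pages;
    target = g->nr_mem_pages + n;

    /* 1. Extend the p2m map in units of whole pages. */
    if ( g->p2m_entries < target )
        g->p2m_entries = round_up_to(target, P2M_PER_PAGE);

    /* 2. Extend the page allocator's bitmap in units of whole pages. */
    if ( g->bitmap_bits < target )
        g->bitmap_bits = round_up_to(target, BITS_PER_PAGE);

    /* 3. Get new memory pages from the hypervisor. */
    got = n < g->host_free ? n : g->host_free;
    g->host_free -= got;

    /* 4. Add the memory at the current end of guest memory. */
    g->nr_mem_pages += got;

    return got;
}
```

The sketch prepares the bookkeeping structures before asking for memory, so that every page received can be recorded immediately.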

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V3: change "if" to "while" in balloon_up() as requested by Samuel Thibault
---
 arch/arm/balloon.c |  9 ++
 arch/x86/balloon.c | 94 ++
 balloon.c  | 64 +
 include/balloon.h  |  5 +++
 mm.c   |  4 +++
 5 files changed, 176 insertions(+)

diff --git a/arch/arm/balloon.c b/arch/arm/balloon.c
index 549e51b..7f35328 100644
--- a/arch/arm/balloon.c
+++ b/arch/arm/balloon.c
@@ -27,4 +27,13 @@
 
 unsigned long virt_kernel_area_end;   /* TODO: find a virtual area */
 
+int arch_expand_p2m(unsigned long max_pfn)
+{
+return 0;
+}
+
+void arch_pfn_add(unsigned long pfn, unsigned long mfn)
+{
+}
+
 #endif
diff --git a/arch/x86/balloon.c b/arch/x86/balloon.c
index a7f20e4..42389e4 100644
--- a/arch/x86/balloon.c
+++ b/arch/x86/balloon.c
@@ -23,6 +23,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -30,9 +31,36 @@
 
 unsigned long virt_kernel_area_end = VIRT_KERNEL_AREA;
 
+static void p2m_invalidate(unsigned long *list, unsigned long start_idx)
+{
+unsigned long idx;
+
+for ( idx = start_idx; idx < P2M_ENTRIES; idx++ )
+list[idx] = INVALID_P2M_ENTRY;
+}
+
+static inline unsigned long *p2m_l3list(void)
+{
+return mfn_to_virt(HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list);
+}
+
+static inline unsigned long *p2m_to_virt(unsigned long p2m)
+{
+return ( p2m == INVALID_P2M_ENTRY ) ? NULL : mfn_to_virt(p2m);
+}
+
 void arch_remap_p2m(unsigned long max_pfn)
 {
 unsigned long pfn;
+unsigned long *l3_list, *l2_list, *l1_list;
+
+l3_list = p2m_l3list();
+l2_list = p2m_to_virt(l3_list[L3_P2M_IDX(max_pfn - 1)]);
+l1_list = p2m_to_virt(l2_list[L2_P2M_IDX(max_pfn - 1)]);
+
+p2m_invalidate(l3_list, L3_P2M_IDX(max_pfn - 1) + 1);
+p2m_invalidate(l2_list, L2_P2M_IDX(max_pfn - 1) + 1);
+p2m_invalidate(l1_list, L1_P2M_IDX(max_pfn - 1) + 1);
 
 if ( p2m_pages(nr_max_pages) <= p2m_pages(max_pfn) )
 return;
@@ -50,4 +78,70 @@ void arch_remap_p2m(unsigned long max_pfn)
 ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
 }
 
+int arch_expand_p2m(unsigned long max_pfn)
+{
+unsigned long pfn;
+unsigned long *l1_list, *l2_list, *l3_list;
+
+p2m_chk_pfn(max_pfn - 1);
+l3_list = p2m_l3list();
+
+for ( pfn = (HYPERVISOR_shared_info->arch.max_pfn + P2M_MASK) & ~P2M_MASK;
+  pfn < max_pfn; pfn += P2M_ENTRIES )
+{
+l2_list = p2m_to_virt(l3_list[L3_P2M_IDX(pfn)]);
+if ( !l2_list )
+{
+l2_list = (unsigned long*)alloc_page();
+if ( !l2_list )
+return -ENOMEM;
+p2m_invalidate(l2_list, 0);
+l3_list[L3_P2M_IDX(pfn)] = virt_to_mfn(l2_list);
+}
+l1_list = p2m_to_virt(l2_list[L2_P2M_IDX(pfn)]);
+if ( !l1_list )
+{
+l1_list = (unsigned long*)alloc_page();
+if ( !l1_list )
+return -ENOMEM;
+p2m_invalidate(l1_list, 0);
+l2_list[L2_P2M_IDX(pfn)] = virt_to_mfn(l1_list);
+
+if ( map_frame_rw((unsigned long)(phys_to_machine_mapping + pfn),
+  l2_list[L2_P2M_IDX(pfn)]) )
+return -ENOMEM;
+}
+}
+
+HYPERVISOR_shared_info->arch.max_pfn = max_pfn;
+
+/* Make sure the new last page can be mapped. */
+if ( !need_pgt((unsigned long)pfn_to_virt(max_pfn - 1)) )
+return -ENOMEM;
+
+return 0;
+}
+
+void arch_pfn_add(unsigned long pfn, unsigned long mfn)
+{
+mmu_update_t mmu_updates[1];
+pgentry_t *pgt;
+int rc;
+
+phys_to_machine_mapping[pfn] = mfn;
+
+pgt = need_pgt((unsigned long)pfn_to_virt(pfn));
+ASSERT(pgt);
+mmu_updates[0].ptr = virt_to_mach(pgt) | MMU_NORMAL_PT_UPDATE;
+mmu_updates[0].val = (pgentry_t)(mfn << PAGE_SHIFT) |
+ _PAGE_PRESENT | _PAGE_RW;
+rc = HYPERVISOR_mmu_update(mmu_updates, 1, NULL, DOMID_SELF);
+if ( rc < 0 )
+{
+printk("ERROR: build_pagetable(): PTE could not be updated\n");
+printk("   mmu_update failed with rc=%d\n", rc);
+do_exit();
+}
+}
+
 #endif
diff --git a/balloon.c b/balloon.c
index 78b30af..07ef532 100644
--- a/balloon.c
+++ b/balloon.c
@@ -23,11 +23,13 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 
 unsigned long nr_max_pages;
+unsigned long nr_mem_pages;
 
 void get_max_pages(void)
 {
@@ -62,3 +64,65 @@ void mm_alloc_bitmap_remap(void)
 virt_kernel_area_end += round_pgup((nr_max_pages + 1) >> (PAGE_SHIFT + 3));
 ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
 }
+
+

[Xen-devel] [PATCH v3 12/19] mini-os: don't allocate new pages for level 1 p2m tree

2016-08-11 Thread Juergen Gross
When constructing the 3 level p2m tree there is no need to allocate
new pages for the level 1 containing the p2m info for all pages. The
pages from the linear p2m list constructed by the hypervisor can be
used for that purpose.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
 arch/x86/mm.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 12f7fe4..e10c2c5 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -625,11 +625,11 @@ void arch_init_p2m(unsigned long max_pfn)
 #define L2_P2M_MASK (L2_P2M_ENTRIES - 1)
 #define L3_P2M_MASK (L3_P2M_ENTRIES - 1)
 
-unsigned long *l1_list = NULL, *l2_list = NULL, *l3_list;
+unsigned long *l2_list = NULL, *l3_list;
 unsigned long pfn;
 
 l3_list = (unsigned long *)alloc_page(); 
-for ( pfn=0; pfn> L2_P2M_SHIFT)] = virt_to_mfn(l2_list);  
 }
-if ( !(pfn % (L1_P2M_ENTRIES)) )
-{
-l1_list = (unsigned long*)alloc_page();
-l2_list[(pfn >> L1_P2M_SHIFT) & L2_P2M_MASK] = 
-virt_to_mfn(l1_list); 
-}
-
-l1_list[pfn & L1_P2M_MASK] = pfn_to_mfn(pfn); 
+l2_list[(pfn >> L1_P2M_SHIFT) & L2_P2M_MASK] =
+virt_to_mfn(phys_to_machine_mapping + pfn);
 }
 HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list = 
 virt_to_mfn(l3_list);
-- 
2.6.6


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
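
For illustration, the tree layout the patch above relies on can be sketched as below. This is only a sketch: it assumes a 64-bit build where one 4 KiB page holds 512 eight-byte entries, and the SKETCH_* constants, struct p2m_idx, p2m_split() and p2m_leaf_page() are hypothetical names, not the mini-os macros. Under that assumption each level 2 slot covers exactly one page of the linear p2m list, which is why the frame of that page can be plugged in directly instead of a freshly allocated copy.

```c
/* Assumption: 4 KiB pages and 8-byte entries, i.e. 512 entries per page. */
#define SKETCH_P2M_SHIFT    9
#define SKETCH_P2M_ENTRIES  (1UL << SKETCH_P2M_SHIFT)
#define SKETCH_P2M_MASK     (SKETCH_P2M_ENTRIES - 1)

struct p2m_idx {
    unsigned long l3;   /* slot in the top-level page */
    unsigned long l2;   /* slot in the level 2 page */
    unsigned long l1;   /* slot in the leaf page, which holds the mfn */
};

/* Split a pfn into the per-level indices of a 3 level p2m tree. */
static struct p2m_idx p2m_split(unsigned long pfn)
{
    struct p2m_idx idx;

    idx.l1 = pfn & SKETCH_P2M_MASK;
    idx.l2 = (pfn >> SKETCH_P2M_SHIFT) & SKETCH_P2M_MASK;
    idx.l3 = pfn >> (2 * SKETCH_P2M_SHIFT);
    return idx;
}

/* Start of the leaf page covering pfn inside a linear p2m array. */
static unsigned long *p2m_leaf_page(unsigned long *linear_p2m,
                                    unsigned long pfn)
{
    return linear_p2m + (pfn & ~SKETCH_P2M_MASK);
}
```

Stepping pfn by SKETCH_P2M_ENTRIES and storing the frame of p2m_leaf_page() in the level 2 slot corresponds to the loop body the patch ends up with.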


Re: [Xen-devel] [PATCH v2 21/25] arm/altp2m: Add HVMOP_altp2m_change_gfn.

2016-08-11 Thread Julien Grall



On 06/08/2016 19:42, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 08/06/2016 04:34 PM, Julien Grall wrote:



On 06/08/2016 14:45, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 08/04/2016 04:04 PM, Julien Grall wrote:

On 01/08/16 18:10, Sergej Proskurin wrote:

+return rc;
+
+hp2m = p2m_get_hostp2m(d);
+ap2m = d->arch.altp2m_p2m[idx];
+
+altp2m_lock(d);
+
+/*
+ * Flip mem_access_enabled to true when a permission is set, as
to prevent
+ * allocating or inserting super-pages.
+ */
+ap2m->mem_access_enabled = true;


Can you give more details about why you need this?



Similar to altp2m_set_mem_access, if we remap a page that is part of a
superpage in the hostp2m, we first map the superpage in the form of 512
pages into the ap2m and then change only one page. So, we set
mem_access_enabled to true to shatter the superpage on the ap2m side.


mem_access_enabled should only be set when mem access is enabled and
nothing else.

I don't understand why you want to avoid superpages in the altp2m. If
the host mapping you copy is a superpage, then the altp2m mapping should
be a superpage too.

The code is able to cope with inserting a mapping in the middle of a
superpage without mem_access_enabled.



Alright, I will try it out in the next patch. Thank you.




+
+mfn = p2m_lookup_attr(ap2m, old_gfn, &p2mt, &level, NULL, NULL);
+
+/* Check whether the page needs to be reset. */
+if ( gfn_eq(new_gfn, INVALID_GFN) )
+{
+/* If mfn is mapped by old_gpa, remove old_gpa from the
altp2m table. */
+if ( !mfn_eq(mfn, INVALID_MFN) )
+{
+rc = remove_altp2m_entry(d, ap2m, old_gpa,
pfn_to_paddr(mfn_x(mfn)), level);


remove_altp2m_entry should take a gfn and an mfn as parameters and not
addresses. The latter invites misuse of the API.



Ok. This will also remove the need for level_sizes/level_masks in the
associated function.


+if ( rc )
+{
+rc = -EINVAL;
+goto out;
+}
+}
+
+rc = 0;
+goto out;
+}
+
+/* Check host p2m if no valid entry in altp2m present. */
+if ( mfn_eq(mfn, INVALID_MFN) )
+{
+mfn = p2m_lookup_attr(hp2m, old_gfn, &p2mt, &level, NULL,
&xma);
+if ( mfn_eq(mfn, INVALID_MFN) || (p2mt != p2m_ram_rw) )


Please add a comment to explain why the second check.



Ok, I will. It has the same reason as in patch #19: It is not sufficient
to simply check for invalid MFNs, as the type might be invalid. Also,
the x86 implementation did not allow remapping a gfn to a shared page.


Patch #19 has a different check which does not explain this one. (p2mt
!= p2m_ram_rw) means that only guest read-write RAM can effectively be
remapped, which is different from saying that shared pages cannot be
remapped.

BTW, ARM does not support shared pages.

This also leads to my question: why not allow p2m_ram_ro?



I don't see a reason why not. Thank you. I will remove the check.


Be careful, I never asked you to remove the check. The p2m type covers
more than 2 cases, so if you remove the check completely it would be
possible to change device memory, grant, foreign mapping,...


The latter would open a security issue because we are required to hold a
reference on any foreign mapping before mapping it in the p2m.


So I think it only makes sense to allow changing a gfn for p2m_ram_ro and
p2m_ram_rw.







+{
+rc = -EINVAL;
+goto out;
+}
+
+/* If this is a superpage, copy that first. */
+if ( level != 3 )
+{
+rc = modify_altp2m_entry(d, ap2m, old_gpa,
pfn_to_paddr(mfn_x(mfn)),
+ level, p2mt, memaccess[xma]);
+if ( rc )
+{
+rc = -EINVAL;
+goto out;
+}
+}
+}
+
+mfn = p2m_lookup_attr(ap2m, new_gfn, &p2mt, &level, NULL, &xma);
+
+/* If new_gfn is not part of altp2m, get the mapping information
from hp2m */
+if ( mfn_eq(mfn, INVALID_MFN) )
+mfn = p2m_lookup_attr(hp2m, new_gfn, &p2mt, &level, NULL,
&xma);
+
+if ( mfn_eq(mfn, INVALID_MFN) || (p2mt != p2m_ram_rw) )


Please add a comment to explain why the second check.



Same reason as above.


Then add a comment in the code.


I will also remove this check.


No. See my answer above.

Regards,

--
Julien Grall
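
For illustration, the check argued for above (accept only p2m_ram_rw and p2m_ram_ro, reject everything else) could be written as an explicit allow-list. This is a minimal sketch and not the code from the series: the enum below is a hypothetical, shortened stand-in for the real p2m type list, and sketch_gfn_change_allowed() is an invented helper name.

```c
#include <stdbool.h>

/* Hypothetical, shortened stand-in for the p2m type enumeration. */
enum sketch_p2m_type {
    SKETCH_P2M_INVALID,
    SKETCH_P2M_RAM_RW,
    SKETCH_P2M_RAM_RO,
    SKETCH_P2M_MMIO_DIRECT,
    SKETCH_P2M_GRANT_MAP,
    SKETCH_P2M_MAP_FOREIGN,
};

/*
 * Only plain guest RAM may be the source or target of a gfn change.
 * Device memory, grant and foreign mappings fall through to "false".
 */
static bool sketch_gfn_change_allowed(enum sketch_p2m_type t)
{
    switch ( t )
    {
    case SKETCH_P2M_RAM_RW:
    case SKETCH_P2M_RAM_RO:
        return true;
    default:
        return false;
    }
}
```

Writing it as an allow-list means that a p2m type added later is rejected by default, which matches the concern about foreign mappings raised above.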



[Xen-devel] [PATCH V2] xen: support enabling SMEP/SMAP for HVM only

2016-08-11 Thread He Chen
Enhance the "smep" and "smap" command line options to support enabling
SMEP or SMAP for HVM only, by allowing "hvm" as a value.

Signed-off-by: He Chen 

---
Changes in V2:
* Allow "hvm" as a value to "smep" and "smap" command line options.
* Clear SMEP/SMAP CPUID bits for pv guests if they are set to hvm only.
* Refine docs.
* Rewrite commit message.
---
 docs/misc/xen-command-line.markdown | 14 +
 xen/arch/x86/cpuid.c|  5 
 xen/arch/x86/setup.c| 58 -
 xen/include/asm-x86/setup.h |  3 ++
 4 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 3a250cb..0e49358 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1427,19 +1427,21 @@ enabling more sockets and cores to go into deeper sleep 
states.
 
 Set the serial transmit buffer size.
 
-### smep
-> `= `
+### smap
+> `=  | hvm`
 
 > Default: `true`
 
-Flag to enable Supervisor Mode Execution Protection
+Flag to enable Supervisor Mode Access Prevention.
+Use `smap=hvm` to enable SMAP for HVM guests only.
 
-### smap
-> `= `
+### smep
+> `=  | hvm`
 
 > Default: `true`
 
-Flag to enable Supervisor Mode Access Prevention
+Flag to enable Supervisor Mode Execution Protection.
+Use `smep=hvm` to enable SMEP for HVM guests only.
 
 ### snb\_igd\_quirk
 > `=  | cap | `
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 38e34bd..afa16b8 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
 const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
@@ -118,6 +119,10 @@ static void __init calculate_pv_featureset(void)
 __set_bit(X86_FEATURE_HTT, pv_featureset);
 __set_bit(X86_FEATURE_X2APIC, pv_featureset);
 __set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
+if ( smep_hvm_only )
+__clear_bit(X86_FEATURE_SMEP, pv_featureset);
+if ( smap_hvm_only )
+__clear_bit(X86_FEATURE_SMAP, pv_featureset);
 
 sanitise_featureset(pv_featureset);
 }
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 217c775..625b9b4 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -61,13 +61,19 @@ boolean_param("nosmp", opt_nosmp);
 static unsigned int __initdata max_cpus;
 integer_param("maxcpus", max_cpus);
 
-/* smep: Enable/disable Supervisor Mode Execution Protection (default on). */
-static bool_t __initdata opt_smep = 1;
-boolean_param("smep", opt_smep);
-
-/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
-static bool_t __initdata opt_smap = 1;
-boolean_param("smap", opt_smap);
+/* Supervisor Mode Execution Protection (default on). */
+/* "smep=on": Enable SMEP for Xen and guests. */
+/* "smep=hvm": Enable SMEP for HVM only.  */
+/* "smep=off": Disable SMEP for Xen and guests.   */
+static void parse_smep_param(char *s);
+custom_param("smep", parse_smep_param);
+
+/* Supervisor Mode Access Prevention (default on). */
+/* "smap=on": Enable SMAP for Xen and guests.  */
+/* "smap=hvm": Enable SMAP for HVM only.   */
+/* "smap=off": Disable SMAP for Xen and guests.*/
+static void parse_smap_param(char *s);
+custom_param("smap", parse_smap_param);
 
 unsigned long __read_mostly cr4_pv32_mask;
 
@@ -111,6 +117,34 @@ struct cpuinfo_x86 __read_mostly boot_cpu_data = { 0, 0, 
0, 0, -1 };
 
 unsigned long __read_mostly mmu_cr4_features = XEN_MINIMAL_CR4;
 
+static bool_t __initdata opt_smep = 1;
+bool_t __initdata smep_hvm_only = 0;
+static void __init parse_smep_param(char *s)
+{
+if ( !parse_bool(s) )
+{
+opt_smep = 0;
+}
+else if ( !strcmp(s, "hvm") )
+{
+smep_hvm_only = 1;
+}
+}
+
+static bool_t __initdata opt_smap = 1;
+bool_t __initdata smap_hvm_only = 0;
+static void __init parse_smap_param(char *s)
+{
+if ( !parse_bool(s) )
+{
+opt_smap = 0;
+}
+else if ( !strcmp(s, "hvm") )
+{
+smap_hvm_only = 1;
+}
+}
+
 bool_t __read_mostly acpi_disabled;
 bool_t __initdata acpi_force;
 static char __initdata acpi_param[10] = "";
@@ -1404,12 +1438,20 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 if ( !opt_smep )
 setup_clear_cpu_cap(X86_FEATURE_SMEP);
 if ( cpu_has_smep )
+{
 set_in_cr4(X86_CR4_SMEP);
+if ( smep_hvm_only )
+write_cr4(read_cr4() & ~X86_CR4_SMEP);
+}
 
 if ( !opt_smap )
 setup_clear_cpu_cap(X86_FEATURE_SMAP);
 if ( cpu_has_smap )
+{
 set_in_cr4(X86_CR4_SMAP);
+if ( smap_hvm_only )
+write_cr4(read_cr4() & ~X86_CR4_SMAP);
+}
 
 cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
 
@@ -1570,7 +1612,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 bootstrap_map, cmd
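
As an aside to the parsing in the patch above, the three documented values (on, hvm, off) can also be modelled as one tri-state instead of a boolean plus an "hvm only" flag. The following is only a sketch of that alternative: the enum, sketch_parse_prot() and the list of accepted boolean spellings are assumptions made for the example and do not claim to match Xen's parse_bool().

```c
#include <string.h>

enum sketch_prot_mode {
    SKETCH_PROT_OFF,       /* disabled for Xen and all guests */
    SKETCH_PROT_HVM_ONLY,  /* enabled for HVM guests only */
    SKETCH_PROT_ON,        /* enabled for Xen and guests */
};

/* Map an option value to a mode; unrecognised input keeps the default. */
static enum sketch_prot_mode sketch_parse_prot(const char *s,
                                               enum sketch_prot_mode dflt)
{
    /* Assumed spellings, chosen for the example only. */
    static const char *const off[] = { "0", "no", "off", "false", "disable" };
    static const char *const on[]  = { "1", "yes", "on", "true", "enable" };
    size_t i;

    if ( !strcmp(s, "hvm") )
        return SKETCH_PROT_HVM_ONLY;

    for ( i = 0; i < sizeof(off) / sizeof(off[0]); i++ )
        if ( !strcmp(s, off[i]) )
            return SKETCH_PROT_OFF;

    for ( i = 0; i < sizeof(on) / sizeof(on[0]); i++ )
        if ( !strcmp(s, on[i]) )
            return SKETCH_PROT_ON;

    return dflt;
}
```

With a single mode variable, a later "smep=on" on the command line would also override an earlier "smep=hvm", since each occurrence simply replaces the previous mode.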

Re: [Xen-devel] [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2016-08-11 Thread Yu Zhang



On 8/11/2016 4:58 PM, Jan Beulich wrote:

On 11.08.16 at 10:47,  wrote:

On 8/10/2016 6:43 PM, Yu Zhang wrote:

For " && p2mt != p2m_ioreq_server" condition, it is just to guarantee
that if a write
operation is trapped, and at the same period, device model changed the
status of
ioreq server, it should be discarded.

Hi Paul & Jan, any comments?

Didn't Paul's "should behave like p2m_ram_rw" reply clarify things
sufficiently?


Oh, I may have misunderstood. I thought he was talking about the p2m
change race condition. :)

So please allow me to give a summary of my next to-dos for this:
1> To prevent the p2m type change race condition,
hvm_hap_nested_page_fault() needs to be changed so that p2m_unlock() can
be triggered after the write operation is handled;
2> If a gfn with p2m_ioreq_server is trapped, but the current ioreq
server has been unmapped, it will be treated as p2m_ram_rw;
3> If a gfn with p2m_ioreq_server is trapped, but the dir is
IOREQ_READ, it will be treated as a read-modify-write case.


A second thought is, I am now more worried about the " && dir ==
IOREQ_WRITE"
condition, which we used previously to set s to NULL if it is not a
write operation.
However, if HVM uses a read-modify-write instruction to operate on a
write-protected
address, it will be treated as both read and write accesses in
ept_handle_violation(). In
such situation, we need to emulate the read access first(by just
returning the value being
fetched either in hypervisor or in device model), instead of
discarding the read access.

Any suggestions about this guest read-modify-write instruction situation?
Is my depiction clear? :)

Well, from your earlier reply I concluded that you'd just go ahead
and put this into patch form, which we'd then look at.


OK, thanks. I have given a rough summary in 3> above.

I will have to take several days of annual leave from this weekend due to
an urgent family matter, and will be back after Aug 23. I can hardly
follow the mailing list during this period, sorry for the inconvenience.  :(

Yu
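
For illustration, items 2> and 3> of the summary above can be written down as a small decision function. This is a sketch only: all names are hypothetical, the order of the two checks is an assumption, the "forward to the ioreq server" outcome for a plain write is inferred from the patch title rather than stated in this mail, and the locking change from item 1> is not modelled.

```c
#include <stdbool.h>

enum sketch_dir { SKETCH_IOREQ_WRITE, SKETCH_IOREQ_READ };

enum sketch_action {
    SKETCH_TREAT_AS_RAM_RW,  /* item 2>: the ioreq server was unmapped */
    SKETCH_EMULATE_RMW,      /* item 3>: read side of a read-modify-write */
    SKETCH_SEND_TO_SERVER,   /* assumed: plain write goes to the server */
};

/* Decide how to handle a trapped access to a p2m_ioreq_server gfn. */
static enum sketch_action sketch_ioreq_server_fault(bool server_mapped,
                                                    enum sketch_dir dir)
{
    if ( !server_mapped )
        return SKETCH_TREAT_AS_RAM_RW;

    if ( dir == SKETCH_IOREQ_READ )
        return SKETCH_EMULATE_RMW;

    return SKETCH_SEND_TO_SERVER;
}
```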



[Xen-devel] [PATCH V3] common/vm_event: synchronize vCPU state in vm_event_resume()

2016-08-11 Thread Razvan Cojocaru
vm_event_vcpu_pause() needs to use vcpu_pause_nosync() in order
for the current vCPU to not get stuck. A consequence of this is
that the custom vm_event response handlers will not always see
the real vCPU state in v->arch.user_regs. This patch makes sure
that the state is always synchronized in vm_event_resume, before
any handlers have been called. This problem especially affects
vm_event_set_registers().

Simply checking vm_event_pause_count to make sure the vCPU is
paused suffices since there's only one ring / consumer at a
time, and events are being processed one-by-one, so the
toolstack won't unpause the vCPU behind our backs.

Signed-off-by: Razvan Cojocaru 

---
Changes since V2:
 - Updated the commit text.
---
 xen/common/vm_event.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 941345b..53cab90 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -388,6 +388,13 @@ void vm_event_resume(struct domain *d, struct 
vm_event_domain *ved)
 v = d->vcpu[rsp.vcpu_id];
 
 /*
+ * Make sure the vCPU state has been synchronized for the custom
+ * handlers.
+ */
+if ( atomic_read(&v->vm_event_pause_count) )
+sync_vcpu_execstate(v);
+
+/*
  * In some cases the response type needs extra handling, so here
  * we call the appropriate handlers.
  */
-- 
1.9.1




[Xen-devel] [linux-3.14 test] 100400: tolerable FAIL - PUSHED

2016-08-11 Thread osstest service owner
flight 100400 linux-3.14 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/100400/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-i386-rumpuserxen                6 xen-build   fail like 99747
 build-amd64-rumpuserxen               6 xen-build   fail like 99747
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail like 99747
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop  fail like 99747
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop  fail like 99747

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386                    1 build-check(1)           blocked n/a
 test-amd64-amd64-rumpuserxen-amd64                  1 build-check(1)           blocked n/a
 test-amd64-amd64-xl-pvh-intel                      11 guest-start              fail never pass
 test-amd64-amd64-xl-pvh-amd                        11 guest-start              fail never pass
 test-amd64-amd64-libvirt-xsm                       12 migrate-support-check    fail never pass
 test-amd64-i386-libvirt-xsm                        12 migrate-support-check    fail never pass
 test-amd64-i386-libvirt                            12 migrate-support-check    fail never pass
 test-amd64-amd64-qemuu-nested-amd                  16 debian-hvm-install/l1/l2 fail never pass
 test-amd64-amd64-libvirt                           12 migrate-support-check    fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check    fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  10 migrate-support-check    fail never pass
 test-amd64-amd64-libvirt-vhd                       11 migrate-support-check    fail never pass

version targeted for testing:
 linux                b8b6a72089869dee41bd9f29e86bbcf6501e5524
baseline version:
 linux                da99423b3cd3e48c42c0d64b79aba58d828f9648

Last test of basis    99747  2016-07-28 09:29:42 Z   13 days
Testing same since   100389  2016-08-10 08:47:32 Z    1 days    2 attempts


People who touched revisions under test:
  Alexander Klein 
  Alexey Brodkin 
  Alexey Brodkin 
  Amr Bekhit 
  Andrew Morton 
  Andrey Grodzovsky 
  Benjamin Herrenschmidt 
  Brian King 
  Cameron Gutman 
  David S. Miller 
  David Vrabel 
  Dmitri Epshtein 
  Dmitry Torokhov 
  Greg Kroah-Hartman 
  Ilya Dryomov 
  Jeff Mahoney 
  Jiri Slaby 
  Kangjie Lu 
  Kangjie Lu 
  Linus Torvalds 
  Linus Walleij 
  Marc Kleine-Budde 
  Marcin Wojtas 
  Martin K. Petersen 
  Oliver Hartkopp 
  Ping Cheng 
  Ping Cheng 
  Ryusuke Konishi 
  Takashi Iwai 
  Taras Kondratiuk 
  Theodore Ts'o 
  Tony Lindgren 
  Torsten Hilbrich 
  Tyler Hicks 
  Ulf Hansson 
  Ursula Braun 
  Vegard Nossum 
  Vineet Gupta 
  Willy Tarreau 
  Wolfgang Grandegger 

jobs:
 build-amd64-xsm                                          pass
 build-i386-xsm                                           pass
 build-amd64                                              pass
 build-i386                                               pass
 build-amd64-libvirt                                      pass
 build-i386-libvirt                                       pass
 build-amd64-pvops                                        pass
 build-i386-pvops                                         pass
 build-amd64-rumpuserxen                                  fail
 build-i386-rumpuserxen                                   fail
 test-amd64-amd64-xl                                      pass
 test-amd64-i386-xl                                       pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm            pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm             pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm       pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm        pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm            pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm             pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm    pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm     pass
 test-amd64-amd64-libvirt-xsm                             pass
 test-amd64-i386-libvirt-xsm                              pass
 test-amd64-amd64-xl-xsm                                  pass
 test-amd64-i386-xl-xsm                                   pass
 test-amd64-amd64-qemuu-nested-amd                        fail
 test-amd64-amd64-xl-pvh-amd                              fail
 test-amd64-i386-qemut-rhel6hvm-amd                       pass
 test-amd64-i386-qemuu-rhel6hvm-amd                       pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64                pass
 test-amd64-i386-xl-qemut-debianhvm-amd64                 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64

Re: [Xen-devel] xen does not support the 8G large bar

2016-08-11 Thread Jan Beulich
>>> On 11.08.16 at 05:30,  wrote:
> According to the analysis hvmloader code, find a problem:
> 
> if (is_64bar) {
> bar_data_upper = pci_readl(devfn, bar_reg + 4);
> pci_writel(devfn, bar_reg + 4, ~0);
> bar_sz_upper = pci_readl(devfn, bar_reg + 4);
> pci_writel(devfn, bar_reg + 4, bar_data_upper);
> bar_sz = (bar_sz_upper << 32) | bar_sz;
> }
> bar_sz &= ~(bar_sz - 1);
> 
> read from the pci device, bar_sz_upper is 0x, if the bar size is 8G, 
> the bar_sz_upper should be 0xfffe.

But that doesn't indicate a problem with hvmloader. Instead that
tells you that qemu isn't behaving correctly. First thing for you
to do is probably to try a newer qemu, or otherwise see whether
you can figure why qemu sends back 0x in this case
(perhaps by adding a little bit of logging to the respective code).

Jan
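
For illustration, the sizing arithmetic under discussion can be sketched as a standalone function. It assumes the usual PCI sizing protocol (write all ones to both halves of a 64-bit memory BAR, read them back, mask off the four flag bits of the low half, take the lowest set bit as the size); sketch_bar64_size() and the SKETCH_ constant are hypothetical names and this is not the hvmloader code.

```c
#include <stdint.h>

/* The low 4 bits of a memory BAR are type/prefetch flags, not size bits. */
#define SKETCH_BAR_MEM_FLAGS 0xfULL

/*
 * lo/hi are the values read back from the two BAR halves after writing
 * all ones to them. Returns the BAR size in bytes (0 if no bit is set).
 */
static uint64_t sketch_bar64_size(uint32_t lo, uint32_t hi)
{
    uint64_t mask = ((uint64_t)hi << 32) | lo;

    mask &= ~SKETCH_BAR_MEM_FLAGS;
    return mask & (~mask + 1);   /* isolate the lowest set bit */
}
```

Under these assumptions an 8 GiB BAR reads back 0xfffffffe in the upper half, while an upper half of all ones with no size bits in the lower half yields 4 GiB. The result is therefore decided by the value the device model returns, not by the arithmetic.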




Re: [Xen-devel] [PATCH v3 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Samuel Thibault
Juergen Gross, on Thu 11 Aug 2016 11:18:19 +0200, wrote:
> +extern unsigned long *mm_bitmap;
> +extern unsigned long mm_bitmap_size;

Ah I was thinking to have these use mm_malloc_ too. "mm_bitmap" seems
short for namespace pollution.

Samuel



Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Wei Liu
On Thu, Aug 11, 2016 at 09:33:57AM +0100, Wei Liu wrote:
> On Wed, Aug 10, 2016 at 04:07:30PM +0100, Wei Liu wrote:
> [...]
> >  
> > +def run_test_logfile(opts, test):
> > +""" Run a specific test via grepping log file"""
> > +
> > +fn = opts.logfile_dir + (opts.logfile_pattern % test)
> > +local_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
> > +
> > +# Use time to generate unique stamps
> > +start_stamp = "= XTF TEST START %s =" % local_time
> > +end_stamp = "= XTF TEST END %s =" % local_time
> > +
> > +print "Using %s" % fn
> > +
> > +f = open(fn, "ab")
> > +f.write(start_stamp + "\n")
> > +f.close()
> > +
> 
> I think it would make more sense for the micro VM itself to write
> stamps?

I want to pass a stamp generated by the runner to micro VMs, otherwise
the runner wouldn't be able to tell which stamps are the right ones.

For PV guests it works because there is start_info->cmd_line.
Unfortunately I can't seem to find a place for putting in a command line
for HVM guests in the ABI. Newer versions of Xen will have a boot ABI that
supports a command line. This mode is mainly for old versions of Xen, so I
don't see how it is possible at this stage to uniformly support both old
and new versions of Xen.

Maybe we need to live with the runner adding the stamp? Let me know if I
missed anything.

Wei.
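
For illustration, the "grep the log between two stamps" step can be sketched as below. The quoted runner code is Python; this C version only shows the idea, and sketch_extract() is a hypothetical helper. It assumes the whole log is already in memory as a NUL-terminated string and that the start stamp is unique, so the first match is the right one.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the text between the start stamp and the next end stamp into out.
 * Returns the number of bytes copied, or -1 if a stamp is missing, the
 * start stamp is empty, or out is too small.
 */
static long sketch_extract(const char *log, const char *start,
                           const char *end, char *out, size_t outlen)
{
    const char *s, *e;
    size_t n;

    if ( !*start )
        return -1;

    s = strstr(log, start);
    if ( !s )
        return -1;
    s += strlen(start);

    e = strstr(s, end);
    if ( !e )
        return -1;

    n = (size_t)(e - s);
    if ( n + 1 > outlen )
        return -1;

    memcpy(out, s, n);
    out[n] = '\0';
    return (long)n;
}
```

If two runs could ever produce the same stamp, the first-match assumption no longer holds, which is one reason the stamps need to be unique per run.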



Re: [Xen-devel] [PATCH] Remove ambiguities in the COPYING file; add CONTRIBUTING file

2016-08-11 Thread George Dunlap
On 11/08/16 01:51, Stefano Stabellini wrote:
> On Wed, 10 Aug 2016, Lars Kurth wrote:
>> COPYING file:
>> The motivation of this change is to make it easier for new
>> contributors to conduct a license and patent review, WITHOUT
>> changing any licenses.
>> - Remove references to BSD-style licenses as we have more
>>   common license exceptions and replace with "other license
>>   stanzas"
>> - List the most common situations under which code is licensed
>>   under licenses other than GPLv2 (section "Licensing Exceptions")
>> - List the most common non-GPLv2 licenses that are in use in
>>   this repository based on a recent FOSSology scan (section
>>   "Licensing Exceptions")
>> - List other license related conventions within the project
>>   to make it easier to conduct a license review.
>> - Clarify the incoming license as its omission has confused
>>   past contributors (section "Contributions")
>>
>> CONTRIBUTION file:
>> The motivation of this file is to make it easier for contributors
>> to find contribution related resources. Add information on existing
>> license related conventions to avoid unintentional future licensing
>> issues. Provide templates for copyright headers for the most commonly
>> used licenses in this repository.
>>
>> Signed-off-by: Lars Kurth 
>> ---
>>  CONTRIBUTING | 210 
>> +++
>>  COPYING  |  64 ++
>>  2 files changed, 260 insertions(+), 14 deletions(-)
>>  create mode 100644 CONTRIBUTING
>>
>> diff --git a/CONTRIBUTING b/CONTRIBUTING
>> new file mode 100644
>> index 000..7af13c4
>> --- /dev/null
>> +++ b/CONTRIBUTING
>> @@ -0,0 +1,210 @@
>> +
>> +CONTRIBUTING
>> +
>> +
>> +INBOUND LICENSE
>> +---
>> +
>> +Contributions are governed by the license that applies to relevant 
>> +specific file or by the license specified in the COPYING file, that
>  ^files

I think "file" is better here, as the license is on a file-by-file
basis, not on a whole contribution basis.  That is, if your contribution
changes a BSD file and a GPLv2 file in a single series (or a single
patch), then the changes to the BSD file are governed by the BSD
license, and the changes to the GPLv2 file are governed by the GPLv2.

> 
> 
>> +governs the license of its containing directory and its subdirectories.
>> +
>> +Most of the Xen Project code is licensed under GPLv2, but a number of 
>> +directories are primarily licensed under different licenses. 
> ^ I would remove "primarily" from this sentence

"primarily licensed under different licenses" implies to me that most of
the files in the directory are under a different license, but some may
be licensed GPLv2.  Without the "primarily" I would take that to imply
that *none* of the files are licensed GPLv2.

If there is at least one directory that has mostly non-GPLv2 files but
at least one GPLv2 file, or we anticipate that such directories might
exist in the future, I would leave the "primarily" in.  If there aren't
any now and we don't expect any in the future, then yes it's unnecessary
and should probably be removed.

 -George



Re: [Xen-devel] [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0

2016-08-11 Thread Julien Grall

Hi Jan,

On 10/08/2016 11:23, Jan Beulich wrote:

--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -17,6 +17,11 @@ static inline __attribute__((pure)) node
 #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
 #define __node_distance(a, b) (20)

+static inline unsigned int arch_get_dma_bitsize(void)
+{
+return 32;
+}
+


I am not sure why we return 32 here for ARM. Anyway, as it was already 
the case before this patch:


Acked-by: Julien Grall 

Regards,

--
Julien Grall



[Xen-devel] [ovmf test] 100407: all pass - PUSHED

2016-08-11 Thread osstest service owner
flight 100407 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/100407/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf bfaa3b5b7fad475355ef28bb96749d2410ef75b0
baseline version:
 ovmf cbe09e31218bc2b84a0f1c57f0fa8e72fe0830d5

Last test of basis   100402  2016-08-11 02:47:45 Z    0 days
Testing same since   100407  2016-08-11 07:49:15 Z    0 days    1 attempts


People who touched revisions under test:
  Dandan Bi 
  Thomas Huth 

jobs:
 build-amd64-xsm                                          pass
 build-i386-xsm                                           pass
 build-amd64                                              pass
 build-i386                                               pass
 build-amd64-libvirt                                      pass
 build-i386-libvirt                                       pass
 build-amd64-pvops                                        pass
 build-i386-pvops                                         pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                     pass
 test-amd64-i386-xl-qemuu-ovmf-amd64                      pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=bfaa3b5b7fad475355ef28bb96749d2410ef75b0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
bfaa3b5b7fad475355ef28bb96749d2410ef75b0
+ branch=ovmf
+ revision=bfaa3b5b7fad475355ef28bb96749d2410ef75b0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.7-testing
+ '[' xbfaa3b5b7fad475355ef28bb96749d2410ef75b0 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://cache:9419/https://github.com/rumpkernel/rumpkernel-netbsd-src%20[fetch=try]'
++ : 
'

Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Andrew Cooper
On 11/08/16 10:44, Wei Liu wrote:
> On Thu, Aug 11, 2016 at 09:33:57AM +0100, Wei Liu wrote:
>> On Wed, Aug 10, 2016 at 04:07:30PM +0100, Wei Liu wrote:
>> [...]
>>>  
>>> +def run_test_logfile(opts, test):
>>> +""" Run a specific test via grepping log file"""
>>> +
>>> +fn = opts.logfile_dir + (opts.logfile_pattern % test)
>>> +local_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
>>> +
>>> +# Use time to generate unique stamps
>>> +start_stamp = "= XTF TEST START %s =" % local_time
>>> +end_stamp = "= XTF TEST END %s =" % local_time
>>> +
>>> +print "Using %s" % fn
>>> +
>>> +f = open(fn, "ab")
>>> +f.write(start_stamp + "\n")
>>> +f.close()
>>> +
>> I think it would make more sense for the micro VM itself to write
>> stamps?
> I want to pass a stamp generated by the runner to micro VMs, otherwise
> runner wouldn't be able to tell which stamps are the right one.
>
> For PV guests it works because there is start_info->cmd_line.
> Unfortunately I can't seem to find a place for putting in a command line
> for hvm guest in the ABI. Newer version of Xen will have a boot ABI that
> supports command line. This mode is mainly for old versions of Xen so I
> don't see how it is possibly at this stage to uniformly support both old
> and new versions of Xen.
>
> Maybe we need to live with running adding the stamp? Let me know if I
> miss anything.

I haven't managed to come up with a reasonable way to get a command line
into an HVM guest yet.  The best I managed was a xenstore key, but that
gets in the way of doing xenstore ring testing in XTF, and still
requires going behind the back of the toolstack.

Can't you just open the log file as read, seek to the end, run the test
and read again from the same FD?  It would be rather more simple than
marking the logs.

Sadly, whatever method we use here is going to have to be clever enough
to cope with the log files being rotated, and I can't think of a clever
way of doing that ATM.

~Andrew
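
For illustration, the "seek to the end, run the test, read again from the same FD" idea can be sketched with stdio as below. This is a minimal sketch under stated assumptions: both helper names are hypothetical, the log is a regular file that only grows while the test runs, and log rotation (the problem mentioned above) is explicitly not handled.

```c
#include <stdio.h>

/* Remember the current end of the log; returns -1 on error. */
static long sketch_log_mark(FILE *f)
{
    if ( fseek(f, 0, SEEK_END) != 0 )
        return -1;
    return ftell(f);
}

/*
 * Read what was appended after mark into buf, NUL-terminated.
 * Returns the number of bytes read (at most len - 1).
 */
static size_t sketch_log_read_new(FILE *f, long mark, char *buf, size_t len)
{
    size_t n;

    if ( len == 0 )
        return 0;
    buf[0] = '\0';

    if ( mark < 0 || fseek(f, mark, SEEK_SET) != 0 )
        return 0;

    n = fread(buf, 1, len - 1, f);
    buf[n] = '\0';
    return n;
}
```

The runner would call sketch_log_mark() before starting the guest and sketch_log_read_new() once the guest has exited, so no stamps need to be written into the log at all.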



Re: [Xen-devel] [PATCH] Remove ambiguities in the COPYING file; add CONTRIBUTING file

2016-08-11 Thread George Dunlap
On 10/08/16 12:30, Lars Kurth wrote:
> COPYING file:
> The motivation of this change is to make it easier for new
> contributors to conduct a license and patent review, WITHOUT
> changing any licenses.
> - Remove references to BSD-style licenses as we have more
>   common license exceptions and replace with "other license
>   stanzas"
> - List the most common situations under which code is licensed
>   under licenses other than GPLv2 (section "Licensing Exceptions")
> - List the most common non-GPLv2 licenses that are in use in
>   this repository based on a recent FOSSology scan (section
>   "Licensing Exceptions")
> - List other license related conventions within the project
>   to make it easier to conduct a license review.
> - Clarify the incoming license as its omission has confused
>   past contributors (section "Contributions")
> 
> CONTRIBUTION file:
> The motivation of this file is to make it easier for contributors
> to find contribution related resources. Add information on existing
> license related conventions to avoid unintentional future licensing
> issues. Provide templates for copyright headers for the most commonly
> used licenses in this repository.
> 
> Signed-off-by: Lars Kurth 
> ---
>  CONTRIBUTING | 210 
> +++
>  COPYING  |  64 ++
>  2 files changed, 260 insertions(+), 14 deletions(-)
>  create mode 100644 CONTRIBUTING
> 
> diff --git a/CONTRIBUTING b/CONTRIBUTING
> new file mode 100644
> index 000..7af13c4
> --- /dev/null
> +++ b/CONTRIBUTING
> @@ -0,0 +1,210 @@
> +
> +CONTRIBUTING
> +
> +
> +INBOUND LICENSE
> +---
> +
> +Contributions are governed by the license that applies to relevant 
> +specific file or by the license specified in the COPYING file, that
> +governs the license of its containing directory and its subdirectories.
> +
> +Most of the Xen Project code is licensed under GPLv2, but a number of 
> +directories are primarily licensed under different licenses. 
> +
> +Most notably:
> + - tools/blktap2  : BSD-Modified
> + - tools/libxc: LGPL v2.1
> + - tools/libxl: LGPL v2.1
> + - xen/include/public : MIT license
> +
> +When creating new components and directories that contain a 
> +significant amount of files that are licensed under licenses other 
> +than GPLv2 or the license specified in the COPYING file, please 
> +create a new COPYING file in that directory containing a copy of the 
> +license text and a rationale for using a different license. This helps 
> +ensure that the license of this new component/directory is maintained 
> +consistently with the original intention.
> +
> +When importing code from other upstream projects into this repository, 
> +please create a README.source file in the directory the code is imported 
> +to, listing the original source of the code. An example can be found at 
> +m4/README.source
> +
> +The COMMON COPYRIGHT NOTICES section of this document contains 
> +sample copyright notices for the most common licenses used within 
> +this repository.
> +
> +Developer's Certificate of Origin
> +-
> +
> +All patches to the Xen Project code base must include the the line 
> +"Signed-off-by: your_name " at the end of the change 
> +description. This is required and indicates that you certify the patch 
> +under the "Developer's Certificate of Origin" which states:
> +
> +  Developer's Certificate of Origin 1.1
> +
> +  By making a contribution to this project, I certify that:
> +
> +  (a) The contribution was created in whole or in part by me and I
> +  have the right to submit it under the open source license
> +  indicated in the file; or
> +
> +  (b) The contribution is based upon previous work that, to the best
> +  of my knowledge, is covered under an appropriate open source
> +  license and I have the right under that license to submit that
> +  work with modifications, whether created in whole or in part
> +  by me, under the same open source license (unless I am
> +  permitted to submit under a different license), as indicated
> +  in the file; or
> +
> +  (c) The contribution was provided directly to me by some other
> +  person who certified (a), (b) or (c) and I have not modified
> +  it.
> +
> +  (d) I understand and agree that this project and the contribution
> +  are public and that a record of the contribution (including all
> +  personal information I submit with it, including my sign-off) is
> +  maintained indefinitely and may be redistributed consistent with
> +  this project or the open source license(s) involved.
> +
> +GOVERNANCE AND WORKFLOW
> +---
> +
> +The following documents provide a general overview of governance and
> +contribution guidelines for the Xen Project:
> + - https://xenproject.org/governance.html  
> + - https://xenproject.org/help/contribution-guidelines.html 
> +
> +For mo

Re: [Xen-devel] [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0

2016-08-11 Thread Jan Beulich
>>> On 11.08.16 at 11:53,  wrote:
> Hi Jan,
> 
> On 10/08/2016 11:23, Jan Beulich wrote:
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -17,6 +17,11 @@ static inline __attribute__((pure)) node
>>  #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
>>  #define __node_distance(a, b) (20)
>>
>> +static inline unsigned int arch_get_dma_bitsize(void)
>> +{
>> +    return 32;
>> +}
>> +
> 
> I am not sure why we return 32 here for ARM. Anyway, as it was already 
> the case before this patch:

In fact I think we could get away without that inline function for the
time being: Since NODES_SHIFT is undefined on ARM, the call out of
page_alloc.c should get elided by the compiler anyway, so a possible
alternative would be to just have a declaration of the function on
ARM without actual implementation.

> Acked-by: Julien Grall 

Thanks.

Jan




Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Wei Liu
On Thu, Aug 11, 2016 at 10:56:23AM +0100, Andrew Cooper wrote:
> On 11/08/16 10:44, Wei Liu wrote:
> > On Thu, Aug 11, 2016 at 09:33:57AM +0100, Wei Liu wrote:
> >> On Wed, Aug 10, 2016 at 04:07:30PM +0100, Wei Liu wrote:
> >> [...]
> >>>  
> >>> +def run_test_logfile(opts, test):
> >>> +    """ Run a specific test via grepping log file"""
> >>> +
> >>> +    fn = opts.logfile_dir + (opts.logfile_pattern % test)
> >>> +    local_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
> >>> +
> >>> +    # Use time to generate unique stamps
> >>> +    start_stamp = "= XTF TEST START %s =" % local_time
> >>> +    end_stamp = "= XTF TEST END %s =" % local_time
> >>> +
> >>> +    print "Using %s" % fn
> >>> +
> >>> +    f = open(fn, "ab")
> >>> +    f.write(start_stamp + "\n")
> >>> +    f.close()
> >>> +
> >> I think it would make more sense for the micro VM itself to write
> >> stamps?
> > I want to pass a stamp generated by the runner to micro VMs, otherwise
> > the runner wouldn't be able to tell which stamps are the right ones.
> >
> > For PV guests it works because there is start_info->cmd_line.
> > Unfortunately I can't seem to find a place for putting in a command line
> > for an HVM guest in the ABI. Newer versions of Xen will have a boot ABI that
> > supports a command line. This mode is mainly for old versions of Xen so I
> > don't see how it is possible at this stage to uniformly support both old
> > and new versions of Xen.
> >
> > Maybe we need to live with the runner adding the stamp? Let me know if I
> > missed anything.
> 
> I haven't managed to come up with a reasonable way to get a command line
> into an HVM guest yet.  The best I managed was a xenstore key, but that
> gets in the way of doing xenstore ring testing in XTF, and still
> requires going behind the back of the toolstack.
> 
> Can't you just open the log file as read, seek to the end, run the test
> and read again from the same FD?  It would be rather more simple than
> marking the logs.
> 

The reason I want to put stamps (I think Ian's, too) is to make sure we
can still get the right bits out if the same test case is run
consecutively, multiple times. The only race-free way of doing it is to
have the micro VM itself print out stamps. Using the runner to print
stamps won't solve it -- I only realised that after posting this series.

Given the current restrictions, I can live with the method you suggest,
too.

> Sadly, whatever method we use here is going to have to be clever enough
> to cope with the log files being rotated, and I can't think of a clever
> way of doing that ATM.
> 

Require logrotate to be disabled. :-)

Wei.

> ~Andrew



Re: [Xen-devel] [XTF PATCH 1/3] xtf-runner: introduce get_xen_version

2016-08-11 Thread Andrew Cooper
On 10/08/16 16:07, Wei Liu wrote:
> Signed-off-by: Wei Liu 
> ---
>  xtf-runner | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/xtf-runner b/xtf-runner
> index c063699..7b69c45 100755
> --- a/xtf-runner
> +++ b/xtf-runner
> @@ -151,6 +151,19 @@ def __repr__(self):
>  return "TestInfo(%s)" % (self.name, )
>  
>  
> +def get_xen_version():
> +    """Get the version string of Xen"""
> +
> +    for line in check_output(['xl', 'info']).splitlines():
> +        if not line.startswith("xen_version"):
> +            continue
> +
> +        version_str = line.split()[2:][0]
> +        break
> +
> +    return version_str

This will hit a name error if xen_version isn't found in the output.

A better option would be to "return line.split()[2:][0]" directly and
raise a RunnerError() at this point.

However, xen_version was introduced by me a while ago, but not so very
long ago, and won't work on older versions of Xen.

I think at this point, it would just be easier to use the python libxc
bindings, so

from xen.lowlevel.xc import xc
libxc = xc()

info = libxc.xeninfo()
return "%s.%s" % (info["xen_major"], info["xen_minor"])

Make sure the import statement is inside the function call to avoid
breaking the usecase of `xtf-runner --list` offhost.
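
For reference, the first option above (return directly, raise otherwise)
could look roughly like the sketch below. It parses text passed in as an
argument so it can be tried offhost. RunnerError is redefined here as a
stand-in, parse_xen_version is a made-up name, and the sample line format
in the comment is an assumption about `xl info` output:

```python
class RunnerError(Exception):
    """Stand-in for the runner's own error class."""


def parse_xen_version(xl_info_output):
    """Return the xen_version value from `xl info` style text."""
    for line in xl_info_output.splitlines():
        if not line.startswith("xen_version"):
            continue
        # Assumed line format: "xen_version   : 4.8-unstable"
        fields = line.split()
        if len(fields) >= 3:
            return fields[2]
    # Falling out of the loop means no usable xen_version line was seen.
    raise RunnerError("xen_version not found in xl info output")
```

This only removes the NameError. It does not help on versions of Xen
whose `xl info` lacks xen_version.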

~Andrew

> +
> +
>  def parse_test_instance_string(arg):
>  """Parse a test instance string.
>  




[Xen-devel] [PATCH] libxc: use DPRINTF in xc_domain_dumpcore_via_callback

2016-08-11 Thread Wei Liu
That line doesn't reveal much information to ordinary users.

Change that to debug output.

Signed-off-by: Wei Liu 
---
 tools/libxc/xc_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxc/xc_core.c b/tools/libxc/xc_core.c
index d792566..e581905 100644
--- a/tools/libxc/xc_core.c
+++ b/tools/libxc/xc_core.c
@@ -859,7 +859,7 @@ copy_done:
 /* When live dump-mode (-L option) is specified,
  * guest domain may reduce memory. pad with zero pages.
  */
-IPRINTF("j (%ld) != nr_pages (%ld)", j, nr_pages);
+DPRINTF("j (%ld) != nr_pages (%ld)", j, nr_pages);
 memset(dump_mem_start, 0, PAGE_SIZE);
 for (; j < nr_pages; j++) {
 sts = dump_rtn(xch, args, dump_mem_start, PAGE_SIZE);
-- 
2.1.4




Re: [Xen-devel] [XTF PATCH 1/3] xtf-runner: introduce get_xen_version

2016-08-11 Thread Wei Liu
On Thu, Aug 11, 2016 at 11:10:41AM +0100, Andrew Cooper wrote:
> On 10/08/16 16:07, Wei Liu wrote:
> > Signed-off-by: Wei Liu 
> > ---
> >  xtf-runner | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/xtf-runner b/xtf-runner
> > index c063699..7b69c45 100755
> > --- a/xtf-runner
> > +++ b/xtf-runner
> > @@ -151,6 +151,19 @@ def __repr__(self):
> >  return "TestInfo(%s)" % (self.name, )
> >  
> >  
> > +def get_xen_version():
> > +    """Get the version string of Xen"""
> > +
> > +    for line in check_output(['xl', 'info']).splitlines():
> > +        if not line.startswith("xen_version"):
> > +            continue
> > +
> > +        version_str = line.split()[2:][0]
> > +        break
> > +
> > +    return version_str
> 
> This will hit a name error if xen_version isn't found in the output.
> 
> A better option would be to "return line.split()[2:][0]" directly and
> raise a RunnerError() at this point.
> 
> However, xen_version was introduced by me a while ago, but not so very
> long ago, and won't work on older versions of Xen.
> 

Oh, didn't notice that. I thought it was available to all versions of
Xen.

> I think at this point, it would just be easier to use the python libxc
> bindings, so
> 
> from xen.lowlevel.xc import xc
> libxc = xc()
> 
> info = libxc.xeninfo()
> return "%s.%s" % (info["xen_major"], info["xen_minor"])
> 
> Make sure the import statement is inside the function call to avoid
> breaking the usecase of `xtf-runner --list` offhost.
> 

NP.

Wei.

> ~Andrew
> 
> > +
> > +
> >  def parse_test_instance_string(arg):
> >  """Parse a test instance string.
> >  
> 



Re: [Xen-devel] [XTF PATCH 1/3] xtf-runner: introduce get_xen_version

2016-08-11 Thread Andrew Cooper
On 11/08/16 11:12, Wei Liu wrote:
> On Thu, Aug 11, 2016 at 11:10:41AM +0100, Andrew Cooper wrote:
>> On 10/08/16 16:07, Wei Liu wrote:
>>> Signed-off-by: Wei Liu 
>>> ---
>>>  xtf-runner | 13 +++++++++++++
>>>  1 file changed, 13 insertions(+)
>>>
>>> diff --git a/xtf-runner b/xtf-runner
>>> index c063699..7b69c45 100755
>>> --- a/xtf-runner
>>> +++ b/xtf-runner
>>> @@ -151,6 +151,19 @@ def __repr__(self):
>>>  return "TestInfo(%s)" % (self.name, )
>>>  
>>>  
>>> +def get_xen_version():
>>> +    """Get the version string of Xen"""
>>> +
>>> +    for line in check_output(['xl', 'info']).splitlines():
>>> +        if not line.startswith("xen_version"):
>>> +            continue
>>> +
>>> +        version_str = line.split()[2:][0]
>>> +        break
>>> +
>>> +    return version_str
>> This will hit a name error if xen_version isn't found in the output.
>>
>> A better option would be to "return line.split()[2:][0]" directly and
>> raise a RunnerError() at this point.
>>
>> However, xen_version was introduced by me a while ago, but not so very
>> long ago, and won't work on older versions of Xen.
>>
> Oh, didn't notice that. I thought it was available to all versions of
> Xen.
>
>> I think at this point, it would just be easier to use the python libxc
>> bindings, so
>>
>> from xen.lowlevel.xc import xc
>> libxc = xc()
>>
>> info = libxc.xeninfo()
>> return "%s.%s" % (info["xen_major"], info["xen_minor"])

Erm - these should be %d not %s.

Sorry for the misinformation.

~Andrew

>>
>> Make sure the import statement is inside the function call to avoid
>> breaking the usecase of `xtf-runner --list` offhost.
>>
> NP.
>
> Wei.
>
>> ~Andrew
>>
>>> +
>>> +
>>>  def parse_test_instance_string(arg):
>>>  """Parse a test instance string.
>>>  




Re: [Xen-devel] [PATCH v3 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Juergen Gross
On 11/08/16 11:40, Samuel Thibault wrote:
> Juergen Gross, on Thu 11 Aug 2016 11:18:19 +0200, wrote:
>> +extern unsigned long *mm_bitmap;
>> +extern unsigned long mm_bitmap_size;
> 
> Ah I was thinking to have these use mm_malloc_ too. "mm_bitmap" seems
> short for namespace pollution.

Sorry, you wrote:

> Ditto, mm_bitmap and mm_bitmap_size.

So I thought you wanted it to be that way.

I can easily change it again.

Wei, do I need to send all patches again, or is it enough to start with
patch 16? Patches 1-15 are already Acked/Reviewed by Samuel.


Juergen



Re: [Xen-devel] [PATCH v3 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Wei Liu
On Thu, Aug 11, 2016 at 12:19:20PM +0200, Juergen Gross wrote:
> On 11/08/16 11:40, Samuel Thibault wrote:
> > Juergen Gross, on Thu 11 Aug 2016 11:18:19 +0200, wrote:
> >> +extern unsigned long *mm_bitmap;
> >> +extern unsigned long mm_bitmap_size;
> > 
> > Ah I was thinking to have these use mm_malloc_ too. "mm_bitmap" seems
> > short for namespace pollution.
> 
> Sorry, you wrote:
> 
> > Ditto, mm_bitmap and mm_bitmap_size.
> 
> So I thought you wanted it to be that way.
> 
> I can easily change it again.
> 
> Wei, do I need to send all patches again, or is it enough to start with
> patch 16? Patches 1-15 are already Acked/Reviewed by Samuel.
> 
> 

No need to resend all of them. Resending this one is ok.

I would in fact even prefer a git branch that I can pull and commit
directly.

Wei.

> Juergen



Re: [Xen-devel] [PATCH v3 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Samuel Thibault
Juergen Gross, on Thu 11 Aug 2016 12:19:20 +0200, wrote:
> On 11/08/16 11:40, Samuel Thibault wrote:
> > Juergen Gross, on Thu 11 Aug 2016 11:18:19 +0200, wrote:
> >> +extern unsigned long *mm_bitmap;
> >> +extern unsigned long mm_bitmap_size;
> > 
> > Ah I was thinking to have these use mm_malloc_ too. "mm_bitmap" seems
> > short for namespace pollution.
> 
> Sorry, you wrote:
> 
> > Ditto, mm_bitmap and mm_bitmap_size.
> 
> So I thought you wanted it to be that way.

Oops, sorry about my typo.

Samuel



Re: [Xen-devel] [PATCH v3 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Juergen Gross
On 11/08/16 12:21, Wei Liu wrote:
> On Thu, Aug 11, 2016 at 12:19:20PM +0200, Juergen Gross wrote:
>> On 11/08/16 11:40, Samuel Thibault wrote:
>>> Juergen Gross, on Thu 11 Aug 2016 11:18:19 +0200, wrote:
>>>> +extern unsigned long *mm_bitmap;
>>>> +extern unsigned long mm_bitmap_size;
>>>
>>> Ah I was thinking to have these use mm_malloc_ too. "mm_bitmap" seems
>>> short for namespace pollution.
>>
>> Sorry, you wrote:
>>
>>> Ditto, mm_bitmap and mm_bitmap_size.
>>
>> So I thought you wanted it to be that way.
>>
>> I can easily change it again.
>>
>> Wei, do I need to send all patches again, or is it enough to start with
>> patch 16? Patches 1-15 are already Acked/Reviewed by Samuel.
>>
>>
> 
> No need to resend all of them. Resending this one is ok.

I think I'll need to modify at least one other patch due to this change.
So I'll resend all modified patches.

> 
> I would in fact even prefer a git branch that I can pull and commit
> directly.

NP.


Juergen




Re: [Xen-devel] [PATCH] Remove ambiguities in the COPYING file; add CONTRIBUTING file

2016-08-11 Thread Lars Kurth


On 11/08/2016 01:51, "Stefano Stabellini"  wrote:

>> +Developer's Certificate of Origin
>> +-
>> +
>> +All patches to the Xen Project code base must include the the line
>  ^ double "the"

Thanks: will fix. And also fixed in the wiki page where I copied this from.


>> +GOVERNANCE AND WORKFLOW
>> +---
>> +
>> +The following documents provide a general overview of governance and
>> +contribution guidelines for the Xen Project:
>> + - https://xenproject.org/governance.html
>> + - https://xenproject.org/help/contribution-guidelines.html
>
>It might be worth considering importing the governance as a file in the
>Xen repository.

After discussing with a few committers, I think storing this in a git repo
is a good idea: possibly as markup. It should not be in xen.git though,
but probably in a new governance.git tree, for which I am maintainer.

Let me do a bit of investigation first. In any case, this should be covered
separately. We can later update this document accordingly.

Lars



Re: [Xen-devel] [PATCH] common/vm_event: Fix comment

2016-08-11 Thread Jan Beulich
>>> On 09.08.16 at 10:32,  wrote:
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -255,7 +255,7 @@ static inline void vm_event_release_slot(struct domain *d,
>  
>  /*
>   * vm_event_mark_and_pause() tags vcpu and put it to sleep.
> - * The vcpu will resume execution in vm_event_wake_waiters().
> + * The vcpu will resume execution in vm_event_wake().
>   */
>  void vm_event_mark_and_pause(struct vcpu *v, struct vm_event_domain *ved)
>  {

I was about to commit this without further waiting for an ack, as
being supposedly trivial, but then I checked and also found
vm_event_wake_{blocked,queued}(), and now I'm not sure
whether the reference wouldn't better be to
vm_event_wake_blocked(). Could you clarify that for me please?

Jan




[Xen-devel] [PATCH v2 1/2] page-alloc/x86: don't restrict DMA heap to node 0

2016-08-11 Thread Jan Beulich
When node zero has no memory, the DMA bit width will end up getting set
to 9, which is obviously not helpful to hold back a reasonable amount
of low enough memory for Dom0 to use for DMA purposes. Find the lowest
node with memory below 4Gb instead.

Introduce arch_get_dma_bitsize() to keep this arch-specific logic out
of common code.

Also adjust the original calculation: I think the subtraction of 1
should have been part of the flsl() argument rather than getting
applied to its result. And while previously the division by 4 was valid
to be done on the flsl() result, this now also needs to be converted,
as it should only be applied to the spanned pages value.

Signed-off-by: Jan Beulich 
Acked-by: Julien Grall 
---
v2: Extend commit message to reason about the calculation change. Add
a comment to the calculation.

--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -355,11 +355,25 @@ void __init init_cpu_to_node(void)
 }
 }
 
-EXPORT_SYMBOL(cpu_to_node);
-EXPORT_SYMBOL(node_to_cpumask);
-EXPORT_SYMBOL(memnode_shift);
-EXPORT_SYMBOL(memnodemap);
-EXPORT_SYMBOL(node_data);
+unsigned int __init arch_get_dma_bitsize(void)
+{
+    unsigned int node;
+
+    for_each_online_node(node)
+        if ( node_spanned_pages(node) &&
+             !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) )
+            break;
+    if ( node >= MAX_NUMNODES )
+        panic("No node with memory below 4Gb");
+
+    /*
+     * Try to not reserve the whole node's memory for DMA, but dividing
+     * its spanned pages by (arbitrarily chosen) 4.
+     */
+    return min_t(unsigned int,
+                 flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1)
+                 + PAGE_SHIFT, 32);
+}
 
 static void dump_numa(unsigned char key)
 {
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1368,16 +1368,7 @@ void __init end_boot_allocator(void)
     init_heap_pages(virt_to_page(bootmem_region_list), 1);
 
     if ( !dma_bitsize && (num_online_nodes() > 1) )
-    {
-#ifdef CONFIG_X86
-        dma_bitsize = min_t(unsigned int,
-                            flsl(NODE_DATA(0)->node_spanned_pages) - 1
-                            + PAGE_SHIFT - 2,
-                            32);
-#else
-        dma_bitsize = 32;
-#endif
-    }
+        dma_bitsize = arch_get_dma_bitsize();
 
     printk("Domain heap initialised");
     if ( dma_bitsize )
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -17,6 +17,11 @@ static inline __attribute__((pure)) node
 #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
 #define __node_distance(a, b) (20)
 
+static inline unsigned int arch_get_dma_bitsize(void)
+{
+    return 32;
+}
+
 #endif /* __ARCH_ARM_NUMA_H */
 /*
  * Local variables:
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -86,5 +86,6 @@ extern int valid_numa_range(u64 start, u
 
 void srat_parse_regions(u64 addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
+unsigned int arch_get_dma_bitsize(void);
 
 #endif




Re: [Xen-devel] [PATCH v2 1/2] page-alloc/x86: don't restrict DMA heap to node 0

2016-08-11 Thread Andrew Cooper
On 11/08/16 11:44, Jan Beulich wrote:
> When node zero has no memory, the DMA bit width will end up getting set
> to 9, which is obviously not helpful to hold back a reasonable amount
> of low enough memory for Dom0 to use for DMA purposes. Find the lowest
> node with memory below 4Gb instead.
>
> Introduce arch_get_dma_bitsize() to keep this arch-specific logic out
> of common code.
>
> Also adjust the original calculation: I think the subtraction of 1
> should have been part of the flsl() argument rather than getting
> applied to its result. And while previously the division by 4 was valid
> to be done on the flsl() result, this now also needs to be converted,
> as it should only be applied to the spanned pages value.
>
> Signed-off-by: Jan Beulich 
> Acked-by: Julien Grall 
> ---
> v2: Extend commit message to reason about the calculation change. Add
> a comment to the calculation.

Reviewed-by: Andrew Cooper 
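
The calculation change described in the commit message can be checked
with a small sketch. This is only an illustration under assumptions:
PAGE_SHIFT is taken as 12, flsl() is modelled with Python's
int.bit_length(), and the helper names are made up:

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as on x86


def flsl(x):
    """Index of the most significant set bit, counting from 1; 0 for x == 0."""
    return x.bit_length()


def old_dma_bitsize(spanned_pages):
    # Formula removed from end_boot_allocator(), always applied to node 0.
    return min(flsl(spanned_pages) - 1 + PAGE_SHIFT - 2, 32)


def new_dma_bitsize(start_pfn, spanned_pages):
    # Formula from the new arch_get_dma_bitsize(). Python ints do not wrap
    # like unsigned long, so keep the flsl() argument non-negative.
    return min(flsl(start_pfn + spanned_pages // 4 - 1) + PAGE_SHIFT, 32)


print(old_dma_bitsize(0))            # node 0 without memory -> 9
print(new_dma_bitsize(0, 1 << 20))   # node at pfn 0 spanning 4 GiB -> 30
```

The first value reproduces the 9 mentioned in the commit message. The
second gives 30 bits, i.e. the lowest 1 GiB of a node that starts at
zero and spans 4 GiB.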



Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Ian Jackson
Wei Liu writes ("[XTF PATCH 3/3] xtf-runner: support two modes for getting output"):
> We need two modes for getting output:
...
> +    # Use time to generate unique stamps
> +    start_stamp = "= XTF TEST START %s =" % local_time
> +    end_stamp = "= XTF TEST END %s =" % local_time

This will go wrong if someone runs the same test very rapidly in a
loop.  It needs to be augmented, at least.

Ideally with the domid, but AFAICT that's not available here.  If you
can't think of anything else, use, in addition to the timestamp, the
pid of the xtf-runner process, plus a counter.  (The counter is
necessary in case the same xtf-runner process is used to run the same
test multiple times.)
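
Something along these lines, as a rough sketch. The make_stamps name and
the exact stamp layout are invented here for illustration:

```python
import itertools
import os
import time

# Incremented for every test run by this xtf-runner process.
_run_counter = itertools.count()


def make_stamps():
    """Return (start_stamp, end_stamp) that are unique to one test run."""
    ident = "%s pid=%d run=%d" % (
        time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()),
        os.getpid(),
        next(_run_counter),
    )
    return ("= XTF TEST START %s =" % ident,
            "= XTF TEST END %s =" % ident)
```

The trailing " =" after the counter matters when stamps are matched by
substring: without it, the stamp for run=1 would also match run=10.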

> +    f = open(fn, "rb")
> +    output = f.readlines()
> +    f.close()
> +    lines = []
> +    found = False
> +    for line in output:
> +        if end_stamp in line:
> +            break
> +        if start_stamp in line:
> +            found = True
> +            continue
> +        if not found:
> +            continue
> +        lines.append(line)

This is accidentally quadratic in the number of test executions.
Do we care ?

If we do care, we should read the file backwards.  AFAICT "tac" can do
this in a manner that's not quadratic in the size of the logfile:

mariner:iwj> time tac v | head | sha256sum 
55748d8a2243c7a8978eccf24bb4f603091ecf4be836294e7009695426971485  -

real    0m0.019s
user    0m0.000s
sys     0m0.000s
mariner:iwj> time cat v | tail | tac | sha256sum 
55748d8a2243c7a8978eccf24bb4f603091ecf4be836294e7009695426971485  -

real    0m11.286s
user    0m0.296s
sys     0m1.096s
mariner:iwj> ll --hu v
-rw-rw-r-- 1 iwj iwj 638M Aug 11 11:42 v
mariner:iwj>

Or you could write your own implementation.  The algorithm is to read
increasingly large chunks of the tail of the file, into memory, until
you find all of the part you're looking for.
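
A sketch of that algorithm, under the assumption that we only need
everything after the last occurrence of a marker such as the start
stamp. The function name and the doubling chunk size are choices made
for this example:

```python
import os


def read_after_last_marker(path, marker, chunk=4096):
    """Return the bytes following the last occurrence of marker, or None.

    Reads increasingly large chunks from the tail of the file, so a match
    near the end never requires reading the whole log.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        want = chunk
        while True:
            start = max(0, size - want)
            f.seek(start)
            data = f.read(size - start)
            pos = data.rfind(marker)
            if pos != -1:
                return data[pos + len(marker):]
            if start == 0:
                # Whole file searched without finding the marker.
                return None
            # Doubling keeps the total amount read linear in the tail size.
            want *= 2
```

Because each pass re-reads the whole tail, a marker that straddles a
chunk boundary is still found on the next pass.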

HTH.

Ian.



Re: [Xen-devel] [PATCH] Remove ambiguities in the COPYING file; add CONTRIBUTING file

2016-08-11 Thread Lars Kurth


On 11/08/2016 10:49, "George Dunlap"  wrote:

>On 11/08/16 01:51, Stefano Stabellini wrote:
>> On Wed, 10 Aug 2016, Lars Kurth wrote:
>>> COPYING file:
>>> The motivation of this change is to make it easier for new
>>> contributors to conduct a license and patent review, WITHOUT
>>> changing any licenses.
>>> - Remove references to BSD-style licenses as we have more
>>>   common license exceptions and replace with "other license
>>>   stanzas"
>>> - List the most common situations under which code is licensed
>>>   under licenses other than GPLv2 (section "Licensing Exceptions")
>>> - List the most common non-GPLv2 licenses that are in use in
>>>   this repository based on a recent FOSSology scan (section
>>>   "Licensing Exceptions")
>>> - List other license related conventions within the project
>>>   to make it easier to conduct a license review.
>>> - Clarify the incoming license as its omission has confused
>>>   past contributors (section "Contributions")
>>>
>>> CONTRIBUTION file:
>>> The motivation of this file is to make it easier for contributors
>>> to find contribution related resources. Add information on existing
>>> license related conventions to avoid unintentional future licensing
>>> issues. Provide templates for copyright headers for the most commonly
>>> used licenses in this repository.
>>>
>>> Signed-off-by: Lars Kurth 
>>> ---
>>>  CONTRIBUTING | 210
>>>+++
>>>  COPYING  |  64 ++
>>>  2 files changed, 260 insertions(+), 14 deletions(-)
>>>  create mode 100644 CONTRIBUTING
>>>
>>> diff --git a/CONTRIBUTING b/CONTRIBUTING
>>> new file mode 100644
>>> index 000..7af13c4
>>> --- /dev/null
>>> +++ b/CONTRIBUTING
>>> @@ -0,0 +1,210 @@
>>> +
>>> +CONTRIBUTING
>>> +
>>> +
>>> +INBOUND LICENSE
>>> +---
>>> +
>>> +Contributions are governed by the license that applies to relevant
>>> +specific file or by the license specified in the COPYING file, that
>>  ^files
>
>I think "file" is better here, as the license is on a file-by-file
>basis, not on a whole contribution basis.

Agreed: licenses are per file. For files which don't have a license header
(of which we have many), the license is governed by the COPYING file.

> That is, if your contribution
>changes a BSD file and a GPLv2 file in a single series (or a single
>patch), then the changes to the BSD file are governed by the BSD
>licence, and the changes to the GPLv2 file are governed by the GPLv2.

Correct.

>> 
>> 
>>> +governs the license of its containing directory and its
>>>subdirectories.
>>> +
>>> +Most of the Xen Project code is licensed under GPLv2, but a number of
>>> +directories are primarily licensed under different licenses.
>> ^ I would remove "primarily" from this sentence
>
>"primarily licensed under different licenses" implies to me that most of
>the files in the directory are under a different license, but some may
>be licensed GPLv2.  Without the "primarily" I would take that to imply
>that *none* of the files are licensed GPLv2.

George is correct: I used "primarily" because there are hardly any
directories which are truly one license only. Almost all contain a
mixture of licenses. The only notable exception is xen/include/public,
which is all MIT.

>If there is at least one directory that has mostly non-GPLv2 files but
>at least one GPLv2 file, or we anticipate that such directories might
>exist in the future, I would leave the "primarily" in.  If there aren't
>any now and we don't expect any in the future, then yes it's unnecessary
>and should probably be removed.

There are loads: for example xen/include/acpi

Lars

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] domctl: relax getdomaininfo permissions

2016-08-11 Thread Andrew Cooper
On 08/08/16 07:12, Jan Beulich wrote:
>>>> On 05.08.16 at 19:07,  wrote:
>> On 05/08/16 14:54, Jan Beulich wrote:
>>>>>> On 05.08.16 at 15:10,  wrote:
>>>> On 05/08/16 12:20, Jan Beulich wrote:
>>>>> I wonder what good the duplication of the returned domain ID does: I'm
>>>>> tempted to remove the one in the command-specific structure. Does
>>>>> anyone have insight into why it was done that way?
>>>> I wonder whether the first incarnation of this hypercall lacked a domid
>>>> field in the returned structure?  It seems like the kind of thing which
>>>> would be omitted, until the sysctl list version got introduced.
>>> Oh, good point - that makes clear why the field can't be dropped:
>>> That sysctl would break then.
>> Which domid were you referring to then?
>>
>> The domid in the xen_domctl_getdomaininfo structure clearly needs to
>> stay, but the domctl "op->domain = op->u.getdomaininfo.domain;"
>> needn't.  OTOH, as we need to copy back the entire domctl structure
>> anyway, it doesn't hurt to keep it.
> The comment was about removal of the field, not just the
> assignment. But as you did make obvious, the sysctl side needs
> it to stay.
>
>>>>> --- a/xen/include/xsm/dummy.h
>>>>> +++ b/xen/include/xsm/dummy.h
>>>>> @@ -61,7 +61,12 @@ static always_inline int xsm_default_act
>>>>>  return 0;
>>>>>  case XSM_TARGET:
>>>>>  if ( src == target )
>>>>> +{
>>>>>  return 0;
>>>>> +case XSM_XS_PRIV:
>>>>> +if ( src->is_xenstore )
>>>>> +return 0;
>>>>> +}
>>>>>  /* fall through */
>>>>>  case XSM_DM_PRIV:
>>>>>  if ( target && src->target == target )
>>>>> @@ -71,10 +76,6 @@ static always_inline int xsm_default_act
>>>>>  if ( src->is_privileged )
>>>>>  return 0;
>>>>>  return -EPERM;
>>>>> -case XSM_XS_PRIV:
>>>>> -if ( src->is_xenstore || src->is_privileged )
>>>>> -return 0;
>>>>> -return -EPERM;
>>>>>  default:
>>>>>  LINKER_BUG_ON(1);
>>>>>  return -EPERM;
>>>> What is this change in relation to?  I can't see how it is related to
>>>> the XSM changes mentioned in the commit, as that is strictly for the use
>>>> of XSM_OTHER.
>>> I don't see any XSM changes mentioned in the description, there
>>> was only the XSM_OTHER related question outside the description.
>>> Anyway - the change above is what guarantees the XSM_XS_PRIV
>>> check, as invoked by xsm_domctl()'s XEN_DOMCTL_getdomaininfo
>>> case, to fall through into XSM_DM_PRIV - after all that's what the
>>> whole patch is about.
>> But the patch is about a qemu stubdom, which would be DM_PRIV, not XS_PRIV.
> The point of the patch is to _extend_ permissions of this domctl
> from XS_PRIV to DM_PRIV.

Aah - and this only exists because of the xsm_domctl() bodge with
XSM_OTHER, which actually makes getdomaininfo protected with XS_PRIV.
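
As an illustration only, the patched ladder can be modelled in a few
lines of Python.  This is a sketch, not the hypervisor code: the Domain
class and the string action names are hypothetical stand-ins, and the
XSM_HOOK and XSM_PRIV arms are assumed from the surrounding context of
dummy.h rather than shown in the hunk above.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Domain:
    """Hypothetical stand-in for struct domain (only the fields the check reads)."""
    is_privileged: bool = False
    is_xenstore: bool = False
    target: Optional["Domain"] = None


def xsm_default_action(action, src, target=None):
    """Model of the patched switch, with the C fall-through spelled out."""
    if action == "XSM_HOOK":  # assumed arm, not visible in the quoted hunk
        return 0
    if action not in ("XSM_TARGET", "XSM_XS_PRIV", "XSM_DM_PRIV", "XSM_PRIV"):
        raise ValueError("unknown action")  # the C code has LINKER_BUG_ON(1) here
    # case XSM_TARGET: a domain may always act on itself.
    if action == "XSM_TARGET" and src is target:
        return 0
    # case XSM_XS_PRIV: the label sits inside the braces, so only this
    # action (not XSM_TARGET) gets the xenstore check.
    if action == "XSM_XS_PRIV" and src.is_xenstore:
        return 0
    # Fall through to XSM_DM_PRIV: a device model may act on its target.
    if action != "XSM_PRIV" and target is not None and src.target is target:
        return 0
    # Fall through to XSM_PRIV (assumed arm): the control domain may do anything.
    return 0 if src.is_privileged else -errno.EPERM
```

In this model a stub domain whose target is the queried guest passes an
XSM_XS_PRIV check, which is the relaxation the patch is after.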

Reviewed-by: Andrew Cooper 



Re: [Xen-devel] [PATCH v2 1/3] livepach: Add .livepatch.hooks functions and test-case

2016-08-11 Thread Ian Jackson
Jan Beulich writes ("Re: [Xen-devel] [PATCH v2 1/3] livepach: Add 
.livepatch.hooks functions and test-case"):
> On 10.08.16 at 11:46,  wrote:
> > Odd. I've tried this simple example:
> > 
> > typedef int fn_t(void);
...
> > const fn_t**cfn;

Ie,
const int **cfn(void);

> > for(i = 0; !rc && i < ps->n; ++i)
> > rc = ps->cfn[i]();

From `(gcc-4)Function Attributes':

  `const'
   Many functions do not examine any values except their arguments,
   and have no effects except the return value.  Basically this is
   just slightly more strict class than the `pure' attribute below,
   since function is not allowed to read global memory.

   Note that a function that has pointer arguments and examines the
   data pointed to must _not_ be declared `const'.  Likewise, a
   function that calls a non-`const' function usually must not be
   `const'.  It does not make sense for a `const' function to return
   `void'.

   The attribute `const' is not implemented in GCC versions earlier
   than 2.5.  An alternative way to declare that a function has no
   side effects, which works in the current version and in some older
   versions, is as follows:

typedef int intfn ();

extern const intfn square;

   This approach does not work in GNU C++ from 2.6.0 on, since the
   language specifies that the `const' must be attached to the return
   value.

Ie, gcc has always treated a function marked const as having no
unexpected inputs and no side effects.

> > test1() and test2() get compiled identically. test3(), using the field
> > with the misplaced const, oddly enough gets compiled slightly
> > differently (and without a warning despite one would seem
> > warranted), yet the call doesn't get omitted. If, however, I change
> > the return type of fn_t to void, the function body of test3() ends
> > up empty, which is a compiler bug afaict, but which also suggests
> > that you've tried the variant with the misplaced const.

No, it is gcc realising that there is no point calling a function
which returns void and has no side effects.

TBH I am inclined to agree that gcc should issue a warning for
such a function.

> FTR: This is not a compiler bug, as specifically named undefined
> in the C spec.

In this case, the behaviour is defined by the GCC manual.

Ian.



Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Wei Liu
On Thu, Aug 11, 2016 at 11:49:29AM +0100, Ian Jackson wrote:
> Wei Liu writes ("[XTF PATCH 3/3] xtf-runner: support two modes for getting 
> output"):
> > We need two modes for getting output:
> ...
> > +# Use time to generate unique stamps
> > +start_stamp = "= XTF TEST START %s =" % local_time
> > +end_stamp = "= XTF TEST END %s =" % local_time
> 
> This will go wrong if someone runs the same test very rapidly in a
> loop.  It needs to be augmented, at least.
> 
> Ideally with the domid, but AFAICT that's not available here.  If you
> can't think of anything else, use, in addition to the timestamp, the
> pid of the xtf-runner process, plus a counter.  (The counter is
> necessary in case the same xtf-runner process is used to run the same
> test multiple times.)
> 
> > +f = open(fn, "rb")
> > +output = f.readlines()
> > +f.close()
> > +lines = []
> > +found = False
> > +for line in output:
> > +if end_stamp in line:
> > +break
> > +if start_stamp in line:
> > +found = True
> > +continue
> > +if not found:
> > +continue
> > +lines.append(line)
> 
> This is accidentally quadratic in the number of test executions.
> Do we care ?
> 

I don't think so. That's why I used this simple algorithm. I don't
expect the log files to even exceed a few MBs. We can optimise later if
there is a test that would generate a huge amount of output.

Anyway, I think the stamps and the algorithm issues will become moot if
we seek to the end of the file and read it after the test is finished as
Andrew suggested. What do you think of that?
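
A minimal sketch of the seek-to-end idea, assuming the runner can open
the console log itself.  run_and_capture() and the throwaway demo()
harness are made-up names for illustration, not part of xtf-runner.

```python
import os
import tempfile


def run_and_capture(log_path, run_test):
    """Return only the log lines written while run_test() was executing."""
    with open(log_path, "rb") as log:
        log.seek(0, os.SEEK_END)  # skip everything logged before the test
        run_test()
        return log.readlines()    # same FD, so this picks up where we stopped


def demo():
    """Exercise run_and_capture() against a temporary file."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"output of an earlier test\n")

        def fake_test():
            # Stand-in for a guest writing to its console log.
            with open(path, "ab") as f:
                f.write(b"output of this test\n")

        return run_and_capture(path, fake_test)
    finally:
        os.unlink(path)
```

No stamps are needed with this approach, and the cost is proportional
to the output of the one test rather than to the size of the whole log.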

Wei.



Re: [Xen-devel] [PATCH] Remove ambiguities in the COPYING file; add CONTRIBUTING file

2016-08-11 Thread Lars Kurth


On 11/08/2016 10:56, "George Dunlap"  wrote:

>> +GPL v2 License
>> +--
>> +
>> +/*
>> + * 
>> + *
>> + * 
>> + * 
>> + * Copyright (C)   
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public
>> + * License v2 as published by the Free Software Foundation.
>
>Should this line have "only" in it somewhere?

I don't believe this is necessary. I couldn't find a single instance in
our codebase which says "version 2 only".

Most files have the following:

 * This program is free software; you can redistribute it and/or modify it
 * under the terms and conditions of the GNU General Public License,
 * version 2, as published by the Free Software Foundation.


which spells out the version with commas and has slightly different line
breaks. I think it may be better to use that, as FOSSology creates less
noise with this variant of the header: it creates one GPLv2 entry, whereas
the previous one creates a GPL and a GPLv2 entry.

Regards
Lars



[Xen-devel] [PATCH v4 17/19] mini-os: add support for ballooning up

2016-08-11 Thread Juergen Gross
Add support for ballooning the domain up by a specified amount of
pages. The following steps are performed:

- extending the p2m map
- extending the page allocator's bitmap
- getting new memory pages from the hypervisor
- adding the memory at the current end of guest memory

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V3: change "if" to "while" in balloon_up() as requested by Samuel Thibault
---
 arch/arm/balloon.c |  9 ++
 arch/x86/balloon.c | 94 ++
 balloon.c  | 64 +
 include/balloon.h  |  5 +++
 mm.c   |  4 +++
 5 files changed, 176 insertions(+)

diff --git a/arch/arm/balloon.c b/arch/arm/balloon.c
index 549e51b..7f35328 100644
--- a/arch/arm/balloon.c
+++ b/arch/arm/balloon.c
@@ -27,4 +27,13 @@
 
 unsigned long virt_kernel_area_end;   /* TODO: find a virtual area */
 
+int arch_expand_p2m(unsigned long max_pfn)
+{
+return 0;
+}
+
+void arch_pfn_add(unsigned long pfn, unsigned long mfn)
+{
+}
+
 #endif
diff --git a/arch/x86/balloon.c b/arch/x86/balloon.c
index a7f20e4..42389e4 100644
--- a/arch/x86/balloon.c
+++ b/arch/x86/balloon.c
@@ -23,6 +23,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -30,9 +31,36 @@
 
 unsigned long virt_kernel_area_end = VIRT_KERNEL_AREA;
 
+static void p2m_invalidate(unsigned long *list, unsigned long start_idx)
+{
+unsigned long idx;
+
+for ( idx = start_idx; idx < P2M_ENTRIES; idx++ )
+list[idx] = INVALID_P2M_ENTRY;
+}
+
+static inline unsigned long *p2m_l3list(void)
+{
+return mfn_to_virt(HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list);
+}
+
+static inline unsigned long *p2m_to_virt(unsigned long p2m)
+{
+return ( p2m == INVALID_P2M_ENTRY ) ? NULL : mfn_to_virt(p2m);
+}
+
 void arch_remap_p2m(unsigned long max_pfn)
 {
 unsigned long pfn;
+unsigned long *l3_list, *l2_list, *l1_list;
+
+l3_list = p2m_l3list();
+l2_list = p2m_to_virt(l3_list[L3_P2M_IDX(max_pfn - 1)]);
+l1_list = p2m_to_virt(l2_list[L2_P2M_IDX(max_pfn - 1)]);
+
+p2m_invalidate(l3_list, L3_P2M_IDX(max_pfn - 1) + 1);
+p2m_invalidate(l2_list, L2_P2M_IDX(max_pfn - 1) + 1);
+p2m_invalidate(l1_list, L1_P2M_IDX(max_pfn - 1) + 1);
 
 if ( p2m_pages(nr_max_pages) <= p2m_pages(max_pfn) )
 return;
@@ -50,4 +78,70 @@ void arch_remap_p2m(unsigned long max_pfn)
 ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
 }
 
+int arch_expand_p2m(unsigned long max_pfn)
+{
+unsigned long pfn;
+unsigned long *l1_list, *l2_list, *l3_list;
+
+p2m_chk_pfn(max_pfn - 1);
+l3_list = p2m_l3list();
+
+for ( pfn = (HYPERVISOR_shared_info->arch.max_pfn + P2M_MASK) & ~P2M_MASK;
+  pfn < max_pfn; pfn += P2M_ENTRIES )
+{
+l2_list = p2m_to_virt(l3_list[L3_P2M_IDX(pfn)]);
+if ( !l2_list )
+{
+l2_list = (unsigned long*)alloc_page();
+if ( !l2_list )
+return -ENOMEM;
+p2m_invalidate(l2_list, 0);
+l3_list[L3_P2M_IDX(pfn)] = virt_to_mfn(l2_list);
+}
+l1_list = p2m_to_virt(l2_list[L2_P2M_IDX(pfn)]);
+if ( !l1_list )
+{
+l1_list = (unsigned long*)alloc_page();
+if ( !l1_list )
+return -ENOMEM;
+p2m_invalidate(l1_list, 0);
+l2_list[L2_P2M_IDX(pfn)] = virt_to_mfn(l1_list);
+
+if ( map_frame_rw((unsigned long)(phys_to_machine_mapping + pfn),
+  l2_list[L2_P2M_IDX(pfn)]) )
+return -ENOMEM;
+}
+}
+
+HYPERVISOR_shared_info->arch.max_pfn = max_pfn;
+
+/* Make sure the new last page can be mapped. */
+if ( !need_pgt((unsigned long)pfn_to_virt(max_pfn - 1)) )
+return -ENOMEM;
+
+return 0;
+}
+
+void arch_pfn_add(unsigned long pfn, unsigned long mfn)
+{
+mmu_update_t mmu_updates[1];
+pgentry_t *pgt;
+int rc;
+
+phys_to_machine_mapping[pfn] = mfn;
+
+pgt = need_pgt((unsigned long)pfn_to_virt(pfn));
+ASSERT(pgt);
+mmu_updates[0].ptr = virt_to_mach(pgt) | MMU_NORMAL_PT_UPDATE;
+mmu_updates[0].val = (pgentry_t)(mfn << PAGE_SHIFT) |
+ _PAGE_PRESENT | _PAGE_RW;
+rc = HYPERVISOR_mmu_update(mmu_updates, 1, NULL, DOMID_SELF);
+if ( rc < 0 )
+{
+printk("ERROR: build_pagetable(): PTE could not be updated\n");
+printk("   mmu_update failed with rc=%d\n", rc);
+do_exit();
+}
+}
+
 #endif
diff --git a/balloon.c b/balloon.c
index 0a3342c..e1af778 100644
--- a/balloon.c
+++ b/balloon.c
@@ -23,11 +23,13 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 
 unsigned long nr_max_pages;
+unsigned long nr_mem_pages;
 
 void get_max_pages(void)
 {
@@ -62,3 +64,65 @@ void mm_alloc_bitmap_remap(void)
 virt_kernel_area_end += round_pgup((nr_max_pages + 1) >> (PAGE_SHIFT + 3));
 ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
 }
+
+

[Xen-devel] [PATCH v4 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Juergen Gross
In case of CONFIG_BALLOON the page allocator's bitmap needs some space
to be able to grow. Remap it to kernel virtual area if the preallocated
area isn't large enough.

Signed-off-by: Juergen Gross 
---
V4: - mm_bitmap* -> mm_alloc_bitmap* as requested by Samuel Thibault

V3: - add assertion as requested by Samuel Thibault
- rename functions to have mm_ prefix as requested by Samuel Thibault
---
 balloon.c | 18 ++
 include/balloon.h |  2 ++
 include/mm.h  |  6 ++
 mm.c  | 44 +++-
 4 files changed, 49 insertions(+), 21 deletions(-)

diff --git a/balloon.c b/balloon.c
index 1ec113d..0a3342c 100644
--- a/balloon.c
+++ b/balloon.c
@@ -44,3 +44,21 @@ void get_max_pages(void)
 nr_max_pages = ret;
 printk("Maximum memory size: %ld pages\n", nr_max_pages);
 }
+
+void mm_alloc_bitmap_remap(void)
+{
+unsigned long i;
+
+if ( mm_alloc_bitmap_size >= ((nr_max_pages + 1) >> (PAGE_SHIFT + 3)) )
+return;
+
+for ( i = 0; i < mm_alloc_bitmap_size; i += PAGE_SIZE )
+{
+map_frame_rw(virt_kernel_area_end + i,
+ virt_to_mfn((unsigned long)(mm_alloc_bitmap) + i));
+}
+
+mm_alloc_bitmap = (unsigned long *)virt_kernel_area_end;
+virt_kernel_area_end += round_pgup((nr_max_pages + 1) >> (PAGE_SHIFT + 3));
+ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
+}
diff --git a/include/balloon.h b/include/balloon.h
index b0d0ebf..9154f44 100644
--- a/include/balloon.h
+++ b/include/balloon.h
@@ -31,11 +31,13 @@ extern unsigned long virt_kernel_area_end;
 
 void get_max_pages(void);
 void arch_remap_p2m(unsigned long max_pfn);
+void mm_alloc_bitmap_remap(void);
 
 #else /* CONFIG_BALLOON */
 
 static inline void get_max_pages(void) { }
 static inline void arch_remap_p2m(unsigned long max_pfn) { }
+static inline void mm_alloc_bitmap_remap(void) { }
 
 #endif /* CONFIG_BALLOON */
 #endif /* _BALLOON_H_ */
diff --git a/include/mm.h b/include/mm.h
index 6add683..fc3128b 100644
--- a/include/mm.h
+++ b/include/mm.h
@@ -42,8 +42,14 @@
 #define STACK_SIZE_PAGE_ORDER __STACK_SIZE_PAGE_ORDER
 #define STACK_SIZE __STACK_SIZE
 
+#define round_pgdown(_p)  ((_p) & PAGE_MASK)
+#define round_pgup(_p)(((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)
+
 extern unsigned long nr_free_pages;
 
+extern unsigned long *mm_alloc_bitmap;
+extern unsigned long mm_alloc_bitmap_size;
+
 void init_mm(void);
 unsigned long alloc_pages(int order);
 #define alloc_page()alloc_pages(0)
diff --git a/mm.c b/mm.c
index 707a3e0..9e3a479 100644
--- a/mm.c
+++ b/mm.c
@@ -48,11 +48,14 @@
  *  One bit per page of memory. Bit set => page is allocated.
  */
 
-static unsigned long *alloc_bitmap;
+unsigned long *mm_alloc_bitmap;
+unsigned long mm_alloc_bitmap_size;
+
 #define PAGES_PER_MAPWORD (sizeof(unsigned long) * 8)
 
 #define allocated_in_map(_pn) \
-(alloc_bitmap[(_pn)/PAGES_PER_MAPWORD] & (1UL<<((_pn)&(PAGES_PER_MAPWORD-1))))
+(mm_alloc_bitmap[(_pn) / PAGES_PER_MAPWORD] & \
+ (1UL << ((_pn) & (PAGES_PER_MAPWORD - 1))))
 
 unsigned long nr_free_pages;
 
@@ -61,8 +64,8 @@ unsigned long nr_free_pages;
  *  -(1<= n. 
  *  (1> (PAGE_SHIFT+3);
-bitmap_size  = round_pgup(bitmap_size);
-alloc_bitmap = (unsigned long *)to_virt(min);
-min += bitmap_size;
+mm_alloc_bitmap_size  = (max + 1) >> (PAGE_SHIFT + 3);
+mm_alloc_bitmap_size  = round_pgup(mm_alloc_bitmap_size);
+mm_alloc_bitmap = (unsigned long *)to_virt(min);
+min += mm_alloc_bitmap_size;
 range= max - min;
 
 /* All allocated by default. */
-memset(alloc_bitmap, ~0, bitmap_size);
+memset(mm_alloc_bitmap, ~0, mm_alloc_bitmap_size);
 /* Free up the memory we've been given to play with. */
 map_free(PHYS_PFN(min), range>>PAGE_SHIFT);
 
@@ -198,6 +198,8 @@ static void init_page_allocator(unsigned long min, unsigned long max)
 free_head[i]= ch;
 ct->level   = i;
 }
+
+mm_alloc_bitmap_remap();
 }
 
 
-- 
2.6.6




Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Ian Jackson
Andrew Cooper writes ("Re: [XTF PATCH 3/3] xtf-runner: support two modes for 
getting output"):
> Can't you just open the log file as read, seek to the end, run the test
> and read again from the same FD?  It would be rather more simple than
> marking the logs.

This is a good suggestion, I think.  Much better than mine in my other
email.

> Sadly, whatever method we use here is going to have to be clever enough
> to cope with the log files being rotated, and I can't think of a clever
> way of doing that ATM.

In an automatic test system the logfiles won't get big enough to be
rotated.  In a manual setup this is just a "random hazard" which you
seem to be prepared to accept...

But: it is easy for the runner to see if the logfile was rotated.  It
can open the logfile a second time and check if it has the same inum.
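
A rough sketch of that check, for illustration.  It compares the
(device, inode) pair of the already open file with what the path
currently names; it uses os.stat() on the path instead of a second
open(), which is a small deviation from the wording above.
log_was_rotated() and demo() are hypothetical names.

```python
import os
import tempfile


def log_was_rotated(open_log, log_path):
    """True if log_path no longer names the file behind open_log."""
    held = os.fstat(open_log.fileno())
    try:
        current = os.stat(log_path)
    except FileNotFoundError:
        return True  # rotated away and not yet recreated
    return (held.st_dev, held.st_ino) != (current.st_dev, current.st_ino)


def demo():
    """Simulate a rename-style rotation while the log is held open."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "guest.log")
        open(path, "wb").close()
        with open(path, "rb") as log:
            before = log_was_rotated(log, path)
            os.rename(path, path + ".1")  # what a rotator typically does
            open(path, "wb").close()      # fresh log under the old name
            after = log_was_rotated(log, path)
        return before, after
```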

Ian.



Re: [Xen-devel] [PATCH] libxc: use DPRINTF in xc_domain_dumpcore_via_callback

2016-08-11 Thread Ian Jackson
Wei Liu writes ("[PATCH] libxc: use DPRINTF in 
xc_domain_dumpcore_via_callback"):
> That line doesn't reveal much information to ordinary users.
> 
> Change that to debug output.

I think you are right.

Acked-by: Ian Jackson 



Re: [Xen-devel] [PATCH v2 1/3] livepach: Add .livepatch.hooks functions and test-case

2016-08-11 Thread Jan Beulich
>>> On 11.08.16 at 12:56,  wrote:
> Jan Beulich writes ("Re: [Xen-devel] [PATCH v2 1/3] livepach: Add 
> .livepatch.hooks functions and test-case"):
>> On 10.08.16 at 11:46,  wrote:
>> > Odd. I've tried this simple example:
>> > 
>> > typedef int fn_t(void);
> ...
>> >const fn_t**cfn;
> 
> Ie,
> const int **cfn(void);
> 
>> >for(i = 0; !rc && i < ps->n; ++i)
>> >rc = ps->cfn[i]();
> 
> From `(gcc-4)Function Attributes':
> 
>   `const'
>Many functions do not examine any values except their arguments,
>and have no effects except the return value.  Basically this is
>just slightly more strict class than the `pure' attribute below,
>since function is not allowed to read global memory.
> 
>Note that a function that has pointer arguments and examines the
>data pointed to must _not_ be declared `const'.  Likewise, a
>function that calls a non-`const' function usually must not be
>`const'.  It does not make sense for a `const' function to return
>`void'.
> 
>The attribute `const' is not implemented in GCC versions earlier
>than 2.5.  An alternative way to declare that a function has no
>side effects, which works in the current version and in some older
>versions, is as follows:
> 
> typedef int intfn ();
> 
> extern const intfn square;
> 
>This approach does not work in GNU C++ from 2.6.0 on, since the
>language specifies that the `const' must be attached to the return
>value.
> 
> Ie, gcc has always treated a function marked const as having no
> unexpected inputs and no side effects.

Oh, I've always assumed that would be __attribute__((const))
only, but what you quote above proves me wrong.

Jan




Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Wei Liu
On Thu, Aug 11, 2016 at 12:08:03PM +0100, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [XTF PATCH 3/3] xtf-runner: support two modes for 
> getting output"):
> > Can't you just open the log file as read, seek to the end, run the test
> > and read again from the same FD?  It would be rather more simple than
> > marking the logs.
> 
> This is a good suggestion, I think.  Much better than mine in my other
> email.
> 
> > Sadly, whatever method we use here is going to have to be clever enough
> > to cope with the log files being rotated, and I can't think of a clever
> > way of doing that ATM.
> 
> In an automatic test system the logfiles won't get big enough to be
> rotated.  In a manual setup this is just a "random hazard" which you
> seem to be prepared to accept...
> 

That's what I thought as well.

> But: it is easy for the runner to see if the logfile was rotated.  It
> can open the logfile a second time and check if it has the same inum.
> 

For now let's keep things simple. Doesn't seem to buy us much even if we
implement this.

Wei.

> Ian.



Re: [Xen-devel] xen does not support the 8G large bar

2016-08-11 Thread Konrad Rzeszutek Wilk
On August 10, 2016 11:30:37 PM EDT, "Gaofeng (GaoFeng, Euler)" 
 wrote:
>Hi George,
>
>I found that you have submitted a patch about "libxl,hvmloader: Don't
>relocate memory for MMIO hole". So, I have a "8G large bar PCI
>device(NVIDIA Tesla M60)" passthrough question to ask you.
>
>
>
>Host passthrough PCI Device info:
>lspci -vs 06:00.0
>06:00.0 3D controller: NVIDIA Corporation Device 13f2 (rev a1)
>Subsystem: NVIDIA Corporation Device 115e
>Flags: fast devsel, IRQ 40
>Memory at 9300 (32-bit, non-prefetchable) [disabled] [size=16M]
>  Memory at 23c (64-bit, prefetchable) [disabled] [size=8G]
> Memory at 23e (64-bit, prefetchable) [disabled] [size=32M]
>Capabilities: [60] Power Management version 3
>Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
>Capabilities: [78] Express Endpoint, MSI 00
>Capabilities: [100] Virtual Channel
>Capabilities: [250] Latency Tolerance Reporting
>Capabilities: [258] #1e
>Capabilities: [128] Power Budgeting 
>Capabilities: [420] Advanced Error Reporting
>Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024
>
>Capabilities: [900] #19
>Kernel driver in use: pciback
>Kernel modules: nvidiafb
>
>Guest passthrough PCI Device info:
>lspci -vs 00:05.0
>00:05.0 3D controller: NVIDIA Corporation Device 13f2 (rev a1)
>Subsystem: NVIDIA Corporation Device 115e
>Physical Slot: 5
>Flags: fast devsel, IRQ 36
>Memory at f500 (32-bit, non-prefetchable) [size=16M]
>Memory at 2 (64-bit, prefetchable) [size=4G]
>Memory at f200 (64-bit, prefetchable) [size=32M]
>Capabilities: [60] Power Management version 3
>Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
>Capabilities: [78] Express Endpoint, MSI 00
>Kernel modules: nouveau, nvidiafb
>
>
>passthrough to the guest, the large bar size only has 4G. so, the
>passthrough pci device does not work, windows 7/2008 R2 64bit vm has
>the same problem.
>
>.
>
>According to the analysis hvmloader code, find a problem:
>
>
>
>if (is_64bar) {
>
>bar_data_upper = pci_readl(devfn, bar_reg + 4);
>
>pci_writel(devfn, bar_reg + 4, ~0);
>
>bar_sz_upper = pci_readl(devfn, bar_reg + 4);
>
>pci_writel(devfn, bar_reg + 4, bar_data_upper);
>
>bar_sz = (bar_sz_upper << 32) | bar_sz;
>
>}
>
>bar_sz &= ~(bar_sz - 1);
>
>
>
>read from the pci device, bar_sz_upper is 0x, if the bar size
>is 8G, the bar_sz_upper should be 0xfffe.
>
>Have you ever encountered a similar problem?
>
>

Yes, but only if you are using qemu-trad.

You need to have

https://github.com/xenserver/qemu-trad.pg/blob/master/master/0001-Add-64-bit-support-to-QEMU.patch
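
For reference, the sizing arithmetic from the quoted hvmloader snippet
can be modelled as below.  This is only a sketch: bar_size() is a
made-up helper, the flag mask is the standard low-four-bits mask for
memory BARs and is not shown in the quoted code, and the read-back
values in the usage note are illustrative assumptions following the
standard PCI sizing procedure, not values taken from this report.

```python
FLAG_MASK = 0xF  # assumed: low four bits of a memory BAR are type/prefetch flags


def bar_size(lower_readback, upper_readback=0):
    """Size in bytes, given what both BAR halves read back after writing ~0."""
    raw = ((upper_readback << 32) | lower_readback) & 0xFFFFFFFFFFFFFFFF
    raw &= ~FLAG_MASK
    # Lowest set bit; equivalent to the C idiom bar_sz &= ~(bar_sz - 1).
    return raw & -raw
```

With these assumptions, bar_size(0xC, 0xFFFFFFFE) gives 8 GiB, while an
upper half that reads back as all ones, bar_size(0xC, 0xFFFFFFFF), gives
4 GiB, which would match the size the guest reports.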

>
>Env info:
>
>device_model_version: qemu-xen
>
>xen version: xen-4.6.1
>
>Guest: RedHat-6.4-64
>
>Passthrough PCI Device: NVIDIA Tesla M60(compute mode)
>
>CPU: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
>
>
>
>Thanks
>
>Feng
>
>
>
>


Re: [Xen-devel] [PATCH v2] domctl: relax getdomaininfo permissions

2016-08-11 Thread Jan Beulich
>>> On 05.08.16 at 13:20,  wrote:

Daniel,

I've only now realized that I forgot to Cc you on this v2.

Jan

> Qemu needs access to this for the domain it controls, both due to it
> being used by xc_domain_memory_mapping() (which qemu calls) and the
> explicit use in hw/xenpv/xen_domainbuild.c:xen_domain_poll(). Extend
> permissions to that of any "ordinary" domctl: A domain controlling the
> targeted domain can invoke this operation for that target domain (which
> is being achieved by no longer passing NULL to xsm_domctl()).
> 
> This at once avoids a for_each_domain() loop when the ID of an
> existing domain gets passed in.
> 
> Reported-by: Marek Marczykowski-Górecki 
> Signed-off-by: Jan Beulich 
> ---
> v2: Add a comment. Clarify description as to what additional permission
> is being granted.
> ---
> I know there had been an alternative patch suggestion, but that one
> doesn't seem to have seen a formal submission so far, so here is my
> original proposal.
> 
> I wonder what good the duplication of the returned domain ID does: I'm
> tempted to remove the one in the command-specific structure. Does
> anyone have insight into why it was done that way?
> 
> I further wonder why we have XSM_OTHER: The respective conversion into
> other XSM_* values in xsm/dummy.h could as well move into the callers,
> making intentions more obvious when looking at the actual code.
> 
> --- a/tools/flask/policy/modules/xen.if
> +++ b/tools/flask/policy/modules/xen.if
> @@ -149,7 +149,7 @@ define(`device_model', `
>   create_channel($2, $1, $2_channel)
>   allow $1 $2_channel:event create;
>  
> - allow $1 $2_target:domain shutdown;
> + allow $1 $2_target:domain { getdomaininfo shutdown };
>   allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack };
>   allow $1 $2_target:hvm { getparam setparam trackdirtyvram hvmctl irqlevel pciroute pcilevel cacheattr send_irq };
>  ')
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -396,14 +396,13 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xe
>  switch ( op->cmd )
>  {
>  case XEN_DOMCTL_createdomain:
> -case XEN_DOMCTL_getdomaininfo:
>  case XEN_DOMCTL_test_assign_device:
>  case XEN_DOMCTL_gdbsx_guestmemio:
>  d = NULL;
>  break;
>  default:
>  d = rcu_lock_domain_by_id(op->domain);
> -if ( d == NULL )
> +if ( !d && op->cmd != XEN_DOMCTL_getdomaininfo )
>  return -ESRCH;
>  }
>  
> @@ -817,14 +816,22 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xe
>  
>  case XEN_DOMCTL_getdomaininfo:
>  {
> -domid_t dom = op->domain;
> -
> -rcu_read_lock(&domlist_read_lock);
> +domid_t dom = DOMID_INVALID;
>  
> -for_each_domain ( d )
> -if ( d->domain_id >= dom )
> +if ( !d )
> +{
> +ret = -EINVAL;
> +if ( op->domain >= DOMID_FIRST_RESERVED )
>  break;
>  
> +rcu_read_lock(&domlist_read_lock);
> +
> +dom = op->domain;
> +for_each_domain ( d )
> +if ( d->domain_id >= dom )
> +break;
> +}
> +
>  ret = -ESRCH;
>  if ( d == NULL )
>  goto getdomaininfo_out;
> @@ -839,6 +846,10 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xe
>  copyback = 1;
>  
>  getdomaininfo_out:
> +/* When d was non-NULL upon entry, no cleanup is needed. */
> +if ( dom == DOMID_INVALID )
> +break;
> +
>  rcu_read_unlock(&domlist_read_lock);
>  d = NULL;
>  break;
> --- a/xen/include/xsm/dummy.h
> +++ b/xen/include/xsm/dummy.h
> @@ -61,7 +61,12 @@ static always_inline int xsm_default_act
>  return 0;
>  case XSM_TARGET:
>  if ( src == target )
> +{
>  return 0;
> +case XSM_XS_PRIV:
> +if ( src->is_xenstore )
> +return 0;
> +}
>  /* fall through */
>  case XSM_DM_PRIV:
>  if ( target && src->target == target )
> @@ -71,10 +76,6 @@ static always_inline int xsm_default_act
>  if ( src->is_privileged )
>  return 0;
>  return -EPERM;
> -case XSM_XS_PRIV:
> -if ( src->is_xenstore || src->is_privileged )
> -return 0;
> -return -EPERM;
>  default:
>  LINKER_BUG_ON(1);
>  return -EPERM;





[Xen-devel] Scheduler regression in 4.7

2016-08-11 Thread Andrew Cooper
Hello,

XenServer testing has discovered a regression from recent changes in
staging-4.7.

The actual cause is _csched_cpu_pick() falling over LIST_POISON, which
happened to occur at the same time as a domain was shutting down.  The
instruction in question is `mov 0x10(%rax),%rax` which looks like
reverse list traversal.
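
As a side note on why this surfaces as a general protection fault
rather than a page fault: assuming 48-bit virtual addresses, the rax
value in the register dump below is not a canonical x86-64 address.  A
quick sketch of that check (is_canonical() is an illustrative helper,
not Xen code):

```python
def is_canonical(addr, va_bits=48):
    """True if bits va_bits-1 through 63 of a 64-bit address are all equal."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# rax from the dump plus the 0x10 displacement of the faulting mov.
poisoned = 0x0200200200200200 + 0x10
```

is_canonical(poisoned) is False, so the load faults before any page
table walk takes place.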

The regression is across the changes

xen-4.7/xen$ git lg d37c2b9^..f2160ba
* f2160ba - x86/mmcfg: Fix initalisation of variables in pci_mmcfg_nvidia_mcp55() (6 days ago)
* 471a151 - xen: Remove buggy initial placement algorithm (6 days ago)
* c732d3c - xen: Have schedulers revise initial placement (6 days ago)
* d37c2b9 - x86/EFI + Live Patch: avoid symbol address truncation (6 days ago)

and is almost certainly c732d3c.

The log is below, although, being from a non-debug build, it has mostly
stack rubble in the stack trace.

~Andrew

(XEN) [ 3315.431878] [ Xen-4.7.0-xs127546  x86_64  debug=n  Not
tainted ]
(XEN) [ 3315.431884] CPU:3
(XEN) [ 3315.431888] RIP:e008:[]
sched_credit.c#_csched_cpu_pick+0x1af/0x549
(XEN) [ 3315.431900] RFLAGS: 00010206   CONTEXT: hypervisor (d0v6)
(XEN) [ 3315.431907] rax: 0200200200200200   rbx: 0006  
rcx: 0006
(XEN) [ 3315.431914] rdx: 003fbfc42580   rsi: 82d0802df3a0  
rdi: 83102dba7c78
(XEN) [ 3315.431919] rbp: 83102dba7d28   rsp: 83102dba7bb8  
r8:  0001
(XEN) [ 3315.431924] r9:  0001   r10: 82d080317528  
r11: 
(XEN) [ 3315.431930] r12: 831108d7a000   r13: 0040  
r14: 83110889e980
(XEN) [ 3315.431934] r15:    cr0: 80050033  
cr4: 000426e0
(XEN) [ 3315.431939] cr3: 00202036a000   cr2: 88013dc783d8
(XEN) [ 3315.431944] ds:    es:    fs:    gs:    ss:
e010   cs: e008
(XEN) [ 3315.431952] Xen code around 
(sched_credit.c#_csched_cpu_pick+0x1af/0x549):
(XEN) [ 3315.431956]  18 48 8b 00 48 8b 40 28 <48> 8b 40 10 66 81 38 ff
7f 75 07 0f ab 9d 50 ff
(XEN) [ 3315.431973] Xen stack trace from rsp=83102dba7bb8:
(XEN) [ 3315.431976]00012dba7c78 83102db9d9c0
82d0802da560 00010002
(XEN) [ 3315.431984]8300bdb7e000 82d080121afd
82d08035ebb0 82d08035eba8
(XEN) [ 3315.431992]ff00 00010028
0011088c4001 0001
(XEN) [ 3315.431999]83102dba7c38 
 
(XEN) [ 3315.432005] 0003
83102dba7c98 0206
(XEN) [ 3315.432011]0292 00fb2dba7c78
0206 82d08032ab78
(XEN) [ 3315.432018]fffddfb7 82d080121a1c
83102dba7ca8 2dba7ca8
(XEN) [ 3315.432025]ff00 8328
83102dba7ce8 82d08013dc34
(XEN) [ 3315.432032] 0010
0048 0048
(XEN) [ 3315.432038]0001 83110889e8c0
83102dba7d38 82d08013dff0
(XEN) [ 3315.432045]83102dba7d28 8300bdb7e000
831108d7a000 0040
(XEN) [ 3315.432053]83110889e980 83110889e8c0
83102dba7d38 82d080129804
(XEN) [ 3315.432060]83102dba7d78 82d080129833
83102dba7d98 8300bdb7e000
(XEN) [ 3315.432068]831108d7a000 0040
0001 83110889e8c0
(XEN) [ 3315.432074]83102dba7db8 82d08012f930
0006 8300bdb7e000
(XEN) [ 3315.432081]831108d7a000 001b
0006 0020
(XEN) [ 3315.432087]83102dba7de8 82d080107847
831108d7a000 0006
(XEN) [ 3315.432095]7f9f7007b004 83102db9d9c0
83102dba7f08 82d08010537c
(XEN) [ 3315.432102]8300bd8fd000 07ff8300
001b 001b
(XEN) [ 3315.432109]8310031540c0 0003
83102dba7e48 83103ffe37c0
(XEN) [ 3315.432116] Xen call trace:
(XEN) [ 3315.432122][]
sched_credit.c#_csched_cpu_pick+0x1af/0x549
(XEN) [ 3315.432129][]
page_alloc.c#alloc_heap_pages+0x604/0x6d7
(XEN) [ 3315.432135][]
page_alloc.c#alloc_heap_pages+0x523/0x6d7
(XEN) [ 3315.432141][] xmem_pool_alloc+0x43f/0x46d
(XEN) [ 3315.432147][] _xmalloc+0xcb/0x1fc
(XEN) [ 3315.432153][]
sched_credit.c#csched_cpu_pick+0x1b/0x1d
(XEN) [ 3315.432160][]
sched_credit.c#csched_vcpu_insert+0x2d/0x14f
(XEN) [ 3315.432166][] sched_init_vcpu+0x24e/0x2ec
(XEN) [ 3315.432173][] alloc_vcpu+0x1d1/0x2ca
(XEN) [ 3315.432178][] do_domctl+0x98f/0x1de3
(XEN) [ 3315.432189][] lstar_enter+0x9b/0xa0
(XEN) [ 3315.432192]
(XEN) [ 3317.105524]
(XEN) [ 3317.114726] 
(XEN) [ 3317.139954] Panic on CPU 3:
(XEN) [ 3317.155197] GENERAL PROTECTION FAULT
(XEN) [ 3317.174247] [error_code=]
(XEN) [ 3317.190248] 
(XEN) [ 3317.215469]
(XEN) [ 3317.224674] Reboot in five seconds...
(XEN) [ 3317.243913] Executing kexec i

Re: [Xen-devel] [XTF PATCH 3/3] xtf-runner: support two modes for getting output

2016-08-11 Thread Andrew Cooper
On 11/08/16 12:19, Wei Liu wrote:
> On Thu, Aug 11, 2016 at 12:08:03PM +0100, Ian Jackson wrote:
>> Andrew Cooper writes ("Re: [XTF PATCH 3/3] xtf-runner: support two modes for 
>> getting output"):
>>> Can't you just open the log file as read, seek to the end, run the test
>>> and read again from the same FD?  It would be rather more simple than
>>> marking the logs.
>> This is a good suggestion, I think.  Much better than mine in my other
>> email.
>>
>>> Sadly, whatever method we use here is going to have to be clever enough
>>> to cope with the log files being rotated, and I can't think of a clever
>>> way of doing that ATM.
>> In an automatic test system the logfiles won't get big enough to be
>> rotated.  In a manual setup this is just a "random hazard" which you
>> seem to be prepared to accept...
>>
> That's what I thought as well.
>
>> But: it is easy for the runner to see if the logfile was rotated.  It
>> can open the logfile a second time and check if it has the same inum.
>>
> For now let's keep things simple. Doesn't seem to buy us much even if we
> implement this.

I expect anyone developing tests by hand to be using upstream, and
therefore the non-logfile mode.

It is worth a warning in the code about logfile rotation, but let's not
waste time now on a feature people will hopefully never need.

~Andrew
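
The approach discussed in this thread (open the log, seek to the end, run the test, read from the same descriptor, and compare inode numbers to notice rotation) could be sketched roughly as below. This is only an illustration in C using POSIX calls; xtf-runner itself is not written this way, and the helper names log_mark, log_read_new and log_was_rotated are hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Open the log read-only and seek to its current end. Returns fd or -1. */
static int log_mark(const char *path)
{
    int fd = open(path, O_RDONLY);

    if ( fd < 0 )
        return -1;
    if ( lseek(fd, 0, SEEK_END) < 0 )
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read what was appended since log_mark(). Returns bytes read or -1. */
static ssize_t log_read_new(int fd, char *buf, size_t len)
{
    size_t done = 0;
    ssize_t n = 0;

    while ( done < len && (n = read(fd, buf + done, len - done)) > 0 )
        done += (size_t)n;
    return n < 0 ? -1 : (ssize_t)done;
}

/*
 * Compare the file behind fd with whatever path names now.
 * Returns 1 if they differ (log rotated), 0 if same, -1 on error.
 */
static int log_was_rotated(int fd, const char *path)
{
    struct stat a, b;

    if ( fstat(fd, &a) < 0 || stat(path, &b) < 0 )
        return -1;
    return a.st_ino != b.st_ino || a.st_dev != b.st_dev;
}
```

A caller would take the mark before starting the test, read after it finishes, and at most warn when log_was_rotated() reports a change. Inode numbers can be reused once the old file is deleted, so the check is a hint rather than a guarantee.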

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-smoke test] 100412: tolerable all pass - PUSHED

2016-08-11 Thread osstest service owner
flight 100412 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/100412/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl      13 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  e64f22df3c1a0b70031ad6e8c0fc875d27cc5c3c
baseline version:
 xen  6480cc6280e955d1245d8dfb2456d2b830240c74

Last test of basis   100396  2016-08-10 16:02:02 Z    0 days
Testing same since   100412  2016-08-11 10:03:45 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Kevin Tian 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=e64f22df3c1a0b70031ad6e8c0fc875d27cc5c3c
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
e64f22df3c1a0b70031ad6e8c0fc875d27cc5c3c
+ branch=xen-unstable-smoke
+ revision=e64f22df3c1a0b70031ad6e8c0fc875d27cc5c3c
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.7-testing
+ '[' xe64f22df3c1a0b70031ad6e8c0fc875d27cc5c3c = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://cache:94

[Xen-devel] [PATCH 0/7] x86emul: misc small adjustments

2016-08-11 Thread Jan Beulich
This is mainly to reduce the amount of fetching of insn bytes past
the initial phase of decoding, as a little bit of preparation for the
intended splitting of decoding and execution stages. Having fully
read all insn bytes by that point would be specifically required if
the first stage alone should become usable to simply size an
instruction.

There are, however, other unrelated improvements, whose common
attribute is the attempt to avoid open-coding, in the handling of
specific instructions, what generic code can do.

1: don't special case fetching the immediate of PUSH
2: don't special case fetching immediates of near and short branches
3: x86emul: all push flavors are data moves
4: fold SrcImmByte fetching
5: don't special case fetching unsigned 8-bit immediates
6: use DstEax where possible
7: introduce SrcImm16

Signed-off-by: Jan Beulich 




Re: [Xen-devel] [PATCH v4 16/19] mini-os: map page allocator's bitmap to virtual kernel area for ballooning

2016-08-11 Thread Samuel Thibault
Juergen Gross, on Thu 11 Aug 2016 13:06:36 +0200, wrote:
> In case of CONFIG_BALLOON the page allocator's bitmap needs some space
> to be able to grow. Remap it to kernel virtual area if the preallocated
> area isn't large enough.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Samuel Thibault 

> ---
> V4: - mm_bitmap* -> mm_alloc_bitmap* as requested by Samuel Thibault
> 
> V3: - add assertion as requested by Samuel Thibault
> - rename functions to have mm_ prefix as requested by Samuel Thibault
> ---
>  balloon.c | 18 ++
>  include/balloon.h |  2 ++
>  include/mm.h  |  6 ++
>  mm.c  | 44 +++-
>  4 files changed, 49 insertions(+), 21 deletions(-)
> 
> diff --git a/balloon.c b/balloon.c
> index 1ec113d..0a3342c 100644
> --- a/balloon.c
> +++ b/balloon.c
> @@ -44,3 +44,21 @@ void get_max_pages(void)
>  nr_max_pages = ret;
>  printk("Maximum memory size: %ld pages\n", nr_max_pages);
>  }
> +
> +void mm_alloc_bitmap_remap(void)
> +{
> +unsigned long i;
> +
> +if ( mm_alloc_bitmap_size >= ((nr_max_pages + 1) >> (PAGE_SHIFT + 3)) )
> +return;
> +
> +for ( i = 0; i < mm_alloc_bitmap_size; i += PAGE_SIZE )
> +{
> +map_frame_rw(virt_kernel_area_end + i,
> + virt_to_mfn((unsigned long)(mm_alloc_bitmap) + i));
> +}
> +
> +mm_alloc_bitmap = (unsigned long *)virt_kernel_area_end;
> +virt_kernel_area_end += round_pgup((nr_max_pages + 1) >> (PAGE_SHIFT + 3));
> +ASSERT(virt_kernel_area_end <= VIRT_DEMAND_AREA);
> +}
> diff --git a/include/balloon.h b/include/balloon.h
> index b0d0ebf..9154f44 100644
> --- a/include/balloon.h
> +++ b/include/balloon.h
> @@ -31,11 +31,13 @@ extern unsigned long virt_kernel_area_end;
>  
>  void get_max_pages(void);
>  void arch_remap_p2m(unsigned long max_pfn);
> +void mm_alloc_bitmap_remap(void);
>  
>  #else /* CONFIG_BALLOON */
>  
>  static inline void get_max_pages(void) { }
>  static inline void arch_remap_p2m(unsigned long max_pfn) { }
> +static inline void mm_alloc_bitmap_remap(void) { }
>  
>  #endif /* CONFIG_BALLOON */
>  #endif /* _BALLOON_H_ */
> diff --git a/include/mm.h b/include/mm.h
> index 6add683..fc3128b 100644
> --- a/include/mm.h
> +++ b/include/mm.h
> @@ -42,8 +42,14 @@
>  #define STACK_SIZE_PAGE_ORDER __STACK_SIZE_PAGE_ORDER
>  #define STACK_SIZE __STACK_SIZE
>  
> +#define round_pgdown(_p)  ((_p) & PAGE_MASK)
> +#define round_pgup(_p)(((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)
> +
>  extern unsigned long nr_free_pages;
>  
> +extern unsigned long *mm_alloc_bitmap;
> +extern unsigned long mm_alloc_bitmap_size;
> +
>  void init_mm(void);
>  unsigned long alloc_pages(int order);
>  #define alloc_page()alloc_pages(0)
> diff --git a/mm.c b/mm.c
> index 707a3e0..9e3a479 100644
> --- a/mm.c
> +++ b/mm.c
> @@ -48,11 +48,14 @@
>   *  One bit per page of memory. Bit set => page is allocated.
>   */
>  
> -static unsigned long *alloc_bitmap;
> +unsigned long *mm_alloc_bitmap;
> +unsigned long mm_alloc_bitmap_size;
> +
>  #define PAGES_PER_MAPWORD (sizeof(unsigned long) * 8)
>  
>  #define allocated_in_map(_pn) \
> -(alloc_bitmap[(_pn)/PAGES_PER_MAPWORD] & (1UL<<((_pn)&(PAGES_PER_MAPWORD-1))))
> +(mm_alloc_bitmap[(_pn) / PAGES_PER_MAPWORD] & \
> + (1UL << ((_pn) & (PAGES_PER_MAPWORD - 1))))
>  
>  unsigned long nr_free_pages;
>  
> @@ -61,8 +64,8 @@ unsigned long nr_free_pages;
>   *  -(1<<n)  sets all bits >= n.
>   *  (1<<n)-1 sets all bits <  n.
>   * Variable names in map_{alloc,free}:
> - *  *_idx == Index into `alloc_bitmap' array.
> - *  *_off == Bit offset within an element of the `alloc_bitmap' array.
> + *  *_idx == Index into `mm_alloc_bitmap' array.
> + *  *_off == Bit offset within an element of the `mm_alloc_bitmap' array.
>   */
>  
>  static void map_alloc(unsigned long first_page, unsigned long nr_pages)
> @@ -76,13 +79,13 @@ static void map_alloc(unsigned long first_page, unsigned long nr_pages)
>  
>  if ( curr_idx == end_idx )
>  {
> -alloc_bitmap[curr_idx] |= ((1UL<<end_off)-1) & -(1UL<<start_off);
> +mm_alloc_bitmap[curr_idx] |= ((1UL<<end_off)-1) & -(1UL<<start_off);
>  }
>  else 
>  {
> -alloc_bitmap[curr_idx] |= -(1UL<<start_off);
> -while ( ++curr_idx < end_idx ) alloc_bitmap[curr_idx] = ~0UL;
> -alloc_bitmap[curr_idx] |= (1UL<<end_off)-1;
> +mm_alloc_bitmap[curr_idx] |= -(1UL<<start_off);
> +while ( ++curr_idx < end_idx ) mm_alloc_bitmap[curr_idx] = ~0UL;
> +mm_alloc_bitmap[curr_idx] |= (1UL<<end_off)-1;
>  }
>  
>  nr_free_pages -= nr_pages;
> @@ -102,13 +105,13 @@ static void map_free(unsigned long first_page, unsigned long nr_pages)
>  
>  if ( curr_idx == end_idx )
>  {
> -alloc_bitmap[curr_idx] &= -(1UL<<end_off) | ((1UL<<start_off)-1);
> +mm_alloc_bitmap[curr_idx] &= -(1UL<<end_off) | ((1UL<<start_off)-1);
>  }
>  else 
>  {
> -alloc_bitmap[curr_idx] &= (1UL<<start_off)-1;
> -while ( ++curr_idx != end_idx ) alloc_bitmap[curr_idx] = 0;
> -alloc_bitmap[curr_idx] &= -(1UL<<end_off);
> +mm_alloc_bitmap[curr_idx] &= (1UL<<start_off)-1;
> +while ( ++curr_idx != end_idx ) mm_alloc
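
The mask arithmetic that the quoted comment hints at (-(1<<n) for all bits >= n, (1<<n)-1 for all bits < n) can be tried out in isolation. The following is a minimal standalone sketch, not the Mini-OS code itself: bitmap_set_range and BITS_PER_WORD are hypothetical names, and the check on end_off is an addition so the sketch never touches a word past the end of the range.

```c
#include <limits.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Set bits [first, first + nr) in map; nr must be non-zero. */
static void bitmap_set_range(unsigned long *map, unsigned long first,
                             unsigned long nr)
{
    unsigned long curr_idx  = first / BITS_PER_WORD;
    unsigned long start_off = first & (BITS_PER_WORD - 1);
    unsigned long end_idx   = (first + nr) / BITS_PER_WORD;
    unsigned long end_off   = (first + nr) & (BITS_PER_WORD - 1);

    if ( curr_idx == end_idx )
    {
        /* Single word: bits >= start_off and < end_off. */
        map[curr_idx] |= ((1UL << end_off) - 1) & -(1UL << start_off);
    }
    else
    {
        map[curr_idx] |= -(1UL << start_off);      /* bits >= start_off */
        while ( ++curr_idx < end_idx )
            map[curr_idx] = ~0UL;                  /* whole words */
        if ( end_off )
            map[curr_idx] |= (1UL << end_off) - 1; /* bits < end_off */
    }
}
```

Negating an unsigned value is well defined in C, which is why -(1UL << n) reliably yields a mask with every bit from n upwards set.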

Re: [Xen-devel] [PATCH v2 1/2] x86/altp2m: use __get_gfn_type_access to avoid lock conflicts

2016-08-11 Thread Jan Beulich
>>> On 10.08.16 at 17:00,  wrote:
> From: Tamas K Lengyel 
> 
> Use __get_gfn_type_access instead of get_gfn_type_access when checking
> the hostp2m entries during altp2m mem_access setting and gfn remapping
> to avoid a lock conflict which can make dom0 freeze. During mem_access
> setting the hp2m is already locked. For gfn remapping we change the flow
> to lock the hp2m before locking the ap2m.
> 
> Signed-off-by: Tamas K Lengyel 
> Reviewed-by: Razvan Cojocaru 

Reviewed-by: Jan Beulich 




Re: [Xen-devel] [PATCH 4/9] x86/pv: Implement pv_hypercall() in C

2016-08-11 Thread Andrew Cooper
On 02/08/16 14:12, Jan Beulich wrote:
 On 18.07.16 at 11:51,  wrote:
>> +long pv_hypercall(struct cpu_user_regs *regs)
>> +{
>> +struct vcpu *curr = current;
>> +#ifndef NDEBUG
>> +unsigned long old_rip = regs->rip;
>> +#endif
>> +long ret;
>> +uint32_t eax = regs->eax;
>> +
>> +ASSERT(curr->arch.flags & TF_kernel_mode);
> I'm afraid TF_kernel_mode can't be relied on for 32-bit guests, so
> this needs to move into the if() below.
>
>> +if ( (eax >= NR_hypercalls) || !hypercall_table[eax] )
>> + return -ENOSYS;
>> +
>> +if ( !is_pv_32bit_vcpu(curr) )
>> +{
>> +unsigned long rdi = regs->rdi;
>> +unsigned long rsi = regs->rsi;
>> +unsigned long rdx = regs->rdx;
>> +unsigned long r10 = regs->r10;
>> +unsigned long r8 = regs->r8;
>> +unsigned long r9 = regs->r9;
>> +
>> +#ifndef NDEBUG
>> +/* Deliberately corrupt parameter regs not used by this hypercall. */
>> +switch ( hypercall_args_table[eax] )
>> +{
>> +case 0: rdi = 0xdeadbeefdeadf00dUL;
>> +case 1: rsi = 0xdeadbeefdeadf00dUL;
>> +case 2: rdx = 0xdeadbeefdeadf00dUL;
>> +case 3: r10 = 0xdeadbeefdeadf00dUL;
>> +case 4: r8 = 0xdeadbeefdeadf00dUL;
>> +case 5: r9 = 0xdeadbeefdeadf00dUL;
> Without comments, aren't these going to become 5 new Coverity
> issues?
>
>> +}
>> +#endif
>> +if ( unlikely(tb_init_done) )
>> +{
>> +unsigned long args[6] = { rdi, rsi, rdx, r10, r8, r9 };
>> +
>> +__trace_hypercall(TRC_PV_HYPERCALL_V2, eax, args);
>> +}
>> +
>> +ret = hypercall_table[eax](rdi, rsi, rdx, r10, r8, r9);
>> +
>> +#ifndef NDEBUG
>> +if ( regs->rip == old_rip )
>> +{
>> +/* Deliberately corrupt parameter regs used by this hypercall. */
>> +switch ( hypercall_args_table[eax] )
>> +{
>> +case 6: regs->r9  = 0xdeadbeefdeadf00dUL;
>> +case 5: regs->r8  = 0xdeadbeefdeadf00dUL;
>> +case 4: regs->r10 = 0xdeadbeefdeadf00dUL;
>> +case 3: regs->edx = 0xdeadbeefdeadf00dUL;
>> +case 2: regs->esi = 0xdeadbeefdeadf00dUL;
>> +case 1: regs->edi = 0xdeadbeefdeadf00dUL;
> For consistency with earlier code, please use rdx, rsi, and rdi here.
>
>> +#ifndef NDEBUG
>> +if ( regs->rip == old_rip )
>> +{
>> +/* Deliberately corrupt parameter regs used by this hypercall. */
>> +switch ( compat_hypercall_args_table[eax] )
>> +{
>> +case 6: regs->ebp = 0xdeadf00d;
>> +case 5: regs->edi = 0xdeadf00d;
>> +case 4: regs->esi = 0xdeadf00d;
>> +case 3: regs->edx = 0xdeadf00d;
>> +case 2: regs->ecx = 0xdeadf00d;
>> +case 1: regs->ebx = 0xdeadf00d;
> Please use 32-bit stores here.
>
>> --- a/xen/arch/x86/x86_64/compat/entry.S
>> +++ b/xen/arch/x86/x86_64/compat/entry.S
>> @@ -25,70 +25,10 @@ UNLIKELY_START(ne, msi_check)
>>  LOAD_C_CLOBBERED compat=1 ax=0
>>  UNLIKELY_END(msi_check)
>>  
>> -movl  UREGS_rax(%rsp),%eax
>>  GET_CURRENT(bx)
>>  
>> -cmpl  $NR_hypercalls,%eax
>> -jae   compat_bad_hypercall
>> -#ifndef NDEBUG
>> -/* Deliberately corrupt parameter regs not used by this hypercall. */
>> -pushq UREGS_rbx(%rsp); pushq %rcx; pushq %rdx; pushq %rsi; pushq %rdi
>> -pushq UREGS_rbp+5*8(%rsp)
>> -leaq  compat_hypercall_args_table(%rip),%r10
>> -movl  $6,%ecx
>> -subb  (%r10,%rax,1),%cl
>> -movq  %rsp,%rdi
>> -movl  $0xDEADBEEF,%eax
>> -rep   stosq
>> -popq  %r8 ; popq  %r9 ; xchgl %r8d,%r9d /* Args 5&6: zero extend */
>> -popq  %rdx; popq  %rcx; xchgl %edx,%ecx /* Args 3&4: zero extend */
>> -popq  %rdi; popq  %rsi; xchgl %edi,%esi /* Args 1&2: zero extend */
>> -movl  UREGS_rax(%rsp),%eax
>> -pushq %rax
>> -pushq UREGS_rip+8(%rsp)
>> -#define SHADOW_BYTES 16 /* Shadow EIP + shadow hypercall # */
>> -#else
>> -/* Relocate argument registers and zero-extend to 64 bits. */
>> -xchgl %ecx,%esi  /* Arg 2, Arg 4 */
>> -movl  %edx,%edx  /* Arg 3*/
>> -movl  %edi,%r8d  /* Arg 5*/
>> -movl  %ebp,%r9d  /* Arg 6*/
>> -movl  UREGS_rbx(%rsp),%edi   /* Arg 1*/
>> -#define SHADOW_BYTES 0  /* No on-stack shadow state */
>> -#endif
>> -cmpb  $0,tb_init_done(%rip)
>> -UNLIKELY_START(ne, compat_trace)
>> -call  __trace_hypercall_entry
>> -/* Restore the registers that __trace_hypercall_entry clobbered. */
>> -movl  UREGS_rax+SHADOW_BYTES(%rsp),%eax   /* Hypercall #  */
>> -movl  UREGS_rbx+SHADOW_BYTES(%rsp),%edi   /* Arg 1*/
>> -movl  UREGS_rcx+SHADOW_BYTES(%rsp),%esi   /* Arg 2*/

Re: [Xen-devel] [PATCH 5/9] x86/hypercall: Move the hypercall tables into C

2016-08-11 Thread Andrew Cooper
On 02/08/16 14:40, Jan Beulich wrote:
 On 02.08.16 at 15:30,  wrote:
>> On 02/08/16 14:23, Jan Beulich wrote:
>> On 18.07.16 at 11:51,  wrote:
 +hypercall_fn_t *const hypercall_table[NR_hypercalls] = {
 +HYPERCALL(set_trap_table),
 +HYPERCALL(mmu_update),
 +HYPERCALL(set_gdt),
 +HYPERCALL(stack_switch),
 +HYPERCALL(set_callbacks),
 +HYPERCALL(fpu_taskswitch),
 +HYPERCALL(sched_op_compat),
 +HYPERCALL(platform_op),
 +HYPERCALL(set_debugreg),
 +HYPERCALL(get_debugreg),
 +HYPERCALL(update_descriptor),
 +HYPERCALL(memory_op),
 +HYPERCALL(multicall),
 +HYPERCALL(update_va_mapping),
 +HYPERCALL(set_timer_op),
 +HYPERCALL(event_channel_op_compat),
 +HYPERCALL(xen_version),
 +HYPERCALL(console_io),
 +HYPERCALL(physdev_op_compat),
 +HYPERCALL(grant_table_op),
 +HYPERCALL(vm_assist),
 +HYPERCALL(update_va_mapping_otherdomain),
 +HYPERCALL(iret),
 +HYPERCALL(vcpu_op),
 +HYPERCALL(set_segment_base),
 +HYPERCALL(mmuext_op),
 +HYPERCALL(xsm_op),
 +HYPERCALL(nmi_op),
 +HYPERCALL(sched_op),
 +HYPERCALL(callback_op),
 +#ifdef CONFIG_XENOPROF
 +HYPERCALL(xenoprof_op),
 +#endif
 +HYPERCALL(event_channel_op),
 +HYPERCALL(physdev_op),
 +HYPERCALL(hvm_op),
 +HYPERCALL(sysctl),
 +HYPERCALL(domctl),
 +#ifdef CONFIG_KEXEC
 +HYPERCALL(kexec_op),
 +#endif
 +#ifdef CONFIG_TMEM
 +HYPERCALL(tmem_op),
 +#endif
>>> To be honest I'd prefer the necessary #ifdef-ery to live in hypercall.h,
>>> all the more so as ARM could then (if they want) benefit from that too.
>> Which #ifdefary?
>>
>> HYPERCALL() can't be used by ARM.
> I mean just the #ifdef-s above, not the HYPERCALL() lines. Clearly
> you can do
>
> #ifndef CONFIG_TMEM
> # define do_tmem_op NULL
> #endif
>
> and the like in the header?

This isn't any neater IMO, and still risks getting a NULL in one half of
a native/compat pair.

~Andrew
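
For readers following the thread, here is a toy version of a C dispatch table with an optional entry, together with the bounds-and-NULL check that makes a missing entry harmless. It is a sketch under assumptions: the op_* names, CONFIG_OPTIONAL_OP and the single-argument handler type are invented for illustration, and the table uses designated initializers, which may or may not match what the HYPERCALL() macro in the patch expands to.

```c
#include <errno.h>

typedef long hypercall_fn_t(unsigned long arg);

enum { OP_VERSION, OP_OPTIONAL, OP_CONSOLE, NR_OPS };

static long do_version(unsigned long arg) { (void)arg; return 42; }
static long do_console(unsigned long arg) { return (long)arg; }
#ifdef CONFIG_OPTIONAL_OP
static long do_optional(unsigned long arg) { (void)arg; return 0; }
#endif

/* Slots without an initializer are NULL. */
static hypercall_fn_t *const op_table[NR_OPS] = {
    [OP_VERSION]  = do_version,
#ifdef CONFIG_OPTIONAL_OP
    [OP_OPTIONAL] = do_optional,
#endif
    [OP_CONSOLE]  = do_console,
};

/* Reject out-of-range numbers and compiled-out entries alike. */
static long dispatch(unsigned int op, unsigned long arg)
{
    if ( op >= NR_OPS || !op_table[op] )
        return -ENOSYS;
    return op_table[op](arg);
}
```

With two parallel tables (native and compat), the risk raised in the reply is that only one of them ends up with a NULL for a given index. The check in dispatch() guards a single table only.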



Re: [Xen-devel] [PATCH v2 2/2] x86/altp2m: allow specifying external-only use-case

2016-08-11 Thread Jan Beulich
>>> On 10.08.16 at 17:00,  wrote:
> @@ -5238,18 +5238,19 @@ static int do_altp2m_op(
>  goto out;
>  }
>  
> -if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
> +if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] )
> +{
> +rc = -EINVAL;
> +goto out;
> +}
> +
> +if ( (rc = xsm_hvm_altp2mhvm_op(XSM_OTHER, d,
> +d->arch.hvm_domain.params[HVM_PARAM_ALTP2M])) )

I'm sorry that this didn't occur to me on v1 already, but is there
really a need for passing this extra argument, when the callee
could - if it cared in the first place - read the value itself?

Jan




[Xen-devel] [PATCH 1/7] x86emul: don't special case fetching the immediate of PUSH

2016-08-11 Thread Jan Beulich
These immediates follow the standard patterns in all modes, so they're
better fetched by the generic source operand handling code.

To facilitate testing, instead of adding yet another of these pretty
convoluted individual test cases, simply introduce another blowfish run
with -mno-accumulate-outgoing-args (the additional -Dstatic is to
keep the compiler from converting the calling convention to
"regparm(3)", which I did observe it does).

To make this introduction of a new blowfish pass (and potential further
ones later on) have less impact on the readability of the final code,
abstract all such "binary blob" executions via a table to iterate
through.

The resulting native code execution adjustment also uncovered a lack of
clobbers on the asm() in the 64-bit case, which is being fixed at once.

Signed-off-by: Jan Beulich 

--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -11,21 +11,21 @@ all: $(TARGET)
 run: $(TARGET)
./$(TARGET)
 
-.PHONY: blowfish.h
-blowfish.h:
-   rm -f blowfish.bin
-   XEN_TARGET_ARCH=x86_32 make -f blowfish.mk all
-   (echo "static unsigned int blowfish32_code[] = {"; \
-   od -v -t x blowfish.bin | sed 's/^[0-9]* /0x/' | sed 's/ /, 0x/g' | sed 
's/$$/,/';\
-   echo "};") >$@
-   rm -f blowfish.bin
-ifeq ($(XEN_COMPILE_ARCH),x86_64)
-   XEN_TARGET_ARCH=x86_64 make -f blowfish.mk all
-   (echo "static unsigned int blowfish64_code[] = {"; \
-   od -v -t x blowfish.bin | sed 's/^[0-9]* /0x/' | sed 's/ /, 0x/g' | sed 
's/$$/,/';\
-   echo "};") >>$@
-   rm -f blowfish.bin
-endif
+cflags-x86_32 := "-mno-accumulate-outgoing-args -Dstatic="
+
+blowfish.h: blowfish.c blowfish.mk Makefile
+   rm -f $@.new blowfish.bin
+   $(foreach arch,$(filter-out $(XEN_COMPILE_ARCH),x86_32) 
$(XEN_COMPILE_ARCH), \
+   for cflags in "" $(cflags-$(arch)); do \
+   $(MAKE) -f blowfish.mk XEN_TARGET_ARCH=$(arch) 
BLOWFISH_CFLAGS="$$cflags" all; \
+   flavor=$$(echo $${cflags} | sed -e 's, .*,,' -e 'y,-=,__,') ; \
+   (echo "static unsigned int blowfish_$(arch)$${flavor}[] = {"; \
+od -v -t x blowfish.bin | sed -e 's/^[0-9]* /0x/' -e 's/ /, 
0x/g' -e 's/$$/,/'; \
+echo "};") >>$@.new; \
+   rm -f blowfish.bin; \
+   done; \
+   )
+   mv $@.new $@
 
 $(TARGET): x86_emulate.o test_x86_emulator.o
$(HOSTCC) -o $@ $^
--- a/tools/tests/x86_emulator/blowfish.mk
+++ b/tools/tests/x86_emulator/blowfish.mk
@@ -5,7 +5,7 @@ include $(XEN_ROOT)/tools/Rules.mk
 
 $(call cc-options-add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS))
 
-CFLAGS += -fno-builtin -msoft-float
+CFLAGS += -fno-builtin -msoft-float $(BLOWFISH_CFLAGS)
 
 .PHONY: all
 all: blowfish.bin
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -1,4 +1,5 @@
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -12,6 +13,21 @@
 #include "x86_emulate/x86_emulate.h"
 #include "blowfish.h"
 
+static const struct {
+const void *code;
+size_t size;
+unsigned int bitness;
+const char*name;
+} blobs[] = {
+{ blowfish_x86_32, sizeof(blowfish_x86_32), 32, "blowfish" },
+{ blowfish_x86_32_mno_accumulate_outgoing_args,
+  sizeof(blowfish_x86_32_mno_accumulate_outgoing_args),
+  32, "blowfish (push)" },
+#ifdef __x86_64__
+{ blowfish_x86_64, sizeof(blowfish_x86_64), 64, "blowfish" },
+#endif
+};
+
 #define MMAP_SZ 16384
 
 /* EFLAGS bit definitions. */
@@ -943,25 +959,19 @@ int main(int argc, char **argv)
 else
 printf("skipped\n");
 
-for ( j = 1; j <= 2; j++ )
+for ( j = 0; j < sizeof(blobs) / sizeof(*blobs); j++ )
 {
-#if defined(__i386__)
-if ( j == 2 ) break;
-memcpy(res, blowfish32_code, sizeof(blowfish32_code));
-#else
-ctxt.addr_size = 16 << j;
-ctxt.sp_size   = 16 << j;
-memcpy(res, (j == 1) ? blowfish32_code : blowfish64_code,
-   (j == 1) ? sizeof(blowfish32_code) : sizeof(blowfish64_code));
-#endif
-printf("Testing blowfish %u-bit code sequence", j*32);
+memcpy(res, blobs[j].code, blobs[j].size);
+ctxt.addr_size = ctxt.sp_size = blobs[j].bitness;
+
+printf("Testing %s %u-bit code sequence",
+   blobs[j].name, ctxt.addr_size);
 regs.eax = 2;
 regs.edx = 1;
 regs.eip = (unsigned long)res;
 regs.esp = (unsigned long)res + MMAP_SZ - 4;
-if ( j == 2 )
+if ( ctxt.addr_size == 64 )
 {
-ctxt.addr_size = ctxt.sp_size = 64;
 *(uint32_t *)(unsigned long)regs.esp = 0;
 regs.esp -= 4;
 }
@@ -983,20 +993,27 @@ int main(int argc, char **argv)
  (regs.eax != 2) || (regs.edx != 1) )
 goto fail;
 printf("okay\n");
-}
 
-printf("%-40s", "Testing blowfish native execution...");
-asm volatile (
+if ( ctxt.addr_size != sizeof(vo

[Xen-devel] [PATCH 2/7] x86emul: don't special case fetching immediates of near and short branches

2016-08-11 Thread Jan Beulich
These immediates follow the standard patterns in all modes, so they're
better fetched by the generic source operand handling code.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -98,11 +98,15 @@ static uint8_t opcode_table[256] = {
 DstImplicit|SrcImmByte|Mov, DstReg|SrcImmByte|ModRM|Mov,
 ImplicitOps|Mov, ImplicitOps|Mov, ImplicitOps|Mov, ImplicitOps|Mov,
 /* 0x70 - 0x77 */
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
 /* 0x78 - 0x7F */
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
 /* 0x80 - 0x87 */
 ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImm|ModRM,
 ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImmByte|ModRM,
@@ -155,10 +159,12 @@ static uint8_t opcode_table[256] = {
 ImplicitOps|ModRM|Mov, ImplicitOps|ModRM|Mov,
 ImplicitOps|ModRM|Mov, ImplicitOps|ModRM|Mov,
 /* 0xE0 - 0xE7 */
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
 ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
 /* 0xE8 - 0xEF */
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImm|Mov, DstImplicit|SrcImm,
+ImplicitOps, DstImplicit|SrcImmByte,
 ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
 /* 0xF0 - 0xF7 */
 0, ImplicitOps, 0, 0,
@@ -206,11 +212,15 @@ static uint8_t twobyte_table[256] = {
 /* 0x70 - 0x7F */
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM,
 /* 0x80 - 0x87 */
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
 /* 0x88 - 0x8F */
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
+DstImplicit|SrcImm, DstImplicit|SrcImm,
 /* 0x90 - 0x97 */
 ByteOp|DstMem|SrcNone|ModRM|Mov, ByteOp|DstMem|SrcNone|ModRM|Mov,
 ByteOp|DstMem|SrcNone|ModRM|Mov, ByteOp|DstMem|SrcNone|ModRM|Mov,
@@ -2415,12 +2425,10 @@ x86_emulate(
 break;
 }
 
-case 0x70 ... 0x7f: /* jcc (short) */ {
-int rel = insn_fetch_type(int8_t);
+case 0x70 ... 0x7f: /* jcc (short) */
 if ( test_cc(b, _regs.eflags) )
-jmp_rel(rel);
+jmp_rel((int32_t)src.val);
 break;
-}
 
 case 0x82: /* Grp1 (x86/32 only) */
 generate_exception_if(mode_64bit(), EXC_UD, -1);
@@ -3461,8 +3469,8 @@ x86_emulate(
 break;
 
 case 0xe0 ... 0xe2: /* loop{,z,nz} */ {
-int rel = insn_fetch_type(int8_t);
 int do_jmp = !(_regs.eflags & EFLG_ZF); /* loopnz */
+
 if ( b == 0xe1 )
 do_jmp = !do_jmp; /* loopz */
 else if ( b == 0xe2 )
@@ -3481,17 +3489,15 @@ x86_emulate(
 break;
 }
 if ( do_jmp )
-jmp_rel(rel);
+jmp_rel((int32_t)src.val);
 break;
 }
 
-case 0xe3: /* jcxz/jecxz (short) */ {
-int rel = insn_fetch_type(int8_t);
+case 0xe3: /* jcxz/jecxz (short) */
 if ( (ad_bytes == 2) ? !(uint16_t)_regs.ecx :
  (ad_bytes == 4) ? !(uint32_t)_regs.ecx : !_regs.ecx )
-jmp_rel(rel);
+jmp_rel((int32_t)src.val);
 break;
-}
 
 case 0xe4: /* in imm8,%al */
 case 0xe5: /* in imm8,%eax */
@@ -3528,22 +3534,18 @@ x86_emulate(
 }
 
 case 0xe8: /* call (near) */ {
-int rel = ((op_bytes == 2)
-   ? (int32_t)insn_fetch_type(int16_t)
-   : insn_fetch_type(int32_t));
+int32_t rel = src.val;
+
 op_bytes = ((op_bytes == 4) && mode_64bit()) ? 8 : op_bytes;
 src.val = _regs.eip;
 jmp_rel(rel);
 goto push;
 }
 
-case 0xe9: /* jmp (near) */ {
-int rel = ((op_bytes == 2)
-   ? (int32_t)insn_fetch_type(int16_t)
-   : insn_fetch_type(int32_t));
-jmp_rel(rel);
+case 0xe9: /* jmp (near) */
+case 0xeb: /* jmp (short) */
+jmp_rel((int32_t)src.val);
 break;
-}
 
 case 0xea: /* jmp (far
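
Branch displacements are ordinary sign-extended immediates, which is why the generic source operand code can fetch them. A standalone sketch of that idea follows. It is not the emulator's code: fetch_imm and branch_target are hypothetical helpers, and the narrowing casts assume the usual two's complement behaviour of mainstream compilers.

```c
#include <stdint.h>

/*
 * Assemble a 1-, 2- or 4-byte little-endian immediate from the
 * instruction bytes and sign-extend it.
 */
static int64_t fetch_imm(const uint8_t *insn, unsigned int bytes)
{
    uint32_t raw = 0;
    unsigned int i;

    for ( i = 0; i < bytes && i < 4; i++ )
        raw |= (uint32_t)insn[i] << (8 * i);

    switch ( bytes )
    {
    case 1: return (int8_t)raw;
    case 2: return (int16_t)raw;
    case 4: return (int32_t)raw;
    }
    return 0; /* unsupported width */
}

/* Relative branch: displacement is added to the next instruction's IP. */
static uint64_t branch_target(uint64_t next_ip, int32_t rel)
{
    return next_ip + (uint64_t)(int64_t)rel;
}
```

Because the fetched value is already sign-extended, a short jump (rel8) and a near jump (rel16/rel32) can share one code path once the value is narrowed to int32_t.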

[Xen-devel] [PATCH 3/7] x86emul: all push flavors are data moves

2016-08-11 Thread Jan Beulich
Make all paths leading to the "push" label have the Mov flag set, and
ASSERT() that to be the case. For the opcode FF group the adjustment is
benign for the paths not leading to "push", as they all set dst.type to
OP_NONE.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -122,7 +122,7 @@ static uint8_t opcode_table[256] = {
 ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
 /* 0x98 - 0x9F */
 ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+ImplicitOps|Mov, ImplicitOps|Mov, ImplicitOps, ImplicitOps,
 /* 0xA0 - 0xA7 */
 ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
 ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
@@ -1903,7 +1903,7 @@ x86_emulate(
 /* fall through */
 case 3: /* call (far, absolute indirect) */
 case 5: /* jmp (far, absolute indirect) */
-d = DstNone|SrcMem|ModRM;
+d = DstNone | SrcMem | ModRM | Mov;
 break;
 }
 break;
@@ -2347,7 +2347,7 @@ x86_emulate(
 case 0x68: /* push imm{16,32,64} */
 case 0x6a: /* push imm8 */
 push:
-d |= Mov; /* force writeback */
+ASSERT(d & Mov); /* writeback needed */
 dst.type  = OP_MEM;
 dst.bytes = mode_64bit() && (op_bytes == 4) ? 8 : op_bytes;
 dst.val = src.val;





[Xen-devel] [PATCH 4/7] x86emul: fold SrcImmByte fetching

2016-08-11 Thread Jan Beulich
There's no need for having identical code spelled out twice.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1979,9 +1979,12 @@ x86_emulate(
 goto done;
 break;
 case SrcImm:
+if ( !(d & ByteOp) )
+src.bytes = op_bytes != 8 ? op_bytes : 4;
+else
+case SrcImmByte:
+src.bytes = 1;
 src.type  = OP_IMM;
-src.bytes = (d & ByteOp) ? 1 : op_bytes;
-if ( src.bytes == 8 ) src.bytes = 4;
 /* NB. Immediates are sign-extended as necessary. */
 switch ( src.bytes )
 {
@@ -1990,11 +1993,6 @@ x86_emulate(
 case 4: src.val = insn_fetch_type(int32_t); break;
 }
 break;
-case SrcImmByte:
-src.type  = OP_IMM;
-src.bytes = 1;
-src.val   = insn_fetch_type(int8_t);
-break;
 }
 
 /* Decode and fetch the destination operand: register or memory. */
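
The fold above relies on a C property that is easy to overlook: a case label may sit inside the else branch of an if, so both SrcImm with ByteOp and SrcImmByte reach the same "one byte" assignment. A minimal standalone sketch of that control flow follows; the names src_kind, BYTE_OP and imm_bytes() are hypothetical stand-ins, not taken from the emulator.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the emulator's source-operand kinds and flag. */
enum src_kind { SRC_IMM, SRC_IMM_BYTE };
#define BYTE_OP 1u

/* Return how many immediate bytes would be fetched for this operand. */
unsigned int imm_bytes(enum src_kind kind, unsigned int d,
                       unsigned int op_bytes)
{
    unsigned int bytes = 0;

    switch ( kind )
    {
    case SRC_IMM:
        if ( !(d & BYTE_OP) )
            /* A 64-bit operand size still takes a 32-bit immediate. */
            bytes = op_bytes != 8 ? op_bytes : 4;
        else
    case SRC_IMM_BYTE:
            /* Reached via the else branch or by jumping to the label. */
            bytes = 1;
        break;
    }

    return bytes;
}
```

With this shape, imm_bytes(SRC_IMM, 0, 8) yields 4, while both imm_bytes(SRC_IMM, BYTE_OP, 4) and imm_bytes(SRC_IMM_BYTE, 0, 4) yield 1, which is the sharing the patch is after.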





[Xen-devel] [PATCH 5/7] x86emul: don't special case fetching unsigned 8-bit immediates

2016-08-11 Thread Jan Beulich
These can be made to work using SrcImmByte, making sure the low 8 bits of
src.val get suitably zero extended upon consumption. SHLD and SHRD
require a little more adjustment: Their source operands get changed
away from SrcReg, handling the register access "manually" instead of
the insn byte fetching.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -148,11 +148,11 @@ static uint8_t opcode_table[256] = {
 ByteOp|DstMem|SrcImm|ModRM|Mov, DstMem|SrcImm|ModRM|Mov,
 /* 0xC8 - 0xCF */
 ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+ImplicitOps, DstImplicit|SrcImmByte, ImplicitOps, ImplicitOps,
 /* 0xD0 - 0xD7 */
 ByteOp|DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM,
 ByteOp|DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte, ImplicitOps, ImplicitOps,
 /* 0xD8 - 0xDF */
 ImplicitOps|ModRM|Mov, ImplicitOps|ModRM|Mov,
 ImplicitOps|ModRM|Mov, ImplicitOps|ModRM|Mov,
@@ -161,7 +161,8 @@ static uint8_t opcode_table[256] = {
 /* 0xE0 - 0xE7 */
 DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
 DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
 /* 0xE8 - 0xEF */
 DstImplicit|SrcImm|Mov, DstImplicit|SrcImm,
 ImplicitOps, DstImplicit|SrcImmByte,
@@ -233,10 +234,10 @@ static uint8_t twobyte_table[256] = {
 ByteOp|DstMem|SrcNone|ModRM|Mov, ByteOp|DstMem|SrcNone|ModRM|Mov,
 /* 0xA0 - 0xA7 */
 ImplicitOps, ImplicitOps, ImplicitOps, DstBitBase|SrcReg|ModRM,
-DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM, 0, 0,
+DstMem|SrcImmByte|ModRM, DstMem|SrcReg|ModRM, 0, 0,
 /* 0xA8 - 0xAF */
 ImplicitOps, ImplicitOps, 0, DstBitBase|SrcReg|ModRM,
-DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
+DstMem|SrcImmByte|ModRM, DstMem|SrcReg|ModRM,
 ImplicitOps|ModRM, DstReg|SrcMem|ModRM,
 /* 0xB0 - 0xB7 */
 ByteOp|DstMem|SrcReg|ModRM, DstMem|SrcReg|ModRM,
@@ -2893,7 +2894,6 @@ x86_emulate(
 goto swint;
 
 case 0xcd: /* int imm8 */
-src.val = insn_fetch_type(uint8_t);
 swint_type = x86_swint_int;
 swint:
 rc = inject_swint(swint_type, src.val,
@@ -2942,7 +2942,7 @@ x86_emulate(
 
 case 0xd4: /* aam */
 case 0xd5: /* aad */ {
-unsigned int base = insn_fetch_type(uint8_t);
+unsigned int base = (uint8_t)src.val;
 
 generate_exception_if(mode_64bit(), EXC_UD, -1);
 if ( b & 0x01 )
@@ -3505,9 +3505,9 @@ x86_emulate(
 case 0xed: /* in %dx,%eax */
 case 0xee: /* out %al,%dx */
 case 0xef: /* out %eax,%dx */ {
-unsigned int port = ((b < 0xe8)
- ? insn_fetch_type(uint8_t)
- : (uint16_t)_regs.edx);
+unsigned int port = ((b < 0xe8) ? (uint8_t)src.val
+: (uint16_t)_regs.edx);
+
 op_bytes = !(b & 1) ? 1 : (op_bytes == 8) ? 4 : op_bytes;
 if ( (rc = ioport_access_check(port, op_bytes, ctxt, ops)) != 0 )
 goto done;
@@ -4562,7 +4562,15 @@ x86_emulate(
 case 0xac: /* shrd imm8,r,r/m */
 case 0xad: /* shrd %%cl,r,r/m */ {
 uint8_t shift, width = dst.bytes << 3;
-shift = (b & 1) ? (uint8_t)_regs.ecx : insn_fetch_type(uint8_t);
+
+if ( b & 1 )
+shift = _regs.ecx;
+else
+{
+shift = src.val;
+src.reg = decode_register(modrm_reg, &_regs, 0);
+src.val = truncate_word(*src.reg, dst.bytes);
+}
 if ( (shift &= width - 1) == 0 )
 break;
 dst.orig_val = truncate_word(dst.val, dst.bytes);
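
To illustrate why the (uint8_t) casts in this patch are sufficient: a SrcImmByte fetch sign-extends the byte, and a consumer that wants an unsigned imm8 only has to keep the low 8 bits. The sketch below is a standalone model with hypothetical helper names, not emulator code, and it assumes the usual two's complement conversion to int8_t.

```c
#include <stdint.h>

/* Model of a SrcImmByte fetch: the byte is sign-extended to a wide value. */
unsigned long fetch_imm8_sign_extended(uint8_t insn_byte)
{
    return (unsigned long)(long)(int8_t)insn_byte;
}

/* Model of a consumer such as INT, AAM/AAD or IN/OUT imm8:
 * truncating to uint8_t recovers the original unsigned byte. */
unsigned int consume_unsigned_imm8(unsigned long src_val)
{
    return (uint8_t)src_val;
}
```

For an instruction byte of 0x80 the fetched value has all upper bits set, yet consume_unsigned_imm8() still returns 0x80, so no separate unsigned fetch path is needed.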




[Xen-devel] [PATCH 6/7] x86emul: use DstEax where possible

2016-08-11 Thread Jan Beulich
While it avoids just a few instructions, we should nevertheless make
use of generic code as much as possible.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -124,7 +124,7 @@ static uint8_t opcode_table[256] = {
 ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
 ImplicitOps|Mov, ImplicitOps|Mov, ImplicitOps, ImplicitOps,
 /* 0xA0 - 0xA7 */
-ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
+ByteOp|DstEax|SrcImplicit|Mov, DstEax|SrcImplicit|Mov,
 ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
 ByteOp|ImplicitOps|Mov, ImplicitOps|Mov,
 ByteOp|ImplicitOps, ImplicitOps,
@@ -161,12 +161,12 @@ static uint8_t opcode_table[256] = {
 /* 0xE0 - 0xE7 */
 DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
 DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
-DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
+DstEax|SrcImmByte, DstEax|SrcImmByte,
 DstImplicit|SrcImmByte, DstImplicit|SrcImmByte,
 /* 0xE8 - 0xEF */
 DstImplicit|SrcImm|Mov, DstImplicit|SrcImm,
 ImplicitOps, DstImplicit|SrcImmByte,
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstEax|SrcImplicit, DstEax|SrcImplicit, ImplicitOps, ImplicitOps,
 /* 0xF0 - 0xF7 */
 0, ImplicitOps, 0, 0,
 ImplicitOps, ImplicitOps,
@@ -2617,8 +2617,6 @@ x86_emulate(
 
 case 0xa0 ... 0xa1: /* mov mem.offs,{%al,%ax,%eax,%rax} */
 /* Source EA is not encoded via ModRM. */
-dst.type  = OP_REG;
-dst.reg   = (unsigned long *)&_regs.eax;
 dst.bytes = (d & ByteOp) ? 1 : op_bytes;
 if ( (rc = read_ulong(ea.mem.seg, insn_fetch_bytes(ad_bytes),
   &dst.val, dst.bytes, ctxt, ops)) != 0 )
@@ -3520,9 +3518,7 @@ x86_emulate(
 else
 {
 /* in */
-dst.type  = OP_REG;
 dst.bytes = op_bytes;
-dst.reg   = (unsigned long *)&_regs.eax;
 fail_if(ops->read_io == NULL);
 rc = ops->read_io(port, dst.bytes, &dst.val, ctxt);
 }





[Xen-devel] [PATCH 7/7] x86emul: introduce SrcImm16

2016-08-11 Thread Jan Beulich
... and use it for RET, LRET, and ENTER processing to limit the amount
of "manual" insn bytes fetching.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -39,6 +39,7 @@
 #define SrcMem16    (4<<3) /* Memory operand (16-bit). */
 #define SrcImm      (5<<3) /* Immediate operand. */
 #define SrcImmByte  (6<<3) /* 8-bit sign-extended immediate operand. */
+#define SrcImm16    (7<<3) /* 16-bit zero-extended immediate operand. */
 #define SrcMask (7<<3)
 /* Generic ModRM decode. */
 #define ModRM   (1<<6)
@@ -143,11 +144,11 @@ static uint8_t opcode_table[256] = {
 DstReg|SrcImm|Mov, DstReg|SrcImm|Mov, DstReg|SrcImm|Mov, DstReg|SrcImm|Mov,
 /* 0xC0 - 0xC7 */
 ByteOp|DstMem|SrcImm|ModRM, DstMem|SrcImmByte|ModRM,
-ImplicitOps, ImplicitOps,
+DstImplicit|SrcImm16, ImplicitOps,
 DstReg|SrcMem|ModRM|Mov, DstReg|SrcMem|ModRM|Mov,
 ByteOp|DstMem|SrcImm|ModRM|Mov, DstMem|SrcImm|ModRM|Mov,
 /* 0xC8 - 0xCF */
-ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+DstImplicit|SrcImm16, ImplicitOps, DstImplicit|SrcImm16, ImplicitOps,
 ImplicitOps, DstImplicit|SrcImmByte, ImplicitOps, ImplicitOps,
 /* 0xD0 - 0xD7 */
 ByteOp|DstMem|SrcImplicit|ModRM, DstMem|SrcImplicit|ModRM,
@@ -1994,6 +1995,11 @@ x86_emulate(
 case 4: src.val = insn_fetch_type(int32_t); break;
 }
 break;
+case SrcImm16:
+src.type  = OP_IMM;
+src.bytes = 2;
+src.val   = insn_fetch_type(uint16_t);
+break;
 }
 
 /* Decode and fetch the destination operand: register or memory. */
@@ -2786,16 +2792,14 @@ x86_emulate(
 break;
 
 case 0xc2: /* ret imm16 (near) */
-case 0xc3: /* ret (near) */ {
-int offset = (b == 0xc2) ? insn_fetch_type(uint16_t) : 0;
+case 0xc3: /* ret (near) */
 op_bytes = ((op_bytes == 4) && mode_64bit()) ? 8 : op_bytes;
-if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes + offset),
+if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes + src.val),
   &dst.val, op_bytes, ctxt, ops)) != 0 ||
  (rc = ops->insn_fetch(x86_seg_cs, dst.val, NULL, 0, ctxt)) )
 goto done;
 _regs.eip = dst.val;
 break;
-}
 
 case 0xc4: /* les */ {
 unsigned long sel;
@@ -2817,7 +2821,6 @@ x86_emulate(
 goto les;
 
 case 0xc8: /* enter imm16,imm8 */ {
-uint16_t size = insn_fetch_type(uint16_t);
 uint8_t depth = insn_fetch_type(uint8_t) & 31;
 int i;
 
@@ -2846,7 +2849,7 @@ x86_emulate(
 goto done;
 }
 
-sp_pre_dec(size);
+sp_pre_dec(src.val);
 break;
 }
 
@@ -2874,17 +2877,15 @@ x86_emulate(
 break;
 
 case 0xca: /* ret imm16 (far) */
-case 0xcb: /* ret (far) */ {
-int offset = (b == 0xca) ? insn_fetch_type(uint16_t) : 0;
+case 0xcb: /* ret (far) */
 if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes),
   &dst.val, op_bytes, ctxt, ops)) ||
- (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes + offset),
+ (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes + src.val),
   &src.val, op_bytes, ctxt, ops)) ||
  (rc = load_seg(x86_seg_cs, src.val, 1, &cs, ctxt, ops)) ||
  (rc = commit_far_branch(&cs, dst.val)) )
 goto done;
 break;
-}
 
 case 0xcc: /* int3 */
 src.val = EXC_BP;
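
As a rough illustration of the RET handling after this change: the stack pointer advances by the size of the popped return address plus the zero-extended imm16, and the plain RET forms behave as if that immediate were zero. The sketch below is a simplified model under those assumptions. sp_post_inc_model() and near_ret_new_sp() are hypothetical names, and the model ignores the stack address size wrapping that the real sp_post_inc() has to deal with.

```c
#include <stdint.h>

/* Hypothetical model of a post-increment helper: hand back the old
 * stack pointer and advance it by inc bytes. */
static uint64_t sp_post_inc_model(uint64_t *sp, uint64_t inc)
{
    uint64_t old = *sp;

    *sp += inc;
    return old;
}

/* Stack pointer after a near RET: the return address slot is popped and
 * imm16 further bytes are released (imm16 is 0 for the plain form). */
uint64_t near_ret_new_sp(uint64_t sp, unsigned int op_bytes, uint16_t imm16)
{
    (void)sp_post_inc_model(&sp, op_bytes + imm16);
    return sp;
}
```

Because the immediate is zero-extended, the largest value 0xffff releases 65535 bytes rather than moving the stack pointer backwards, which is why a sign-extending fetch would be wrong here.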



 

Re: [Xen-devel] [PATCH] common/vm_event: Fix comment

2016-08-11 Thread Razvan Cojocaru
On 08/11/2016 01:38 PM, Jan Beulich wrote:
 On 09.08.16 at 10:32,  wrote:
>> --- a/xen/common/vm_event.c
>> +++ b/xen/common/vm_event.c
>> @@ -255,7 +255,7 @@ static inline void vm_event_release_slot(struct domain *d,
>>  
>>  /*
>>   * vm_event_mark_and_pause() tags vcpu and put it to sleep.
>> - * The vcpu will resume execution in vm_event_wake_waiters().
>> + * The vcpu will resume execution in vm_event_wake().
>>   */
>>  void vm_event_mark_and_pause(struct vcpu *v, struct vm_event_domain *ved)
>>  {
> 
> I was about to commit this without further waiting for an ack, as
> being supposedly trivial, but then I checked and also found
> vm_event_wake_{blocked,queued}(), and now I'm not sure
> whether the reference wouldn't better be to
> vm_event_wake_blocked(). Could you clarify that for me please?

Indeed, that's more precise. I'm happy to send V2 if you'd like.


Thanks,
Razvan



Re: [Xen-devel] [Minios-devel] [PATCH v3 00/19] mini-os: support of auto-ballooning

2016-08-11 Thread Wei Liu
Series pushed, except for the last patch that changed the build system
because it is not yet acked or reviewed.

Note that patch #16 and #17 are resent in another thread.

Thank you both.

Wei.



Re: [Xen-devel] [PATCH 4/9] x86/pv: Implement pv_hypercall() in C

2016-08-11 Thread Jan Beulich
>>> On 11.08.16 at 13:57,  wrote:
> On 02/08/16 14:12, Jan Beulich wrote:
> On 18.07.16 at 11:51,  wrote:
>>> +mov   %rsp, %rdi
>>> +call  pv_hypercall
>>>  movl  %eax,UREGS_rax(%rsp)   # save the return value
>> To follow the HVM model, this should also move into C.
> 
> Having tried this, I can't.
> 
> Using regs->eax = -ENOSYS; in C results in the upper 32bits of UREGS_rax
> set on the stack, and nothing else re-clobbers this.

I don't understand - why would this need "re-clobbering"? Hypercalls
are assumed to return longs, i.e. full 64 bits of data. Even in case
you say this to just the compat variant, my original comment was of
course meant for both, and in the compat case I don't see why the
upper half of RAX would be of any interest, considering the guest
can't look at it anyway.

> It highlights a second bug present in the hvm side, and propagated to
> the pv side.
> 
> Currently, eax gets truncated when reading out of the registers, before
> it is bounds-checked against NR_hypercalls.  For the HVM side, all this
> does is risk aliasing if upper bits are set, but for the PV side, it
> will cause a failure for a compat guest issuing a hypercall after
> previously receiving an error.

And again I don't understand - when we look at only the low 32 bits,
how would a prior error matter?

> I am proposing the following change to the HVM side to compensate, and
> to leave the asm adjustment of UREGS_rax in place.
> 
> ~Andrew
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index e2bb58a..69315d1 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4132,11 +4132,11 @@ int hvm_do_hypercall(struct cpu_user_regs *regs)
>  struct domain *currd = curr->domain;
>  struct segment_register sreg;
>  int mode = hvm_guest_x86_mode(curr);
> -uint32_t eax = regs->eax;
> +unsigned long eax = regs->eax;

That would have the potential of breaking 32-bit callers, as we
mustn't rely on the upper halves of registers when coming out of
compat mode. IOW this would need to be made mode dependent,
and even then I'm afraid this has a (however small) potential of
breaking existing callers (but I agree that from a pure bug fix pov
this change should be made for the 64-bit path, as the PV code
looks at the full 64 bits too).

Jan
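
The aliasing and mode dependence discussed above can be sketched in isolation. The following is only a model: NR_HYPERCALLS_MODEL and both function names are made up for illustration, and the real bounds check lives in the hypercall dispatch code.

```c
#include <stdint.h>

/* Hypothetical table size, used only by this sketch. */
#define NR_HYPERCALLS_MODEL 64u

/* Truncating before the bounds check: upper bits are silently dropped,
 * so an out-of-range 64-bit value can alias a valid index. */
int index_ok_truncated(uint64_t rax)
{
    uint32_t eax = (uint32_t)rax;

    return eax < NR_HYPERCALLS_MODEL;
}

/* Mode-dependent check: a 64-bit caller is judged on the full register,
 * while a 32-bit caller's upper half is ignored because it is not
 * guaranteed to be meaningful. */
int index_ok_by_mode(uint64_t rax, int caller_is_64bit)
{
    uint64_t index = caller_is_64bit ? rax : (uint32_t)rax;

    return index < NR_HYPERCALLS_MODEL;
}
```

With rax = 0x100000001 the truncating check accepts the value as index 1, whereas the mode-dependent check rejects it for a 64-bit caller and still accepts it for a 32-bit one.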




Re: [Xen-devel] [PATCH] common/vm_event: Fix comment

2016-08-11 Thread Jan Beulich
>>> On 11.08.16 at 14:10,  wrote:
> On 08/11/2016 01:38 PM, Jan Beulich wrote:
> On 09.08.16 at 10:32,  wrote:
>>> --- a/xen/common/vm_event.c
>>> +++ b/xen/common/vm_event.c
>>> @@ -255,7 +255,7 @@ static inline void vm_event_release_slot(struct domain *d,
>>>  
>>>  /*
>>>   * vm_event_mark_and_pause() tags vcpu and put it to sleep.
>>> - * The vcpu will resume execution in vm_event_wake_waiters().
>>> + * The vcpu will resume execution in vm_event_wake().
>>>   */
>>>  void vm_event_mark_and_pause(struct vcpu *v, struct vm_event_domain *ved)
>>>  {
>> 
>> I was about to commit this without further waiting for an ack, as
>> being supposedly trivial, but then I checked and also found
>> vm_event_wake_{blocked,queued}(), and now I'm not sure
>> whether the reference wouldn't better be to
>> vm_event_wake_blocked(). Could you clarify that for me please?
> 
> Indeed, that's more precise. I'm happy to send V2 if you'd like.

I can as well adjust it while committing.

Jan




Re: [Xen-devel] [PATCH 7/9] x86/pv: Merge the pv hypercall tables

2016-08-11 Thread Andrew Cooper
On 03/08/16 16:07, Jan Beulich wrote:
 On 18.07.16 at 11:51,  wrote:
>> For the same reason as c/s 33a231e3f "x86/HVM: fold hypercall tables", this
>> removes the risk of accidentally updating only one of the tables.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 
>
> But having come here I still can't see why this can't be folded with
> patch 5 without also folding in patch 6. Anyway - as long as they're
> going to get committed without too big of a time gap in between,
> the final result is what matters most.

References to hypercall_table and compat_hypercall_table are buried in
the multicall inline assembler.

Folding the tables first involves complicated changes, just to be taken
out one patch later.

The risk of accidentally breaking bisectability is greater than the
downside of splitting the patches up a bit.  (Either way, they will be
committed together.)

~Andrew



[Xen-devel] [PATCH] Xen: remove -fshort-wchar gcc flag

2016-08-11 Thread Arnd Bergmann
A previous patch added the --no-wchar-size-warning to the Makefile to
avoid this harmless warning:

arm-linux-gnueabi-ld: warning: drivers/xen/efi.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail

Changing kbuild to use thin archives instead of recursive linking
unfortunately brings the same warning back during the final link.

This time, we remove the -fshort-wchar flag that originally caused
the warning, hopefully fixing the problem for good. I don't see
any reason for having the flag in the first place, as the Xen code
does not use wchar_t at all.

Signed-off-by: Arnd Bergmann 
Fixes: 971a69db7dc0 ("Xen: don't warn about 2-byte wchar_t in efi")
---
On Thursday, August 11, 2016 8:16:14 PM CEST Nicholas Piggin wrote:
> Hi,
> 
> I would like to submit the kbuild changes in patches 1-3 for
> consideration.
> 
> I've taken on the feedback, so thanks everybody for that. The
> biggest change since last time is a more general way for
> architectures to do a post-link pass in patch 3.
> 
> On the question of whether to enable thin archives unconditionally,
> I prefer to have architectures enable them as they are tested. But
> I would like to see everybody moved as soon as possible and the
> incremental linking removed.

It would be nice to get this patch merged along with the thin
archive conversion, either by merging it through the xen
tree, or by making it part of Nick's series with an Ack
from the xen maintainers.

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 8feab810aed9..7f188b8d0c67 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -7,9 +7,6 @@ obj-y   += xenbus/
 nostackp := $(call cc-option, -fno-stack-protector)
 CFLAGS_features.o  := $(nostackp)
 
-CFLAGS_efi.o   += -fshort-wchar
-LDFLAGS += $(call ld-option, --no-wchar-size-warning)
-
 dom0-$(CONFIG_ARM64) += arm-device.o
 dom0-$(CONFIG_PCI) += pci.o
 dom0-$(CONFIG_USB_SUPPORT) += dbgp.o




Re: [Xen-devel] [PATCH] Xen: remove -fshort-wchar gcc flag

2016-08-11 Thread Jan Beulich
>>> On 11.08.16 at 14:39,  wrote:
> A previous patch added the --no-wchar-size-warning to the Makefile to
> avoid this harmless warning:
> 
> arm-linux-gnueabi-ld: warning: drivers/xen/efi.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail
> 
> Changing kbuild to use thin archives instead of recursive linking
> unfortunately brings the same warning back during the final link.
> 
> This time, we remove the -fshort-wchar flag that originally caused
> the warning, hopefully fixing the problem for good. I don't see
> any reason for having the flag in the first place, as the Xen code
> does not use wchar_t at all.

It uses efi_char16_t, and by dropping -fshort-wchar you'd open
up a trap for anyone to fall into who were to add wide string
literals to that same file. EFI using 16-bit characters requires
code interfacing with EFI to do so too.

Jan
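
A small standalone sketch of the trap described above: code that talks to EFI works with 16-bit code units, while a wide string literal has element type wchar_t, which GCC makes 4 bytes on Linux targets unless -fshort-wchar is given. The type efi_char16_model_t and both helpers are hypothetical stand-ins for illustration, not kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit EFI character type. */
typedef uint16_t efi_char16_model_t;

/* Length, in 16-bit code units, of a NUL-terminated 16-bit string. */
size_t str16_len(const efi_char16_model_t *s)
{
    size_t n = 0;

    while ( s[n] )
        n++;
    return n;
}

/* Non-zero only if a wide literal (L"...") has 16-bit elements and could
 * therefore be handed to 16-bit string code without conversion. */
int wide_literals_are_16bit(void)
{
    return sizeof(wchar_t) == sizeof(efi_char16_model_t);
}
```

Built without -fshort-wchar on a typical Linux toolchain, wide_literals_are_16bit() returns 0, so a wide literal added to such a file would no longer match the 16-bit layout EFI expects. Spelling strings out as explicit 16-bit arrays avoids depending on the flag.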



