Re: [Xen-devel] [PATCH] ARM: xen: only set pm function ptrs for Xen guests

2013-08-28 Thread Julien Grall
On 08/28/2013 05:19 PM, Rob Herring wrote:
> From: Rob Herring 
> 
> xen_pm_init was unconditionally setting pm_power_off and arm_pm_restart
> function pointers. This breaks multi-platform kernels. Move this
> initialization into xen_guest_init, so it is conditional on running as a
> Xen guest.
> 
> Cc: Stefano Stabellini 
> Signed-off-by: Rob Herring 
> ---
> This breaks reset and poweroff for Midway when Xen is enabled. This 
> should go into 3.11 or stable.
> Rob
> 
>  arch/arm/xen/enlighten.c | 12 +++-
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 8a6295c..fa86452 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -263,6 +263,9 @@ static int __init xen_guest_init(void)
>   if (xen_vcpu_info == NULL)
>   return -ENOMEM;
>  
> + pm_power_off = xen_power_off;
> + arm_pm_restart = xen_restart;
> +

I think it's too early to set the pm callbacks here. If Linux is running
as dom0, Xen needs to override the platform's power management callbacks.
Otherwise, dom0 could shut down or restart the whole platform.
For instance, on the Versatile Express, the power management callbacks
are set very late (i.e. during driver initialization).

The pm callbacks should be set from a late initcall that first checks
whether Xen is enabled.
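
To illustrate the ordering problem, here is a minimal userspace sketch, not
kernel code. It assumes a platform driver that installs its own power-off
hook at device-init time; platform_driver_init(), running_on_xen and
boot_and_power_off() are hypothetical names invented for this model, and
running_on_xen merely stands in for a check such as xen_domain().

```c
#include <stddef.h>

/* Userspace model of the kernel's pm_power_off function pointer. */
typedef void (*pm_fn)(void);
static pm_fn pm_power_off;
static const char *last_called;

static void platform_power_off(void) { last_called = "platform"; }
static void xen_power_off(void) { last_called = "xen"; }

/* Hypothetical stand-in for a "are we a Xen guest?" check. */
static int running_on_xen;

/* Models board code that sets the hook during driver initialization. */
static void platform_driver_init(void)
{
	pm_power_off = platform_power_off;
}

/* Models xen_pm_init(): only take over the hook when running on Xen. */
static int xen_pm_init(void)
{
	if (!running_on_xen)
		return -1;
	pm_power_off = xen_power_off;
	return 0;
}

/*
 * Run the modelled initcalls in order and report which power-off handler
 * ends up being called. xen_hook_is_late selects whether xen_pm_init()
 * runs before or after the platform driver.
 */
static const char *boot_and_power_off(int on_xen, int xen_hook_is_late)
{
	running_on_xen = on_xen;
	pm_power_off = NULL;
	last_called = "none";

	if (!xen_hook_is_late)
		xen_pm_init();		/* early: gets overwritten below */
	platform_driver_init();
	if (xen_hook_is_late)
		xen_pm_init();		/* late: has the final word */

	if (pm_power_off)
		pm_power_off();
	return last_called;
}
```

In this model, boot_and_power_off(1, 0) ends in the platform handler even
though we are on Xen, while boot_and_power_off(1, 1) ends in the Xen one and
boot_and_power_off(0, 1) leaves the platform handler untouched.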

>   gnttab_init();
>   if (!xen_initial_domain())
>   xenbus_probe(NULL);
> @@ -271,15 +274,6 @@ static int __init xen_guest_init(void)
>  }
>  core_initcall(xen_guest_init);
>  
> -static int __init xen_pm_init(void)
> -{
> - pm_power_off = xen_power_off;
> - arm_pm_restart = xen_restart;
> -
> - return 0;
> -}
> -subsys_initcall(xen_pm_init);
> -
>  static irqreturn_t xen_arm_callback(int irq, void *arg)
>  {
>   xen_hvm_evtchn_do_upcall();
> 


-- 
Julien Grall
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ARM: xen: only set pm function ptrs for Xen guests

2013-08-28 Thread Julien Grall
On 28 August 2013 19:20, Rob Herring  wrote:
> From: Rob Herring 
>
> xen_pm_init was unconditionally setting pm_power_off and arm_pm_restart
> function pointers. This breaks multi-platform kernels. Make this
> conditional on running as a Xen guest and make it a late_initcall to
> ensure it is setup after platform code for Dom0.
>
> Cc: Stefano Stabellini 
> Signed-off-by: Rob Herring 
> ---
>  arch/arm/xen/enlighten.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 8a6295c..13a7d1f 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -273,12 +273,15 @@ core_initcall(xen_guest_init);
>
>  static int __init xen_pm_init(void)
>  {
> +   if (!of_find_compatible_node(NULL, NULL, "xen,xen"))
> +   return -ENODEV;
> +

You should use the macro xen_domain() to check if we are running
in a Xen guest.

Cheers,

-- 
Julien Grall


Re: [Xen-devel] [PATCH] xen/hvc-console: Make it work with HVM guests.

2013-10-06 Thread Julien Grall

On 09/30/2013 03:45 PM, Konrad Rzeszutek Wilk wrote:

On Fri, Sep 27, 2013 at 10:49:37PM +0100, Julien Grall wrote:

On 09/27/2013 10:25 PM, Konrad Rzeszutek Wilk wrote:


@@ -641,7 +641,20 @@ struct console xenboot_console = {

  void xen_raw_console_write(const char *str)
  {
-   dom0_write_console(0, str, strlen(str));
+   ssize_t len = strlen(str);
+   int rc = 0;
+
+   if (xen_domain()) {
+   dom0_write_console(0, str, len);
+   if (rc == -ENOSYS && xen_hvm_domain())
+   goto outb_print;
+
+   } else if (xen_cpuid_base()) {
+   int i;
+outb_print:
+   for (i = 0; i < len; i++)
+   outb(str[i], 0xe9);
+   }
  }


xen_cpuid_base and outb(0xe9) is x86 specific and won't compile on ARM.


Odd, I see outb defined in arch/arm and arch/arm64?
(arch/arm[|64]/include/asm/io.h)


On ARM32 the IO access is memory mapped (the exact address depends on the
Linux configuration).
The main problem is not the outb macro but the ioport 0xe9. On ARM, this
ioport is not trapped by Xen.



You are of course right about xen_cpuid_base.

How about this:


For the ARM side, the code looks good.


 From 04b772d2b819f0dda2163e3193fa7cd447a6245c Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Fri, 27 Sep 2013 17:18:13 -0400
Subject: [PATCH] xen/hvc: If we use xen_raw_printk let it also work on HVM
  guests.

The xen_raw_printk works great for debugging purposes. We use it
for PV guests and we can also use it for HVM guests.

However, for HVM guests we have a fallback of using the 0xe9
port in case the hypervisor does not support HVM guests
using the console_io hypercall. As such, let's use 0xe9 during
early bootup, and once the hyper-page is set up and if the
console_io hypercall is supported, use that. Otherwise we
will fall back to using 0xe9 after early bootup.

We also alter the return value for dom0_write_console to return
an error value instead of zero. The HVC API has been supporting
returning error values for quite some time.

P.S.
To use (and to see the output in the Xen ring buffer) one has to build
the hypervisor with 'debug=y'.

Signed-off-by: Konrad Rzeszutek Wilk 
[v1: ifdef xen_cpuid_base as it is X86 specific]
---
  drivers/tty/hvc/hvc_xen.c | 19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index e61c36c..6458c9f 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -183,7 +183,7 @@ static int dom0_write_console(uint32_t vtermno, const char *str, int len)
  {
int rc = HYPERVISOR_console_io(CONSOLEIO_write, len, (char *)str);
if (rc < 0)
-   return 0;
+   return rc;

return len;
  }
@@ -641,7 +641,22 @@ struct console xenboot_console = {

  void xen_raw_console_write(const char *str)
  {
-   dom0_write_console(0, str, strlen(str));
+   ssize_t len = strlen(str);
+   int rc = 0;
+
+   if (xen_domain()) {
+   rc = dom0_write_console(0, str, len);
+#ifdef CONFIG_X86
+   if (rc == -ENOSYS && xen_hvm_domain())
+   goto outb_print;
+
+   } else if (xen_cpuid_base()) {
+   int i;
+outb_print:
+   for (i = 0; i < len; i++)
+   outb(str[i], 0xe9);
+#endif
+   }
  }

  void xen_raw_printk(const char *fmt, ...)
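
The decision logic of the patch above can be modelled in plain userspace C.
This is only a sketch under stated assumptions: struct env, enum sink and the
model_* functions are hypothetical, the is_x86 field replaces the compile-time
#ifdef CONFIG_X86 with a runtime flag, and no hypercall or port I/O is
actually performed.

```c
#include <errno.h>
#include <string.h>

/* Where the message ends up in this model. */
enum sink { SINK_NONE, SINK_HYPERCALL, SINK_PORT_E9 };

/* Hypothetical description of the environment; not a kernel structure. */
struct env {
	int is_x86;       /* models #ifdef CONFIG_X86 */
	int xen_domain;   /* models xen_domain() */
	int hvm_domain;   /* models xen_hvm_domain() */
	int cpuid_base;   /* models xen_cpuid_base() != 0 */
	int hypercall_rc; /* 0 on success, negative errno on failure */
};

/* Models dom0_write_console() after the patch: errors are propagated. */
static int model_write_console(const struct env *e, int len)
{
	return e->hypercall_rc < 0 ? e->hypercall_rc : len;
}

static enum sink model_raw_console_write(const struct env *e, const char *str)
{
	int len = (int)strlen(str);

	if (e->xen_domain) {
		int rc = model_write_console(e, len);

		if (rc >= 0)
			return SINK_HYPERCALL;
		/* Only x86 HVM guests have the 0xe9 port to fall back on. */
		if (e->is_x86 && rc == -ENOSYS && e->hvm_domain)
			return SINK_PORT_E9;
		return SINK_NONE;
	}

	/* Early x86 boot: Xen is visible via CPUID but not yet set up. */
	if (e->is_x86 && e->cpuid_base)
		return SINK_PORT_E9;

	return SINK_NONE;
}
```

With is_x86 cleared, which corresponds to an ARM build, the 0xe9 path can
never be selected, which is the behaviour the #ifdef is meant to guarantee.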




--
Julien Grall


Re: [Xen-devel] [PATCH v5 02/13] arm: introduce a global dma_ops pointer

2013-09-02 Thread Julien Grall
On 08/29/2013 07:32 PM, Stefano Stabellini wrote:
> Initially set dma_ops to arm_dma_ops.
> 
> 
> Signed-off-by: Stefano Stabellini 
> Acked-by: Konrad Rzeszutek Wilk 
> CC: will.dea...@arm.com
> CC: li...@arm.linux.org.uk
> 
> 
> Changes in v3:
> -  keep using arm_dma_ops in dmabounce.
> ---
>  arch/arm/include/asm/dma-mapping.h |3 ++-
>  arch/arm/mm/dma-mapping.c  |3 +++
>  2 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h 
> b/arch/arm/include/asm/dma-mapping.h
> index 0982206..7d6e4f9 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -13,6 +13,7 @@
>  #include 
>  
>  #define DMA_ERROR_CODE   (~0)
> +extern struct dma_map_ops *dma_ops;

Hi,

I tried to build your swiotlb patch series for the Arndale. I get a compilation
error because the name dma_ops is already used in the Samsung sound driver
(sound/soc/samsung/dma.c).

This small fix allows me to build this series for the Arndale.
Do I need to send it separately?

=======
commit 73d4ceded87f52fa958b92d8d8d65be485e90857
Author: Julien Grall 
Date:   Mon Sep 2 15:36:35 2013 +0100

ASoC: Samsung: Rename dma_ops by samsung_dma_ops

The commit "arm: introduce a global dma_ops pointer" introduces a compilation
issue when CONFIG_SND_SOC_SAMSUNG is enabled.

sound/soc/samsung/dma.c:345:27: error: conflicting types for 'dma_ops'

/local/home/julien/works/arndale/linux/arch/arm/include/asm/dma-mapping.h:16:28:
note: previous declaration of 'dma_ops' was here

Signed-off-by: Julien Grall 

diff --git a/sound/soc/samsung/dma.c b/sound/soc/samsung/dma.c
index ddea134..c341603 100644
--- a/sound/soc/samsung/dma.c
+++ b/sound/soc/samsung/dma.c
@@ -342,7 +342,7 @@ static int dma_mmap(struct snd_pcm_substream *substream,
 runtime->dma_bytes);
 }
 
-static struct snd_pcm_ops dma_ops = {
+static struct snd_pcm_ops samsung_dma_ops = {
.open   = dma_open,
.close  = dma_close,
.ioctl  = snd_pcm_lib_ioctl,
@@ -429,7 +429,7 @@ out:
 }
 
 static struct snd_soc_platform_driver samsung_asoc_platform = {
-   .ops= &dma_ops,
+   .ops    = &samsung_dma_ops,
.pcm_new= dma_new,
.pcm_free   = dma_free_dma_buffers,
 };
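
For readers wondering why the rename is enough, here is a reduced,
self-contained sketch of the situation. The struct definitions are
hypothetical, heavily trimmed stand-ins for the kernel types, and everything
is placed in one file to mimic the driver including the header.

```c
/* Hypothetical, heavily reduced stand-ins for the kernel types. */
struct dma_map_ops { int dummy; };
struct snd_pcm_ops { int (*open)(void); };
struct snd_soc_platform_driver { struct snd_pcm_ops *ops; };

/* What the series adds to asm/dma-mapping.h: a global, file-scope name. */
static struct dma_map_ops arm_dma_ops;
extern struct dma_map_ops *dma_ops;
struct dma_map_ops *dma_ops = &arm_dma_ops;

static int dma_open(void)
{
	return 0;
}

/*
 * Calling this object dma_ops would redeclare the identifier above with a
 * different type and linkage, which the compiler rejects. A driver-specific
 * name avoids the clash.
 */
static struct snd_pcm_ops samsung_dma_ops = {
	.open = dma_open,
};

static struct snd_soc_platform_driver samsung_asoc_platform = {
	.ops = &samsung_dma_ops,
};
```

Renaming samsung_dma_ops back to dma_ops in this sketch should reproduce the
same class of error as the one quoted in the commit message.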

-- 
Julien Grall


Re: [Xen-devel] [PATCH] xen/hvc-console: Make it work with HVM guests.

2013-09-27 Thread Julien Grall

On 09/27/2013 10:25 PM, Konrad Rzeszutek Wilk wrote:


@@ -641,7 +641,20 @@ struct console xenboot_console = {

  void xen_raw_console_write(const char *str)
  {
-   dom0_write_console(0, str, strlen(str));
+   ssize_t len = strlen(str);
+   int rc = 0;
+
+   if (xen_domain()) {
+   dom0_write_console(0, str, len);
+   if (rc == -ENOSYS && xen_hvm_domain())
+   goto outb_print;
+
+   } else if (xen_cpuid_base()) {
+   int i;
+outb_print:
+   for (i = 0; i < len; i++)
+   outb(str[i], 0xe9);
+   }
  }


xen_cpuid_base and outb(0xe9) are x86-specific and won't compile on ARM.
You need to wrap them in an #ifdef.

--
Julien Grall


[PATCH] arm: choose debug/uncompress.h include when uncompress debug is disabled

2013-07-15 Thread Julien Grall
Even if uncompress debug is disabled, some boards will continue to print
information during the uncompress step.
By using the debug/uncompress.h include, all debug output will be disabled.

This is useful in a Xen environment for dom0 because the UART is taken by
Xen.

Signed-off-by: Julien Grall 
---
 arch/arm/Kconfig.debug |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 21cc8a7..86c023d 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -679,7 +679,7 @@ config DEBUG_UNCOMPRESS
 
 config UNCOMPRESS_INCLUDE
string
-   default "debug/uncompress.h" if ARCH_MULTIPLATFORM
+   default "debug/uncompress.h" if ARCH_MULTIPLATFORM || !DEBUG_UNCOMPRESS
default "mach/uncompress.h"
 
 config EARLY_PRINTK
-- 
1.7.10.4



[PATCH] xen/arm: disable cpuidle when linux is running as dom0

2013-07-15 Thread Julien Grall
When Linux is running as dom0, Xen doesn't expose the physical CPUs but
virtual CPUs.
On some ARM SoCs (for instance the Exynos 5250), Linux registers callbacks
for cpuidle. When these callbacks are called, they directly modify
the physical CPU, not the virtual one. This can impact the whole board
instead of only dom0.

Signed-off-by: Julien Grall 
---
 arch/arm/xen/enlighten.c |7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 49839d8..a98999f 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 
@@ -292,6 +294,11 @@ static int __init xen_pm_init(void)
 {
pm_power_off = xen_power_off;
arm_pm_restart = xen_restart;
+   /*
+* Making sure board specific code will not set up ops for
+* cpu idle.
+*/
+   disable_cpuidle();
 
return 0;
 }
-- 
1.7.10.4



[PATCH] xen/arm: enable PV control for ARM

2013-07-15 Thread Julien Grall
Enable power management from the toolstack for ARM guest.

Signed-off-by: Julien Grall 
---
 drivers/xen/Makefile |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 2bf461a..a5f12bd 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,9 +1,8 @@
 ifneq ($(CONFIG_ARM),y)
-obj-y  += manage.o
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 endif
 obj-$(CONFIG_X86)  += fallback.o
-obj-y  += grant-table.o features.o events.o balloon.o time.o
+obj-y  += grant-table.o features.o events.o balloon.o time.o manage.o
 obj-y  += xenbus/
 
 nostackp := $(call cc-option, -fno-stack-protector)
-- 
1.7.10.4



[PATCH] xen/control: protect functions with CONFIG_HIBERNATE_CALLBACKS to avoid warning

2013-07-15 Thread Julien Grall
If CONFIG_HIBERNATE_CALLBACKS is not set, gcc will issue warnings:
drivers/xen/manage.c:46:13: warning: 'xen_hvm_post_suspend' defined but not used [-Wunused-function]
drivers/xen/manage.c:52:13: warning: 'xen_pre_suspend' defined but not used [-Wunused-function]
drivers/xen/manage.c:59:13: warning: 'xen_post_suspend' defined but not used [-Wunused-function]

Signed-off-by: Julien Grall 
---
 drivers/xen/manage.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 412b96c..7680276 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -36,6 +36,7 @@ enum shutdown_state {
 /* Ignore multiple shutdown requests. */
 static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
 
+#ifdef CONFIG_HIBERNATE_CALLBACKS
 struct suspend_info {
int cancelled;
unsigned long arg; /* extra hypercall argument */
@@ -63,7 +64,6 @@ static void xen_post_suspend(int cancelled)
xen_mm_unpin_all();
 }
 
-#ifdef CONFIG_HIBERNATE_CALLBACKS
 static int xen_suspend(void *data)
 {
struct suspend_info *si = data;
-- 
1.7.10.4



Re: [Xen-devel] [PATCH] xen/control: protect functions with CONFIG_HIBERNATE_CALLBACKS to avoid warning

2013-07-15 Thread Julien Grall
Sorry, I forgot to keep the CC list.

On 07/15/2013 04:47 PM, Julien Grall wrote:
> On 07/15/2013 04:27 PM, Konrad Rzeszutek Wilk wrote:
>> On Mon, Jul 15, 2013 at 03:24:35PM +0100, Julien Grall wrote:
>>> If CONFIG_HIBERNATE_CALLBACKS is not set gcc will issue warnings:
>>> drivers/xen/manage.c:46:13: warning: 'xen_hvm_post_suspend' defined but not 
>>> used [-Wunused-function]
>>> drivers/xen/manage.c:52:13: warning: 'xen_pre_suspend' defined but not used 
>>> [-Wunused-function]
>>> drivers/xen/manage.c:59:13: warning: 'xen_post_suspend' defined but not 
>>> used [-Wunused-function]
>>
>> Have you checked the upstream kernel?
> 
> My apologies, I forgot to check upstream for this patch.
> 



Re: [Xen-devel] [PATCH] xen/arm: disable cpuidle when linux is running as dom0

2013-07-15 Thread Julien Grall
On 07/15/2013 04:25 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jul 15, 2013 at 03:21:41PM +0100, Julien Grall wrote:
>> When linux is running as dom0, Xen doesn't show the physical cpu but a
>> virtual CPU.
>> On some ARM SOC (for instance the exynos 5250), linux registers callbacks
>> for cpuidle. When these callbacks are called, they will modify
>> directly the physical cpu not the virtual one. It can impact the whole board
>> instead of dom0.
> 
> Should you also call disable_cpufreq() ?

I had some issues on the Versatile Express when cpufreq was disabled.
I will give it another try and report the exact error.

-- 
Julien



[PATCH] xen/arm: missing put_cpu in xen_percpu_init

2013-07-29 Thread Julien Grall
When CONFIG_PREEMPT is enabled, Linux fails to boot and warns:
[4.127825] ------------[ cut here ]------------
[4.133376] WARNING: at init/main.c:699 do_one_initcall+0x150/0x158()
[4.140738] initcall xen_init_events+0x0/0x10c returned with preemption imbalance

This is because xen_percpu_init uses get_cpu but doesn't have the corresponding
put_cpu.

Signed-off-by: Julien Grall 
---
 arch/arm/xen/enlighten.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index f71c37e..dc9f284 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -170,6 +170,7 @@ static void __init xen_percpu_init(void *unused)
per_cpu(xen_vcpu, cpu) = vcpup;
 
enable_percpu_irq(xen_events_irq, 0);
+   put_cpu();
 }
 
 static void xen_restart(char str, const char *cmd)
-- 
1.7.10.4
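
For reference, the imbalance can be reproduced with a small userspace model.
This is only a sketch: preempt_count here is a plain counter,
get_cpu()/put_cpu() are reduced to incrementing and decrementing it, and
initcall_leaves_imbalance() is a hypothetical helper that mimics the check
do_one_initcall() performs around each initcall.

```c
/* Userspace model of the per-task preemption counter. */
static int preempt_count;

/* get_cpu() disables preemption and returns the current CPU (0 here). */
static int get_cpu(void)
{
	preempt_count++;
	return 0;
}

/* put_cpu() re-enables preemption. */
static void put_cpu(void)
{
	preempt_count--;
}

/* Before the patch: get_cpu() without a matching put_cpu(). */
static int buggy_percpu_init(void)
{
	int cpu = get_cpu();

	(void)cpu;	/* per-CPU setup would happen here */
	return 0;
}

/* After the patch: the pair is balanced. */
static int fixed_percpu_init(void)
{
	int cpu = get_cpu();

	(void)cpu;	/* per-CPU setup would happen here */
	put_cpu();
	return 0;
}

/*
 * Returns 1 if the initcall changed the preemption count, which is the
 * condition reported as "preemption imbalance". The count is restored
 * afterwards so later calls start from a clean state.
 */
static int initcall_leaves_imbalance(int (*fn)(void))
{
	int before = preempt_count;

	fn();
	if (preempt_count != before) {
		preempt_count = before;
		return 1;
	}
	return 0;
}
```

In this model initcall_leaves_imbalance(buggy_percpu_init) reports an
imbalance, while initcall_leaves_imbalance(fixed_percpu_init) does not.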



Re: [PATCH] arm: choose debug/uncompress.h include when uncompress debug is disabled

2013-07-18 Thread Julien Grall
On 17 July 2013 14:25, Stefano Stabellini
 wrote:
> On Mon, 15 Jul 2013, Julien Grall wrote:
>> Even if uncompress debug is disabled, some board will continue to print
>> information during uncompress step.
>
> Are you talking about DEBUG_UNCOMPRESS?
> Should I read the sentence as "even if DEBUG_UNCOMPRESS is not selected,
> some board will continue to print information during the uncompress step"?

Yes. On the Arndale, the uncompress log is output directly on UART-2.
This is annoying because Xen doesn't expose that UART to dom0.

> Isn't this a bug in the platform specific code that should be fixed anyway?
>
>> By using debug/uncompress.h include, all debug output will be disabled.
>
> I am not sure if this is the right solution to the problem.
> I think it might be better to add the appropriate ifdefs into the
> platform specific code.

--
Julien Grall


Re: [PATCH] arm: choose debug/uncompress.h include when uncompress debug is disabled

2013-07-18 Thread Julien Grall
On 07/17/2013 04:11 PM, Russell King - ARM Linux wrote:
> On Wed, Jul 17, 2013 at 02:25:38PM +0100, Stefano Stabellini wrote:
>> On Mon, 15 Jul 2013, Julien Grall wrote:
>>> Even if uncompress debug is disabled, some board will continue to print
>>> information during uncompress step.
>>
>> Are you talking about DEBUG_UNCOMPRESS?
>> Should I read the sentence as "even if DEBUG_UNCOMPRESS is not selected,
>> some board will continue to print information during the uncompress step"?
>>
>> Isn't this a bug in the platform specific code that should be fixed anyway?
> 
> Hang on, let's be clear what's going on here.
> 
> 1. The normal output from the decompressor is *not* debugging.  By that
>I mean the "Uncompressing kernel... done" message.  That is part of
>user output.
> 
> 2. In non-multiplatform environments, the decompressor will normally use
>the putc/flush functions found in arch/arm/mach-*/include/mach/uncompress.h
>to implement its output, irrespective of the DEBUG_UNCOMPRESS setting.
>(An interesting point is that DEBUG_UNCOMPRESS really should depend on
>MULTIPLATFORM so that this point is explicit - the option requires
>MULTIPLATFORM to be set.)
> 
> 3. DEBUG_UNCOMPRESS allows the functions which we've implemented for LL
>debug to be re-used for decompressor output.

When Xen boots, it uses one UART, chosen by the user, for its own
logging. Xen does not map the UART region or its IRQ to dom0, and the
UART is not present in dom0's device tree either. So if Linux tries to
access this memory region, it will crash.

When Linux boots as dom0, it uses either an hvc console or
another UART.

In multi-platform environments there is no issue: when
CONFIG_DEBUG_UNCOMPRESS is disabled, the decompressor produces no output.

In a non-multiplatform environment, however, the decompressor uses a
pre-defined UART. This UART may already be in use by Xen, and Linux will
abort at the first access.

I think the decompressor should either detect whether the UART exists
(I'm not sure that is possible) or allow the uncompress log to be
disabled at compile time.

Do you have any better ideas?

Cheers,

-- 
Julien



Re: [PATCH] xen/arm: enable PV control for ARM

2013-07-18 Thread Julien Grall
On 17 July 2013 14:40, Stefano Stabellini
 wrote:
> On Mon, 15 Jul 2013, Julien Grall wrote:
>> Enable power management from the toolstack for ARM guest.
>>
>> Signed-off-by: Julien Grall 
>
> Considering that now we support both ARM and ARM64, could you please
> add an ifneq ($(CONFIG_ARM64),y) too around cpu_hotplug.o, since you are
> at it?

Yes. I will resend the patch.

-- 
Julien Grall


Re: [Xen-devel] [PATCH] xen/arm: disable cpuidle when linux is running as dom0

2013-07-18 Thread Julien Grall
On 17 July 2013 14:28, Stefano Stabellini
 wrote:
> On Mon, 15 Jul 2013, Konrad Rzeszutek Wilk wrote:
>> On Mon, Jul 15, 2013 at 03:21:41PM +0100, Julien Grall wrote:
>> > When linux is running as dom0, Xen doesn't show the physical cpu but a
>> > virtual CPU.
>> > On some ARM SOC (for instance the exynos 5250), linux registers callbacks
>> > for cpuidle. When these callbacks are called, they will modify
>> > directly the physical cpu not the virtual one. It can impact the whole board
>> > instead of dom0.
>
> Certainly this is something that should be fixed at the hypervisor level
> too. However I agree that Linux should try to avoid doing that when
> running on Xen.
>
>
>
>> Should you also call disable_cpufreq() ?
>
> Sounds like a good idea.
> Julien, could you add that to this patch?

I have checked with both the Arndale and Versatile Express without any issue.
I will resend this patch with disable_cpufreq().

-- 
Julien Grall


[PATCH v2] xen/arm: disable cpuidle and cpufreq when linux is running as dom0

2013-07-19 Thread Julien Grall
When Linux is running as dom0, Xen doesn't expose the physical CPUs but
virtual CPUs.
On some ARM SoCs (for instance the Exynos 5250), Linux registers callbacks
for cpuidle and cpufreq. When these callbacks are called, they directly
modify the physical CPU, not the virtual one. This can impact the whole board
instead of only dom0.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Disable cpufreq
---
 arch/arm/xen/enlighten.c |8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 49839d8..af82792 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 
@@ -292,6 +294,12 @@ static int __init xen_pm_init(void)
 {
pm_power_off = xen_power_off;
arm_pm_restart = xen_restart;
+   /*
+* Making sure board specific code will not set up ops for
+* cpu idle and cpu freq.
+*/
+   disable_cpuidle();
+   disable_cpufreq();
 
return 0;
 }
-- 
1.7.10.4
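
The intended effect of the two disable calls can be pictured with a small
userspace model. This is a sketch under an assumption: it supposes that once
a subsystem has been disabled, later driver registrations are refused with
-ENODEV. struct subsystem and the model_* functions are hypothetical and do
not correspond to kernel symbols.

```c
#include <errno.h>

/* Hypothetical model of a subsystem with a global "off" switch. */
struct subsystem {
	int off;		/* set once the subsystem is disabled */
	int registered;		/* number of drivers accepted so far */
};

static struct subsystem model_cpuidle;
static struct subsystem model_cpufreq;

/* Models disable_cpuidle() / disable_cpufreq(). */
static void model_disable(struct subsystem *s)
{
	s->off = 1;
}

/* Models board code trying to register its driver later during boot. */
static int model_register_driver(struct subsystem *s)
{
	if (s->off)
		return -ENODEV;	/* refused: the subsystem was disabled */
	s->registered++;
	return 0;
}
```

Under that assumption, a board driver that registers after the disable call
gets -ENODEV and never installs callbacks that would touch the physical CPU.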



[PATCH v2] xen/arm: enable PV control for ARM

2013-07-19 Thread Julien Grall
Enable power management from the toolstack for ARM guest.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Don't compile xen/cpu_hotplug.o with ARM64
---
 drivers/xen/Makefile |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 2bf461a..b550a94 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,9 +1,8 @@
-ifneq ($(CONFIG_ARM),y)
-obj-y  += manage.o
+ifneq ($(filter y, ($CONFIG_ARM) $(CONFIG_ARM64)),)
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 endif
 obj-$(CONFIG_X86)  += fallback.o
-obj-y  += grant-table.o features.o events.o balloon.o time.o
+obj-y  += grant-table.o features.o events.o balloon.o time.o manage.o
 obj-y  += xenbus/
 
 nostackp := $(call cc-option, -fno-stack-protector)
-- 
1.7.10.4



Re: [PATCH v2] xen/arm: enable PV control for ARM

2013-07-21 Thread Julien Grall
On 21 July 2013 15:54, Stefano Stabellini
 wrote:
> On Fri, 19 Jul 2013, Julien Grall wrote:
>> Enable power management from the toolstack for ARM guest.
>>
>> Signed-off-by: Julien Grall 
>>
>> ---
>> Changes in v2:
>> - Don't compile xen/cpu_hotplug.o with ARM64
>> ---
>>  drivers/xen/Makefile |5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
>> index 2bf461a..b550a94 100644
>> --- a/drivers/xen/Makefile
>> +++ b/drivers/xen/Makefile
>> @@ -1,9 +1,8 @@
>> -ifneq ($(CONFIG_ARM),y)
>> -obj-y+= manage.o
>> +ifneq ($(filter y, ($CONFIG_ARM) $(CONFIG_ARM64)),y
>
> This is wrong: ifneq is checking for the opposite condition of what we want, 
> and
> beside you have the $ in the wrong place for CONFIG_ARM.
>
> Please test you patches before sending them.

My apologies. I tested it on ARM without any issue (most likely because of
the misplaced $).

I will send a new patch.
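
To make the difference between the conditions explicit, here is a small truth
table written as C functions. It is a sketch that assumes CONFIG_HOTPLUG_CPU=y
and that the make variable C is unset, so that "($CONFIG_ARM)" expands to the
literal word "(ONFIG_ARM)" and never matches y. The built_* names are
hypothetical.

```c
/*
 * Each function returns 1 if cpu_hotplug.o would be built for the given
 * configuration, 0 otherwise. arm and arm64 model CONFIG_ARM=y and
 * CONFIG_ARM64=y respectively.
 */

/* Original Makefile: ifneq ($(CONFIG_ARM),y) */
static int built_original(int arm, int arm64)
{
	(void)arm64;		/* ARM64 was not considered at all */
	return !arm;
}

/* v2: ifneq ($(filter y, ($CONFIG_ARM) $(CONFIG_ARM64)),) */
static int built_v2(int arm, int arm64)
{
	(void)arm;		/* the misplaced $ hides CONFIG_ARM */
	return arm64;		/* true only when the filter finds a y */
}

/* v3: ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),) */
static int built_v3(int arm, int arm64)
{
	return !(arm || arm64);	/* true only when neither is y */
}
```

In this model v2 skips cpu_hotplug.o on ARM, which is why the ARM build
looked fine, but it also skips it on x86 and builds it on ARM64, the opposite
of what is wanted.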

--
Julien Grall


[PATCH v3] xen/arm: enable PV control for ARM

2013-07-22 Thread Julien Grall
Enable power management from the toolstack for ARM guest.

Signed-off-by: Julien Grall 

---
Changes in v3:
- Fix condition to compile cpu_hotplug.o
Changes in v2:
- Don't compile xen/cpu_hotplug.o with ARM64
---
 drivers/xen/Makefile |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 2bf461a..f185e8d 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,9 +1,8 @@
-ifneq ($(CONFIG_ARM),y)
-obj-y  += manage.o
+ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),)
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 endif
 obj-$(CONFIG_X86)  += fallback.o
-obj-y  += grant-table.o features.o events.o balloon.o time.o
+obj-y  += grant-table.o features.o events.o balloon.o time.o manage.o
 obj-y  += xenbus/
 
 nostackp := $(call cc-option, -fno-stack-protector)
-- 
1.7.10.4



[PATCH] xen/arm64: Don't compile cpu hotplug

2013-07-22 Thread Julien Grall
On ARM64, when CONFIG_XEN=y, the compilation will fail because CPU hotplug is
not yet supported with XEN. For now, disable it.

Signed-off-by: Julien Grall 
---
 drivers/xen/Makefile |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 2bf461a..a609353 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,4 +1,4 @@
-ifneq ($(CONFIG_ARM),y)
+ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),)
 obj-y  += manage.o
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 endif
-- 
1.7.10.4



[PATCH v4] xen/arm: enable PV control for ARM

2013-07-22 Thread Julien Grall
Enable power management from the toolstack for ARM guest.

Signed-off-by: Julien Grall 

---
Changes in v4:
- Divide the patch in 2 distinct parts
Changes in v3:
- Fix condition to compile cpu_hotplug.o
Changes in v2:
- Don't compile xen/cpu_hotplug.o with ARM64
---
 drivers/xen/Makefile |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index a609353..f185e8d 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,9 +1,8 @@
 ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),)
-obj-y  += manage.o
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 endif
 obj-$(CONFIG_X86)  += fallback.o
-obj-y  += grant-table.o features.o events.o balloon.o time.o
+obj-y  += grant-table.o features.o events.o balloon.o time.o manage.o
 obj-y  += xenbus/
 
 nostackp := $(call cc-option, -fno-stack-protector)
-- 
1.7.10.4



Re: [PATCH v4] xen/arm: enable PV control for ARM

2013-07-23 Thread Julien Grall
On 07/23/2013 01:32 AM, Konrad Rzeszutek Wilk wrote:
> Julien Grall  wrote:
>> Enable power management from the toolstack for ARM guest.
>>
>> Signed-off-by: Julien Grall 
>>
>> ---
>>Changes in v4:
>>- Divide the patch in 2 distinct parts
>>Changes in v3:
>>- Fix condition to compile cpu_hotplug.o
>>Changes in v2:
>>- Don't compile xen/cpu_hotplug.o with ARM64
>> ---
>> drivers/xen/Makefile |3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
>> index a609353..f185e8d 100644
>> --- a/drivers/xen/Makefile
>> +++ b/drivers/xen/Makefile
>> @@ -1,9 +1,8 @@
>> ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),)
>> -obj-y   += manage.o
>> obj-$(CONFIG_HOTPLUG_CPU)+= cpu_hotplug.o
>> endif
>> obj-$(CONFIG_X86)+= fallback.o
>> -obj-y   += grant-table.o features.o events.o balloon.o time.o
>> +obj-y   += grant-table.o features.o events.o balloon.o time.o manage.o
>> obj-y+= xenbus/
>>
>> nostackp := $(call cc-option, -fno-stack-protector)
> 
> The patch looks Ok but the description is off.  Power management is the term 
> used for cpu freq,  C states and P states. While this patch touches none of 
> that. 
> 

What about : "Enable lifecycle (reboot, shutdown) management from the
toolstack for ARM guest"?

-- 
Julien



[PATCH v5] xen/arm: enable PV control for ARM

2013-07-23 Thread Julien Grall
Enable lifecycle management (reboot, shutdown...) from the toolstack
for ARM guests.

Signed-off-by: Julien Grall 

---
Changes in v5:
- Rework commit message
Changes in v4:
- Divide the patch in 2 distinct parts
Changes in v3:
- Fix condition to compile cpu_hotplug.o
Changes in v2:
- Don't compile xen/cpu_hotplug.o with ARM64
---
 drivers/xen/Makefile |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index a609353..f185e8d 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,9 +1,8 @@
 ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),)
-obj-y  += manage.o
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 endif
 obj-$(CONFIG_X86)  += fallback.o
-obj-y  += grant-table.o features.o events.o balloon.o time.o
+obj-y  += grant-table.o features.o events.o balloon.o time.o manage.o
 obj-y  += xenbus/
 
 nostackp := $(call cc-option, -fno-stack-protector)
-- 
1.7.10.4



Re: [PATCH] xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST when a VIRQ is bound

2013-04-30 Thread Julien Grall
On 04/30/2013 04:02 PM, Stefano Stabellini wrote:

> On Mon, 29 Apr 2013, Julien Grall wrote:
>> Reset the IRQ_NOAUTOEN and IRQ_NOREQUEST flags that are enabled by
>> default on ARM. If IRQ_NOAUTOEN is set, __setup_irq doesn't call
>> irq_startup, that is responsible for calling irq_unmask at startup time.
>> As a result event channels remain masked.
>>
>> The clear is already made in bind_evtchn_to_irq with commit a8636c0 but was
>> missing in bind_virq_to_irq.
>>
>> Signed-off-by: Julien Grall 
> 
> As in the original commit, you should point out that this change does
> not have any effects on x86 (where IRQ_NOREQUEST and IRQ_NOAUTOEN are
> cleared by default).
> 
> At this point we might as well do this consistently everywhere we
> allocate a new evtchn irq, including pirqs and ipis, even though we don't
> actually use them on ARM.
> 
> If the call to irq_clear_status_flags can be moved earlier, a good place
> for it could be xen_irq_init.


Thanks for the review. I will give it a try and send a new version if
the solution works.

-- 
Julien


[PATCH V2] xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST

2013-04-30 Thread Julien Grall
Reset the IRQ_NOAUTOEN and IRQ_NOREQUEST flags that are enabled by
default on ARM. If IRQ_NOAUTOEN is set, __setup_irq doesn't call
irq_startup, that is responsible for calling irq_unmask at startup time.
As a result event channels remain masked.

The clear is already made in bind_evtchn_to_irq with commit a8636c0 but was
missing from all the other bind_*_to_irq functions. Move the clear into
xen_irq_info_common_init.

On x86, IRQ_NOAUTOEN and IRQ_NOREQUEST are cleared by default, so this commit
doesn't impact this architecture.

Signed-off-by: Julien Grall 
---
 Changes since v1:
   - Specify this commit will not impact x86 in the comment
   - Clear flag directly in xen_irq_info_common_init, this function is
   called by all bind_*_to_irq

 drivers/xen/events.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index d8cc812..6a6bbe4 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -167,6 +167,8 @@ static void xen_irq_info_common_init(struct irq_info *info,
info->cpu = cpu;
 
evtchn_to_irq[evtchn] = irq;
+
+   irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN);
 }
 
 static void xen_irq_info_evtchn_init(unsigned irq,
@@ -874,7 +876,6 @@ int bind_evtchn_to_irq(unsigned int evtchn)
struct irq_info *info = info_for_irq(irq);
WARN_ON(info == NULL || info->type != IRQT_EVTCHN);
}
-   irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN);
 
 out:
mutex_unlock(&irq_mapping_update_lock);
-- 
1.7.10.4
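
The effect of the two flags described in the commit message can be sketched in plain user-space C. This is only a model under stated assumptions: the flag values, `struct model_irq` and every `model_*` function are hypothetical stand-ins, not the kernel's own definitions. It assumes a request is refused while IRQ_NOREQUEST is set and that the unmask at request time is skipped while IRQ_NOAUTOEN is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; the real ones are defined by the kernel. */
#define IRQ_NOREQUEST (1u << 0)
#define IRQ_NOAUTOEN  (1u << 1)

/* Hypothetical, heavily reduced stand-in for an IRQ descriptor. */
struct model_irq {
    unsigned int status; /* status flags */
    bool masked;         /* the event channel starts out masked */
};

/* Hypothetical stand-in for irq_clear_status_flags(). */
static void model_clear_status_flags(struct model_irq *irq, unsigned int clr)
{
    irq->status &= ~clr;
}

/*
 * Hypothetical stand-in for the request path: returns -1 if the IRQ may
 * not be requested, 0 otherwise. The unmask only happens when
 * IRQ_NOAUTOEN is clear, mirroring the irq_startup -> irq_unmask step
 * named in the commit message.
 */
static int model_request_irq(struct model_irq *irq)
{
    assert(irq != NULL);
    if (irq->status & IRQ_NOREQUEST)
        return -1;
    if (!(irq->status & IRQ_NOAUTOEN))
        irq->masked = false;
    return 0;
}

/* An IRQ in the default ARM state: both flags set, still masked. */
static struct model_irq model_arm_default_irq(void)
{
    struct model_irq irq = { IRQ_NOREQUEST | IRQ_NOAUTOEN, true };
    return irq;
}
```

With both flags cleared before the request, the model leaves the IRQ unmasked; with IRQ_NOAUTOEN still set, the request succeeds but the event channel stays masked, which is the symptom the patch describes.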



[PATCH] xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST when a VIRQ is bound

2013-04-29 Thread Julien Grall
Reset the IRQ_NOAUTOEN and IRQ_NOREQUEST flags that are enabled by
default on ARM. If IRQ_NOAUTOEN is set, __setup_irq doesn't call
irq_startup, that is responsible for calling irq_unmask at startup time.
As a result event channels remain masked.

The clear is already made in bind_evtchn_to_irq with commit a8636c0 but was
missing in bind_virq_to_irq.

Signed-off-by: Julien Grall 
---
 drivers/xen/events.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index d8cc812..b0ad226 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -994,6 +994,7 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
WARN_ON(info == NULL || info->type != IRQT_VIRQ);
}
 
+   irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN);
 out:
mutex_unlock(&irq_mapping_update_lock);
 
-- 
1.7.10.4



Re: [PATCH linux v3 3/9] xen: introduce xen_vcpu_id mapping

2016-09-07 Thread Julien Grall

Hi Vitaly,

On 07/09/2016 10:07, Vitaly Kuznetsov wrote:

Stefano Stabellini  writes:

I don't know that much about cpuid, but the virtual MPIDR is constructed
from the vcpu id right now:

v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);

[...]

static inline register_t vcpuid_to_vaffinity(unsigned int vcpuid)
{
register_t vaff;

vaff = (vcpuid & 0x0f) << MPIDR_LEVEL_SHIFT(0);
vaff |= ((vcpuid >> 4) & MPIDR_LEVEL_MASK) << MPIDR_LEVEL_SHIFT(1);

return vaff;
}


This could work but only in case there is a way to get MPIDR for _other_
cpu (e.g. CPU0 needs to get MPIDR of CPU1 when CPU1 is not yet running)
or we'll have to change the machinery of how we bring up secondary CPUs
- e.g. CPUn starts, writes its id somewhere and 'hangs' waiting for CPU0
to set up event channels.


You can get the MPIDR from both the device tree and ACPI. The firmware 
table is parsed at boot time and the value is stored in cpu_logical_map().


Regards,

--
Julien Grall
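
The vcpuid_to_vaffinity() snippet quoted above is easy to invert, which is what makes the MPIDR a candidate for recovering another CPU's vCPU id. Below is a minimal user-space sketch, assuming 8-bit affinity fields with Aff0 at bit 0 and Aff1 at bit 8; the `MODEL_*` macros and `model_*` functions are hypothetical names for illustration, and the inverse function is not something proposed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 8-bit affinity fields, Aff0 at bit 0, Aff1 at bit 8. */
#define MODEL_MPIDR_LEVEL_MASK     0xffu
#define MODEL_MPIDR_LEVEL_SHIFT(l) ((l) * 8u)

/* Same arithmetic as the quoted vcpuid_to_vaffinity(). */
static uint64_t model_vcpuid_to_vaffinity(unsigned int vcpuid)
{
    uint64_t vaff;

    /* Low 4 bits of the vCPU id go into Aff0. */
    vaff = (uint64_t)(vcpuid & 0x0f) << MODEL_MPIDR_LEVEL_SHIFT(0);
    /* The remaining bits go into Aff1. */
    vaff |= (uint64_t)((vcpuid >> 4) & MODEL_MPIDR_LEVEL_MASK)
            << MODEL_MPIDR_LEVEL_SHIFT(1);

    return vaff;
}

/* Hypothetical inverse: rebuild the vCPU id from Aff1:Aff0. */
static unsigned int model_vaffinity_to_vcpuid(uint64_t vaff)
{
    unsigned int aff0 = (unsigned int)(vaff >> MODEL_MPIDR_LEVEL_SHIFT(0))
                        & MODEL_MPIDR_LEVEL_MASK;
    unsigned int aff1 = (unsigned int)(vaff >> MODEL_MPIDR_LEVEL_SHIFT(1))
                        & MODEL_MPIDR_LEVEL_MASK;

    /* The forward mapping only ever uses 4 bits of Aff0. */
    assert(aff0 < 16);
    return (aff1 << 4) | aff0;
}
```

For example, vCPU 35 (0x23) maps to Aff1 = 2 and Aff0 = 3, i.e. an affinity value of 0x203, and the inverse recovers 35 from it.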


Re: [PATCH linux v3 3/9] xen: introduce xen_vcpu_id mapping

2016-09-07 Thread Julien Grall

Hi Vitaly,

On 07/09/2016 12:23, Vitaly Kuznetsov wrote:

BTW, were you able to try the patch I suggested? In my opinion it would
be preferable to fix the immediate SMP issue now and play with MPIDR
info later.


Not yet, sorry. I will see if I can try it today or tomorrow.

Cheers,

--
Julien Grall


Re: [PATCH linux v3 3/9] xen: introduce xen_vcpu_id mapping

2016-09-02 Thread Julien Grall

Hi Vitaly,

On 26/07/16 13:30, Vitaly Kuznetsov wrote:

It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting
on the vCPU which crashed. This doesn't work very well for PVHVM guests as
we have a number of hypercalls where we pass vCPU id as a parameter. These
hypercalls either fail or do something unexpected. To solve the issue
introduce percpu xen_vcpu_id mapping. ARM and PV guests get direct mapping
for now. Boot CPU for PVHVM guest gets its id from CPUID. With secondary
CPUs it is a bit trickier. Currently, we initialize IPI vectors
before these CPUs boot so we can't use CPUID. Use ACPI ids from MADT
instead.

Signed-off-by: Vitaly Kuznetsov 
---
Changes since v2:
- Use uint32_t for xen_vcpu_id mapping [Julien Grall]

Changes since v1:
- Introduce xen_vcpu_nr() helper [David Vrabel]
- Use ACPI ids instead of vLAPIC ids /2 [Andrew Cooper, Jan Beulich]
---
 arch/arm/xen/enlighten.c | 10 ++
 arch/x86/xen/enlighten.c | 23 ++-
 include/xen/xen-ops.h|  6 ++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 75cd734..fe32267 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -46,6 +46,10 @@ struct shared_info *HYPERVISOR_shared_info = (void *)&xen_dummy_shared_info;
 DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
 static struct vcpu_info __percpu *xen_vcpu_info;

+/* Linux <-> Xen vCPU id mapping */
+DEFINE_PER_CPU(uint32_t, xen_vcpu_id) = U32_MAX;
+EXPORT_PER_CPU_SYMBOL(xen_vcpu_id);
+
 /* These are unused until we support booting "pre-ballooned" */
 unsigned long xen_released_pages;
 struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
@@ -179,6 +183,9 @@ static void xen_percpu_init(void)
pr_info("Xen: initializing cpu%d\n", cpu);
vcpup = per_cpu_ptr(xen_vcpu_info, cpu);

+   /* Direct vCPU id mapping for ARM guests. */
+   per_cpu(xen_vcpu_id, cpu) = cpu;
+


We did some internal testing on ARM64 with the latest Linux kernel 
(4.8-rc4) and noticed that this patch is breaking SMP support. Sorry for 
noticing the issue so late.


This function is called on the CPU being brought up, whereas some code (e.g. 
init_control_block in drivers/xen/events/events_fifo.c) is executed 
on the boot CPU while preparing that CPU.


So xen_vcpu_nr(cpu) will always return 0 in this case and 
init_control_block will fail to execute.


I am not sure how to fix it. I guess we could set up per_cpu(xen_vcpu_id, *) 
in xen_guest_init. Any opinions?


[...]


diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 0f87db2..c833912 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1795,6 +1806,12 @@ static void __init init_hvm_pv_info(void)

xen_setup_features();

+   cpuid(base + 4, &eax, &ebx, &ecx, &edx);
+   if (eax & XEN_HVM_CPUID_VCPU_ID_PRESENT)
+   this_cpu_write(xen_vcpu_id, ebx);
+   else
+   this_cpu_write(xen_vcpu_id, smp_processor_id());
+
pv_info.name = "Xen HVM";

xen_domain_type = XEN_HVM_DOMAIN;
@@ -1806,6 +1823,10 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
int cpu = (long)hcpu;
switch (action) {
case CPU_UP_PREPARE:
+   if (cpu_acpi_id(cpu) != U32_MAX)
+   per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu);
+   else
+   per_cpu(xen_vcpu_id, cpu) = cpu;


I have not tried it myself, but looking at the code, the notifiers 
xen_hvm_cpu_notifier and evtchn_fifo_cpu_notifier have the same 
priority. So what prevents the code above from being executed after the 
event channel callback?



xen_vcpu_setup(cpu);
if (xen_have_vector_callback) {
if (xen_feature(XENFEAT_hvm_safe_pvclock))
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 86abe07..648ce814 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -9,6 +9,12 @@

 DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);

+DECLARE_PER_CPU(uint32_t, xen_vcpu_id);
+static inline int xen_vcpu_nr(int cpu)
+{
+   return per_cpu(xen_vcpu_id, cpu);
+}
+
 void xen_arch_pre_suspend(void);
 void xen_arch_post_suspend(int suspend_cancelled);


Regards,

--
Julien Grall
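
The ordering problem reported for ARM can be reduced to a few lines of user-space C. This is a sketch under stated assumptions: a plain array stands in for the per-CPU variable, the slots are assumed to read as 0 until written (matching the "will always return 0" observation above), and all `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Stand-in for the per-CPU xen_vcpu_id; zero until initialised. */
static uint32_t model_xen_vcpu_id[MODEL_NR_CPUS];

/* Stand-in for xen_vcpu_nr(). */
static uint32_t model_xen_vcpu_nr(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return model_xen_vcpu_id[cpu];
}

/* Runs on the CPU being started, like xen_percpu_init() in the patch. */
static void model_percpu_init(int cpu)
{
    model_xen_vcpu_id[cpu] = (uint32_t)cpu;
}

/*
 * Runs on the boot CPU while preparing 'cpu', like init_control_block().
 * Returns the vCPU id that would be handed to the hypervisor.
 */
static uint32_t model_prepare_cpu(int cpu)
{
    return model_xen_vcpu_nr(cpu);
}

/* The idea floated above: fill in every slot early, from one place. */
static void model_guest_init_all(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_xen_vcpu_id[cpu] = (uint32_t)cpu;
}
```

Calling model_prepare_cpu(2) before model_percpu_init(2) yields 0 rather than 2, which is the wrong vCPU id; once model_guest_init_all() has run, the prepare step sees the right value regardless of which CPU has started.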


[PATCH] xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()

2017-08-17 Thread Julien Grall
When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following
splat appears:

[0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[0.019717] ASID allocator initialised with 65536 entries
[0.020019] xen:grant_table: Grant tables using version 1 layout
[0.020051] Grant table initialized
[0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046
[0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[0.020123] no locks held by swapper/0/1.
[0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598
[0.020166] Hardware name: FVP Base (DT)
[0.020182] Call trace:
[0.020199] [] dump_backtrace+0x0/0x270
[0.020222] [] show_stack+0x24/0x30
[0.020244] [] dump_stack+0xb8/0xf0
[0.020267] [] ___might_sleep+0x1c8/0x1f8
[0.020291] [] __might_sleep+0x58/0x90
[0.020313] [] __alloc_pages_nodemask+0x1c0/0x12e8
[0.020338] [] alloc_page_interleave+0x38/0x88
[0.020363] [] alloc_pages_current+0xdc/0xf0
[0.020387] [] __get_free_pages+0x28/0x50
[0.020411] [] evtchn_fifo_alloc_control_block+0x2c/0xa0
[0.020437] [] xen_evtchn_fifo_init+0x38/0xb4
[0.020461] [] xen_init_IRQ+0x44/0xc8
[0.020484] [] xen_guest_init+0x250/0x300
[0.020507] [] do_one_initcall+0x44/0x130
[0.020531] [] kernel_init_freeable+0x120/0x288
[0.020556] [] kernel_init+0x18/0x110
[0.020578] [] ret_from_fork+0x10/0x40
[0.020606] xen:events: Using FIFO-based ABI
[0.020658] Xen: initializing cpu0
[0.027727] Hierarchical SRCU implementation.
[0.036235] EFI services will not be available.
[0.043810] smp: Bringing up secondary CPUs ...

This is because get_cpu() in xen_evtchn_fifo_init() will disable
preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).

xen_evtchn_fifo_init() will always be called before SMP is initialized,
so {get,put}_cpu() could be replaced by a simple smp_processor_id().

This also avoids modifying evtchn_fifo_alloc_control_block, which is
called in other contexts as well.

Signed-off-by: Julien Grall 
Reported-by: Andre Przywara 
Fixes: 1fe565517b57 ("xen/events: use the FIFO-based ABI if available")
---
 drivers/xen/events/events_fifo.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 3c41470c7fc4..76b318e88382 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -432,12 +432,12 @@ static int xen_evtchn_cpu_dead(unsigned int cpu)
 
 int __init xen_evtchn_fifo_init(void)
 {
-   int cpu = get_cpu();
+   int cpu = smp_processor_id();
int ret;
 
ret = evtchn_fifo_alloc_control_block(cpu);
if (ret < 0)
-   goto out;
+   return ret;
 
pr_info("Using FIFO-based ABI\n");
 
@@ -446,7 +446,6 @@ int __init xen_evtchn_fifo_init(void)
cpuhp_setup_state_nocalls(CPUHP_XEN_EVTCHN_PREPARE,
  "xen/evtchn:prepare",
  xen_evtchn_cpu_prepare, xen_evtchn_cpu_dead);
-out:
-   put_cpu();
+
return ret;
 }
-- 
2.11.0
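
Why the splat appears, and why the patch silences it, can be modelled with a counter. The sketch below is user-space C with hypothetical `model_*` names; it assumes only what the commit message states, namely that get_cpu() disables preemption and that the allocator's debug check fires when called with preemption disabled.

```c
#include <assert.h>

static int model_preempt_count; /* non-zero means "atomic context" */
static int model_splats;        /* number of times the debug check fired */

/* get_cpu() disables preemption; put_cpu() re-enables it. */
static int model_get_cpu(void)
{
    model_preempt_count++;
    return 0;
}

static void model_put_cpu(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

/* smp_processor_id() leaves the preemption state untouched. */
static int model_smp_processor_id(void)
{
    return 0;
}

/* Stand-in for the might_sleep() debug check in the allocator. */
static void model_might_sleep(void)
{
    if (model_preempt_count != 0)
        model_splats++;
}

/* Stand-in for evtchn_fifo_alloc_control_block(): may sleep. */
static int model_alloc_control_block(int cpu)
{
    (void)cpu;
    model_might_sleep();
    return 0;
}

/* Shape of xen_evtchn_fifo_init() before the patch. */
static int model_init_before(void)
{
    int cpu = model_get_cpu();
    int ret = model_alloc_control_block(cpu);

    model_put_cpu();
    return ret;
}

/* Shape of xen_evtchn_fifo_init() after the patch. */
static int model_init_after(void)
{
    int cpu = model_smp_processor_id();

    return model_alloc_control_block(cpu);
}
```

In this model the "before" variant records one splat and the "after" variant records none, while both leave the preemption count balanced.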



Re: [PATCH] xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()

2017-08-18 Thread Julien Grall

Hi Boris,

On 17/08/17 18:36, Boris Ostrovsky wrote:

On 08/17/2017 12:14 PM, Julien Grall wrote:

When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following
splat appears:

[0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[0.019717] ASID allocator initialised with 65536 entries
[0.020019] xen:grant_table: Grant tables using version 1 layout
[0.020051] Grant table initialized
[0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046
[0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[0.020123] no locks held by swapper/0/1.
[0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598
[0.020166] Hardware name: FVP Base (DT)
[0.020182] Call trace:
[0.020199] [] dump_backtrace+0x0/0x270
[0.020222] [] show_stack+0x24/0x30
[0.020244] [] dump_stack+0xb8/0xf0
[0.020267] [] ___might_sleep+0x1c8/0x1f8
[0.020291] [] __might_sleep+0x58/0x90
[0.020313] [] __alloc_pages_nodemask+0x1c0/0x12e8
[0.020338] [] alloc_page_interleave+0x38/0x88
[0.020363] [] alloc_pages_current+0xdc/0xf0
[0.020387] [] __get_free_pages+0x28/0x50
[0.020411] [] evtchn_fifo_alloc_control_block+0x2c/0xa0
[0.020437] [] xen_evtchn_fifo_init+0x38/0xb4
[0.020461] [] xen_init_IRQ+0x44/0xc8
[0.020484] [] xen_guest_init+0x250/0x300
[0.020507] [] do_one_initcall+0x44/0x130
[0.020531] [] kernel_init_freeable+0x120/0x288
[0.020556] [] kernel_init+0x18/0x110
[0.020578] [] ret_from_fork+0x10/0x40
[0.020606] xen:events: Using FIFO-based ABI
[0.020658] Xen: initializing cpu0
[0.027727] Hierarchical SRCU implementation.
[0.036235] EFI services will not be available.
[0.043810] smp: Bringing up secondary CPUs ...

This is because get_cpu() in xen_evtchn_fifo_init() will disable
preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).

xen_evtchn_fifo_init() will always be called before SMP is initialized,
so {get,put}_cpu() could be replaced by a simple smp_processor_id().


On x86 this will be called out of init_IRQ(), which is already preceded
by preempt_disable().


Well, the main problem is preempt_disable() itself. in_atomic() will 
check preempt_count and return 1 if it is non-zero.


__get_free_page might sleep if GFP_ATOMIC is not set and therefore you 
will see the splat when CONFIG_DEBUG_ATOMIC is enabled. However, those 
checks don't happen before the scheduler is set up. That is why you don't 
see the error on x86.


Cheers,



Reviewed-by: Boris Ostrovsky 



--
Julien Grall
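
The difference between the x86 and ARM call sites described in this reply comes down to two inputs: whether preemption is disabled and whether the scheduler is already up. Here is a minimal sketch in user-space C; the enum, struct and function names are hypothetical, and the early-boot exemption is modelled only as far as the reply describes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-state view of boot progress. */
enum model_state { MODEL_EARLY_BOOT, MODEL_SCHEDULER_UP };

struct model_ctx {
    enum model_state state; /* how far boot has progressed */
    int preempt_count;      /* non-zero after preempt_disable()/get_cpu() */
};

/* in_atomic() as described above: true when preempt_count is non-zero. */
static bool model_in_atomic(const struct model_ctx *ctx)
{
    return ctx->preempt_count != 0;
}

/*
 * Would a sleeping allocation trigger the debug splat? Assumed rule:
 * no check before the scheduler is set up, otherwise the check fires
 * whenever the caller is atomic.
 */
static bool model_would_splat(const struct model_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state == MODEL_EARLY_BOOT)
        return false;
    return model_in_atomic(ctx);
}
```

An x86-like caller (early boot, preemption disabled) stays quiet, an ARM-like caller (initcall time, preemption disabled by get_cpu()) splats, and the patched ARM caller (initcall time, preemption enabled) stays quiet.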


Re: [Xen-devel] [PATCH 2/2] block/xen-blkfront: Handle non-indirect grant with 64KB pages

2015-10-05 Thread Julien Grall
On 05/10/15 17:01, Roger Pau Monné wrote:
> El 11/09/15 a les 21.32, Julien Grall ha escrit:
>> The minimal size of request in the block framework is always PAGE_SIZE.
>> It means that when 64KB guest is support, the request will at least be
>> 64KB.
>>
>> Although, if the backend doesn't support indirect grant (such as QDISK
>> in QEMU), a ring request is only able to accomodate 11 segments of 4KB
>> (i.e 44KB).
>>
>> The current frontend is assuming that an I/O request will always fit in
>> a ring request. This is not true any more when using 64KB page
>> granularity and will therefore crash during the boot.
>^ during boot.
>>
>> On ARM64, the ABI is completely neutral to the page granularity used by
>> the domU. The guest has the choice between different page granularity
>> supported by the processors (for instance on ARM64: 4KB, 16KB, 64KB).
>> This can't be enforced by the hypervisor and therefore it's possible to
>> run guests using different page granularity.
>>
>> So we can't mandate the block backend to support non-indirect grant
>> when the frontend is using 64KB page granularity and have to fix it
>> properly in the frontend.
>>
>> The solution exposed below is based on modifying directly the frontend
>> guest rather than asking the block framework to support smaller size
>> (i.e < PAGE_SIZE). This is because the change is the block framework are
>> not trivial as everything seems to relying on a struct *page (see [1]).
>> Although, it may be possible that someone succeed to do it in the future
>> and we would therefore be able to use advantage.
>^ it. (no advantage IMHO)
>>
>> Given that a block request may not fit in a single ring request, a
>> second request is introduced for the data that cannot fit in the first
>> one. This means that the second request should never be used on Linux
>> configuration using a page granularity < 44KB.
>   ^ if the page size is smaller than 44KB.
>>
>> Note that the parameters blk_queue_max_* helpers haven't been updated.
>> The block code will set mimimum size supported and we may be able  to
>   ^ the minimumextra space ^
>> support directly any change in the block framework that lower down the
>> mimimal size of a request.
>   ^ minimal
> 
> I have a concern regarding the splitting done in this patch.
> 
> What happens with FUA requests when split? For example the frontend
> receives a FUA requests with 64KB of data, and this is split into two
> different requests on the ring, is this going to cause trouble in the
> higher layers if for example the first request is completed but the
> second is not? Could we leave the disk in a bad state as a consequence
> of this?

If a block request is split into two ring requests, we will wait for both
responses before reporting the completion to the higher layers (see
blkif_interrupt and blkif_completion).

Furthermore, the second ring request will always use the same operation
as the first one. Note that this means flushing twice, which is not nice
but could be improved.

> 
>> [1] http://lists.xen.org/archives/html/xen-devel/2015-08/msg02200.html
>>
>> Signed-off-by: Julien Grall 
>>
>> ---
>> Cc: Konrad Rzeszutek Wilk 
>> Cc: "Roger Pau Monné" 
>> Cc: Boris Ostrovsky 
>> Cc: David Vrabel 
>> ---
>>  drivers/block/xen-blkfront.c | 199 +++
>>  1 file changed, 183 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>> index f9d55c3..03772c9 100644
>> --- a/drivers/block/xen-blkfront.c
>> +++ b/drivers/block/xen-blkfront.c
>> @@ -60,6 +60,20 @@
>>  
>>  #include 
>>  
>> +/*
>> + * The block framework is always working on segment of PAGE_SIZE minimum.
> 
> The above sentence needs to be reworded.

What about:

"The minimal size of the segment supported by the block framework is
PAGE_SIZE."

> 
>> + * When Linux is using a different page size than xen, it may not be possible
>> + * to put all the data in a single segment.
>> + * This can happen when the backend doesn't support indirect grant and
>  indirect requests ^
>> + * therefore the maximum amount of data that a request can carry is
>> + * BLKIF_MAX_SEGMENTS_PER_REQUEST * XEN_PAGE_SIZE = 44KB
>> + *
>> + * Note that we only support one extra request. So the Linux page size
>> + * shou

Re: [Xen-devel] [PATCH 2/2] block/xen-blkfront: Handle non-indirect grant with 64KB pages

2015-10-06 Thread Julien Grall

Hi Roger,

On 06/10/2015 10:39, Roger Pau Monné wrote:

El 05/10/15 a les 19.05, Julien Grall ha escrit:

On 05/10/15 17:01, Roger Pau Monné wrote:

El 11/09/15 a les 21.32, Julien Grall ha escrit:

ring_req->u.rw.nr_segments = num_grant;
+   if (unlikely(require_extra_req)) {
+   id2 = blkif_ring_get_request(info, req, &ring_req2);


How can you guarantee that there's always going to be another free
request? AFAICT blkif_queue_rq checks for RING_FULL, but you don't
actually know if there's only one slot or more than one available.


Because the depth of the queue is divided by 2 when the extra request is
used (see xlvbd_init_blk_queue).


I just noticed that I didn't mention this restriction in the commit 
message. I will do it in the next revision.



I see, that's quite restrictive but I guess it's better than introducing
a new ring macro in order to figure out if there are at least two free
slots.


I actually didn't think about your suggestion. I chose to divide by two 
based on the assumption that the block framework will always try to send 
a request with the maximum data possible.


I don't know if this assumption is correct as I'm not fully aware of how 
the block framework works.


If it's valid, in the case of a 64KB guest, the maximum size of a request 
would be 64KB when indirect segments are not supported. So we would end up 
with a lot of 64KB requests, each of which will require 2 ring requests.


Regards,

--
Julien Grall
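
The arithmetic behind the extra request and the halved queue depth discussed in this thread can be checked with a small user-space sketch. It assumes the figures given in the patch description (4KB grants, 11 segments per ring request); the `MODEL_*` constants and `model_*` functions are hypothetical names, and the ring size is left as a parameter rather than taken from the driver.

```c
#include <assert.h>

#define MODEL_XEN_PAGE_SIZE    4096u /* grant granularity */
#define MODEL_MAX_SEGS_PER_REQ 11u   /* segments in one ring request */

/* Number of 4KB grants needed to cover 'bytes' of data. */
static unsigned int model_grants_for(unsigned int bytes)
{
    return (bytes + MODEL_XEN_PAGE_SIZE - 1) / MODEL_XEN_PAGE_SIZE;
}

/* Number of ring requests needed without indirect descriptors. */
static unsigned int model_ring_reqs_for(unsigned int bytes)
{
    unsigned int grants = model_grants_for(bytes);
    unsigned int reqs = (grants + MODEL_MAX_SEGS_PER_REQ - 1)
                        / MODEL_MAX_SEGS_PER_REQ;

    /* The patch supports at most one extra ring request. */
    assert(reqs <= 2);
    return reqs;
}

/* Queue depth once every block request may need that many ring slots. */
static unsigned int model_queue_depth(unsigned int ring_size,
                                      unsigned int linux_page_size)
{
    assert(linux_page_size > 0);
    return ring_size / model_ring_reqs_for(linux_page_size);
}
```

A 64KB page needs 16 grants, which do not fit in 11 segments, so two ring requests are required and a 32-entry ring can only carry 16 block requests at once; with 4KB pages the depth is unchanged.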


[PATCH] xen/balloon: Use the correct sizeof when declaring frame_list

2015-10-07 Thread Julien Grall
The type of the item in frame_list is xen_pfn_t which is not an unsigned
long on ARM but a uint64_t.

With the current computation, the size of frame_list will be 2 *
PAGE_SIZE rather than PAGE_SIZE.

I bet it's just a mistake made when the type was switched from "unsigned
long" to "xen_pfn_t" in commit 965c0aaafe3e75d4e65cd4ec862915869bde3abd
"xen: balloon: use correct type for frame_list".

Signed-off-by: Julien Grall 



Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/xen/balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index b50d229..12eab50 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -141,7 +141,7 @@ struct balloon_stats balloon_stats;
 EXPORT_SYMBOL_GPL(balloon_stats);
 
 /* We increase/decrease in batches which fit in a page */
-static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];
+static xen_pfn_t frame_list[PAGE_SIZE / sizeof(xen_pfn_t)];
 
 
 /* List of ballooned pages, threaded through the mem_map array. */
-- 
2.1.4
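
The "2 * PAGE_SIZE" figure follows directly from the element sizes. The sketch below reproduces it in user-space C under two assumptions: a 4KB page, and a 32-bit `unsigned long` as on 32-bit ARM, modelled here with `uint32_t` so the result does not depend on the host. All `MODEL_*` and `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

typedef uint32_t model_ulong32_t; /* unsigned long on 32-bit ARM */
typedef uint64_t model_xen_pfn_t; /* xen_pfn_t on ARM */

/* Array type as declared before the patch: length from the wrong sizeof. */
typedef model_xen_pfn_t
    model_frame_list_old_t[MODEL_PAGE_SIZE / sizeof(model_ulong32_t)];

/* Array type as declared after the patch. */
typedef model_xen_pfn_t
    model_frame_list_new_t[MODEL_PAGE_SIZE / sizeof(model_xen_pfn_t)];

static size_t model_old_bytes(void)
{
    assert(sizeof(model_xen_pfn_t) == 8);
    return sizeof(model_frame_list_old_t);
}

static size_t model_new_bytes(void)
{
    return sizeof(model_frame_list_new_t);
}
```

The old declaration yields 1024 entries of 8 bytes (8192 bytes, two pages), while the new one yields 512 entries that fit exactly in one page.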



Re: [Xen-devel] [PATCH v3 04/20] xen/grant: Introduce helpers to split a page into grant

2015-08-28 Thread Julien Grall
Hi David,

On 20/08/15 10:51, David Vrabel wrote:
> On 07/08/15 17:46, Julien Grall wrote:
>> Currently, a grant is always based on the Xen page granularity (i.e
>> 4KB). When Linux is using a different page granularity, a single page
>> will be split between multiple grants.
>>
>> The new helpers will be in charge to split the Linux page into grants and
>> call a function given by the caller on each grant.
>>
>> Also provide an helper to count the number of grants within a given
>> contiguous region.
>>
>> Note that the x86/include/asm/xen/page.h is now including
>> xen/interface/grant_table.h rather than xen/grant_table.h. It's
>> necessary because xen/grant_table.h depends on asm/xen/page.h and will
>> break the compilation. Furthermore, only definition in
>> interface/grant_table.h was required.
> 
> Reviewed-by: David Vrabel 
> But...
> 
>> +/* Helper to get to call fn only on the first "grant chunk" */
>> +static inline void gnttab_one_grant(struct page *page, unsigned int offset,
>> +unsigned len, xen_grant_fn_t fn,
>> +    void *data)
> 
> ...call this gnttab_for_one_grant().

Will rename it in the next version.

Regards,

-- 
Julien Grall
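
The two helpers described in the commit message (split a region into grants and call a function on each, and count the grants in a region) can be sketched as follows. This is a user-space illustration assuming 4KB grants; the callback signature and every `model_*` name are hypothetical and do not reproduce the helpers from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_XEN_PAGE_SIZE 4096u

/* Hypothetical callback type, invoked once per 4KB grant chunk. */
typedef void (*model_grant_fn_t)(unsigned int grant_idx, unsigned int offset,
                                 unsigned int len, void *data);

/* Number of 4KB grants touched by the region [offset, offset + len). */
static unsigned int model_count_grants(unsigned int offset, unsigned int len)
{
    unsigned int start = offset % MODEL_XEN_PAGE_SIZE;

    return (start + len + MODEL_XEN_PAGE_SIZE - 1) / MODEL_XEN_PAGE_SIZE;
}

/* Walk the region and call fn on each grant-sized piece of it. */
static void model_foreach_grant(unsigned int offset, unsigned int len,
                                model_grant_fn_t fn, void *data)
{
    assert(fn != NULL);
    while (len > 0) {
        unsigned int idx = offset / MODEL_XEN_PAGE_SIZE;
        unsigned int goff = offset % MODEL_XEN_PAGE_SIZE;
        unsigned int glen = MODEL_XEN_PAGE_SIZE - goff;

        if (glen > len)
            glen = len;
        fn(idx, goff, glen, data);
        offset += glen;
        len -= glen;
    }
}

/* Example callback: tally how many chunks and bytes were visited. */
struct model_tally {
    unsigned int chunks;
    unsigned int bytes;
};

static void model_tally_fn(unsigned int grant_idx, unsigned int offset,
                           unsigned int len, void *data)
{
    struct model_tally *t = data;

    (void)grant_idx;
    (void)offset;
    t->chunks++;
    t->bytes += len;
}
```

A full 64KB Linux page is visited as 16 grants, and a 200-byte region starting at offset 4000 straddles a grant boundary and is therefore visited as two chunks.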


Re: [Xen-devel] [PATCH v3 11/20] tty/hvc: xen: Use xen page definition

2015-08-28 Thread Julien Grall
Hi David,

On 20/08/15 10:55, David Vrabel wrote:
> On 07/08/15 17:46, Julien Grall wrote:
>> The console ring is always based on the page granularity of Xen.
> [...]
>> --- a/drivers/tty/hvc/hvc_xen.c
>> +++ b/drivers/tty/hvc/hvc_xen.c
>> @@ -230,7 +230,7 @@ static int xen_hvm_console_init(void)
>>  if (r < 0 || v == 0)
>>  goto err;
>>  gfn = v;
>> -info->intf = xen_remap(gfn << PAGE_SHIFT, PAGE_SIZE);
>> +info->intf = xen_remap(gfn << XEN_PAGE_SHIFT, PAGE_SIZE);
> 
> You need XEN_PAGE_SIZE here I think...

Right, I made the mistake while rebasing onto my s/mfn/gfn/ series. I will
fix it in the next version.

Regards,

-- 
Julien Grall
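
Why both the shift and the size need the Xen definition shows up as soon as Linux uses 64KB pages. A minimal user-space sketch, assuming a 4KB Xen page and a 64KB Linux page; the `MODEL_*` constants and `model_gfn_to_phys()` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_XEN_PAGE_SHIFT   12u /* Xen frames are 4KB */
#define MODEL_XEN_PAGE_SIZE    (1u << MODEL_XEN_PAGE_SHIFT)
#define MODEL_LINUX_PAGE_SHIFT 16u /* Linux built with 64KB pages */
#define MODEL_LINUX_PAGE_SIZE  (1u << MODEL_LINUX_PAGE_SHIFT)

/* Convert a frame number to a physical address for a given page shift. */
static uint64_t model_gfn_to_phys(uint64_t gfn, unsigned int shift)
{
    assert(shift < 64 && gfn <= (UINT64_MAX >> shift));
    return gfn << shift;
}
```

Since the gfn counts 4KB frames, shifting it by the Linux page shift lands on an address 16 times too high, and mapping a Linux page instead of a Xen page would cover 64KB where the console ring only occupies 4KB.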


Re: [Xen-devel] [PATCH v3 12/20] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux

2015-08-28 Thread Julien Grall
On 20/08/15 10:59, David Vrabel wrote:
> On 07/08/15 17:46, Julien Grall wrote:
>> For ARM64 guests, Linux is able to support either 64K or 4K page
>> granularity. Although, the hypercall interface is always based on 4K
>> page granularity.
>>
>> With 64K page granularity, a single page will be spread over multiple
>> Xen frame.
>>
>> To avoid splitting the page into 4K frame, take advantage of the
>> extent_order field to directly allocate/free chunk of the Linux page
>> size.
>>
>> Note that PVMMU is only used for PV guest (which is x86) and the page
>> granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
>> that because the code has not been modified.
> [...]
>>  #ifdef CONFIG_XEN_HAVE_PVMMU
>> +/* We don't support PV MMU when Linux and Xen is using
>> + * different page granularity.
>> + */
>> +BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
> 
> You don't need this BUILD_BUG_ON() twice.

I put the BUILD_BUG_ON twice so that, if we ever decide to drop one of the
#ifdef CONFIG_XEN_HAVE_PVMMU blocks, the check will still be present.

So I'd like to keep it.

> Otherwise,
> 
> Reviewed-by: David Vrabel 

Based on the discussion with Stefano, I reworked the balloon code a bit to
re-use the lru field (see [1]), so I won't retain your Reviewed-by in the
next series.

Regards,

[1] http://lists.xen.org/archives/html/xen-devel/2015-08/msg00781.html

-- 
Julien Grall


Re: [PATCH v3 15/20] block/xen-blkfront: Make it running on 64KB page granularity

2015-08-28 Thread Julien Grall
On 20/08/15 09:10, Roger Pau Monné wrote:
> Hello,

Hi,

> I have some comments regarding the commit message, IMHO it would be good
> that a native English speaker reviews it too.
> 
> El 07/08/15 a les 18.46, Julien Grall ha escrit:
>> The PV block protocol is using 4KB page granularity. The goal of this
>> patch is to allow a Linux using 64KB page granularity using block
>> device on a non-modified Xen.
>>
>> The block API is using segment which should at least be the size of a
>> Linux page. Therefore, the driver will have to break the page in chunk
>> of 4K before giving the page to the backend.
>>
>> Breaking a 64KB segment in 4KB chunk will result to have some chunk with
>> no data.
> 
> I would rewrite the above line as:
> 
> When breaking a 64KB segment into 4KB chunks it is possible that some
> chunks are empty.

Sounds good, I will replace with it.

>> As the PV protocol always require to have data in the chunk, we
>> have to count the number of Xen page which will be in use and avoid to
>   ^pages
>> sent empty chunk.
>   ^and avoid sending empty chunks.
>>
>> Note that, a pre-defined number of grant is reserved before preparing
>  ^grants are
>> the request. This pre-defined number is based on the number and the
>> maximum size of the segments. If each segment contain a very small
> ^contains
>> amount of data, the driver may reserve too much grant (16 grant is
>  ^many grants   ^grants are
>> reserved per segment with 64KB page granularity).
>>
>> Futhermore, in the case of persistent grant we allocate one Linux page
> ^grants
>> per grant although only the 4KB of the page will be effectively use.
>      ^first  ^in
>> This could be improved by share the page with multiple grants.
> ^sharing
>>
>> Signed-off-by: Julien Grall 
> 
> LGTM:
> 
> Acked-by: Roger Pau Monné 

Thank you, I will fix all the typos in the next version.

> Just one question.

[..]

>> +gnttab_foreach_grant_in_range(sg_page(sg),
>> +  sg->offset,
>> +  sg->length,
>> +  blkif_setup_rw_req_grant,
>> +  &setup);
> 
> If I'm understanding this right, on x86 gnttab_foreach_grant_in_range is
> only going to perform one iteration, since XEN_PAGE_SIZE == PAGE_SIZE.

Correct, it will only perform one iteration for x86, and also for arm32
and arm64 (when 4KB pages are in use).
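To illustrate why a single iteration is enough when the Linux and Xen page sizes match, here is a minimal userspace sketch of splitting a byte range into Xen-page-sized chunks. It is not the kernel's gnttab_foreach_grant_in_range; the names sketch_foreach_chunk_in_range and sketch_grant_fn are hypothetical, and the 4KB Xen page size is hard-coded as an assumption.

```c
#include <stddef.h>

/* Assumption for this sketch: Xen pages (and so grants) are 4KB. */
#define SKETCH_XEN_PAGE_SIZE 4096u

/* Hypothetical callback type: receives one grant-sized chunk. */
typedef void (*sketch_grant_fn)(unsigned int chunk_index,
                                unsigned int offset_in_chunk,
                                unsigned int len, void *data);

/*
 * Walk [offset, offset + len) inside one Linux page and call fn (if
 * non-NULL) once per Xen-page-sized chunk that the range touches.
 * Returns the number of chunks visited.
 */
static unsigned int sketch_foreach_chunk_in_range(unsigned int offset,
                                                  unsigned int len,
                                                  sketch_grant_fn fn,
                                                  void *data)
{
    unsigned int visited = 0;

    while (len > 0) {
        unsigned int chunk = offset / SKETCH_XEN_PAGE_SIZE;
        unsigned int off = offset % SKETCH_XEN_PAGE_SIZE;
        unsigned int glen = SKETCH_XEN_PAGE_SIZE - off;

        /* Do not run past the end of the requested range. */
        if (glen > len)
            glen = len;
        if (fn)
            fn(chunk, off, glen, data);

        offset += glen;
        len -= glen;
        visited++;
    }

    return visited;
}
```

With a 4KB Linux page, a range confined to that page can touch at most one chunk, so the callback runs once. With a 64KB Linux page, a full-page range touches 16 chunks.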

Regards,

-- 
Julien Grall


Re: [PATCH v3 19/20] xen/privcmd: Add support for Linux 64KB page granularity

2015-09-01 Thread Julien Grall
Hi Stefano,

On 10/08/15 13:03, Stefano Stabellini wrote:
>> +xen_pfn = xen_page_to_pfn(page);
>> +}
>> +fn(pfn_to_gfn(xen_pfn++), data);
> 
> What is the purpose of incrementing xen_pfn here?

Because the Linux page is split into multiple xen_pfn, so we want to get
the next xen_pfn for the next iteration.

Regards,

-- 
Julien Grall


Re: [Xen-devel] [PATCH 2/2] block/xen-blkfront: Handle non-indirect grant with 64KB pages

2015-10-12 Thread Julien Grall
On 06/10/15 11:06, Roger Pau Monné wrote:
> El 06/10/15 a les 11.58, Julien Grall ha escrit:
>> Hi Roger,
>>
>> On 06/10/2015 10:39, Roger Pau Monné wrote:
>>> El 05/10/15 a les 19.05, Julien Grall ha escrit:
>>>> On 05/10/15 17:01, Roger Pau Monné wrote:
>>>>> El 11/09/15 a les 21.32, Julien Grall ha escrit:
>>>>>>   ring_req->u.rw.nr_segments = num_grant;
>>>>>> +if (unlikely(require_extra_req)) {
>>>>>> +id2 = blkif_ring_get_request(info, req, &ring_req2);
>>>>>
>>>>> How can you guarantee that there's always going to be another free
>>>>> request? AFAICT blkif_queue_rq checks for RING_FULL, but you don't
>>>>> actually know if there's only one slot or more than one available.
>>>>
>>>> Because the depth of the queue is divided by 2 when the extra request is
>>>> used (see xlvbd_init_blk_queue).
>>
>> I just noticed that I didn't mention this restriction in the commit
>> message. I will do it in the next revision.
>>
>>> I see, that's quite restrictive but I guess it's better than introducing
>>> a new ring macro in order to figure out if there are at least two free
>>> slots.
>>
>> I actually didn't think about your suggestion. I choose to divide by two
>> based on the assumption that the block framework will always try to send
>> a request with the maximum data possible.
> 
> AFAIK that depends on the request itself, the block layer will try to
> merge requests if possible, but you can also expect that there are going
> to be requests that will just contain a single block.
> 
>> I don't know if this assumption is correct as I'm not fully aware how
>> the block framework is working.
>>
>> If it's valid, in the case of 64KB guest, the maximum size of a request
>> would be 64KB when indirect segment is not supported. So we would end up
>> with a lot of 64KB request which will require 2 ring request.
> 
> I guess you could add a counter in order to see how many requests were
> split vs total number of requests.

So the number of 64KB requests is fairly small compared to the total
number of requests (277 out of 4687 requests) for general usage (e.g. cd, find).

The block requests only get merged once I use dd. So I
guess common usage will not provide enough data to fill a 64KB request.

However, as soon as I use dd with a block size of 64KB, most of the
requests fill 64KB and an extra request is required.
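The arithmetic behind the extra request can be sketched as follows. This is an illustration only, not blkfront code. It assumes grant-aligned data, 4KB grants, and the blkif protocol limit of 11 segments per non-indirect request; the function names are hypothetical.

```c
/* Assumption: each segment of a non-indirect request covers one 4KB grant. */
#define SKETCH_XEN_PAGE_SIZE     4096u
/* Assumption: at most 11 segments fit in one non-indirect ring request. */
#define SKETCH_MAX_SEGS_PER_REQ  11u

/* Number of ring requests needed to carry `bytes` of grant-aligned data. */
static unsigned int sketch_ring_reqs_needed(unsigned int bytes)
{
    unsigned int grants =
        (bytes + SKETCH_XEN_PAGE_SIZE - 1) / SKETCH_XEN_PAGE_SIZE;

    return (grants + SKETCH_MAX_SEGS_PER_REQ - 1) / SKETCH_MAX_SEGS_PER_REQ;
}

/*
 * Queue depth to advertise to the block layer so that every queued
 * request is guaranteed enough ring slots, given the largest request
 * (one Linux page of data) it may submit.
 */
static unsigned int sketch_queue_depth(unsigned int ring_size,
                                       unsigned int linux_page_size)
{
    return ring_size / sketch_ring_reqs_needed(linux_page_size);
}
```

A full 64KB request needs 16 grants and therefore two ring requests, which is why the queue depth ends up divided by 2 in that configuration.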

Note that I had to quickly implement xen_biovec_phys_mergeable for 64KB
pages, as I had left this aside. Without it, the biovecs won't be merged
except with dd if you specify the block size (bs=64KB).

I've also looked at the code to see whether it's possible to check that
there are 2 ring requests free and, if not, to wait until they are available.

Currently, we don't need to check whether a request is free because the queue
is sized according to the number of requests supported by the ring. This
means that the block layer handles the check and we will always have
space in the ring.

If we decide to avoid dividing the number of requests enqueued by the
block layer, we would have to check ourselves whether 2 ring requests are
free.
AFAICT, when BLK_MQ_RQ_BUSY is returned the block layer will stop the
queue. So we need to have some logic in blkfront to know when the 2 ring
requests become free and restart the queue. I guess it would be similar
to gnttab_request_free_callback.

I'd like your advice on whether this is worth implementing in
blkfront, given that it will only be used for 64KB guests with a backend
not supporting indirect grants.

Regards,

-- 
Julien Grall


[PATCH 3/3] xenbus: Support multiple grants ring with 64KB

2015-10-13 Thread Julien Grall
The PV ring may use multiple grants and expects them to be mapped
contiguously in virtual memory.

However, the current code relies on a Linux page being mapped to
a single grant. On builds where Linux uses a different page size than
the grant (i.e. other than 4KB), each grant will always be mapped on the
first 4KB of a Linux page, which makes the final ring non-contiguous in
memory.

This can be fixed by mapping multiple grants in the same Linux page.
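As a rough illustration of the problem and the fix, the sketch below computes how many Linux pages a ring of N grants needs when grants are packed, and where each grant lands in the mapping under both layouts. It mirrors the idea of the XENBUS_PAGES macro in the diff, but the function names are hypothetical and the 4KB grant size is an assumption.

```c
/* Assumption for this sketch: a grant always covers 4KB. */
#define SKETCH_XEN_PAGE_SIZE 4096u

/* Linux pages needed to hold nr_grants grants packed back to back. */
static unsigned int sketch_pages_for_grants(unsigned int nr_grants,
                                            unsigned int linux_page_size)
{
    unsigned int per_page = linux_page_size / SKETCH_XEN_PAGE_SIZE;

    return (nr_grants + per_page - 1) / per_page;
}

/* Byte offset of grant i in the mapping when grants are packed. */
static unsigned long sketch_packed_offset(unsigned int i)
{
    return (unsigned long)i * SKETCH_XEN_PAGE_SIZE;
}

/*
 * Byte offset of grant i when each grant is placed on the first 4KB of
 * its own Linux page (the behaviour described as broken above).
 */
static unsigned long sketch_sparse_offset(unsigned int i,
                                          unsigned int linux_page_size)
{
    return (unsigned long)i * linux_page_size;
}
```

With 4KB Linux pages both layouts coincide, so the ring is contiguous either way. With 64KB pages the one-grant-per-page layout leaves a 60KB hole after every grant.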

Signed-off-by: Julien Grall 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Stefano Stabellini 
---
 drivers/xen/xenbus/xenbus_client.c | 97 --
 1 file changed, 72 insertions(+), 25 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index b776433..056da6e 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -49,6 +49,10 @@
 
 #include "xenbus_probe.h"
 
+#define XENBUS_PAGES(_grants)  (DIV_ROUND_UP(_grants, XEN_PFN_PER_PAGE))
+
+#define XENBUS_MAX_RING_PAGES  (XENBUS_PAGES(XENBUS_MAX_RING_GRANTS))
+
 struct xenbus_map_node {
struct list_head next;
union {
@@ -56,7 +60,8 @@ struct xenbus_map_node {
struct vm_struct *area;
} pv;
struct {
-   struct page *pages[XENBUS_MAX_RING_GRANTS];
+   struct page *pages[XENBUS_MAX_RING_PAGES];
+   unsigned long addrs[XENBUS_MAX_RING_GRANTS];
void *addr;
} hvm;
};
@@ -591,19 +596,42 @@ failed:
return err;
 }
 
+struct map_ring_valloc_hvm
+{
+   unsigned int idx;
+
+   /* Why do we need two arrays? See comment of __xenbus_map_ring */
+   phys_addr_t phys_addrs[XENBUS_MAX_RING_GRANTS];
+   unsigned long addrs[XENBUS_MAX_RING_GRANTS];
+};
+
+static void xenbus_map_ring_setup_grant_hvm(unsigned long gfn,
+   unsigned int goffset,
+   unsigned int len,
+   void *data)
+{
+   struct map_ring_valloc_hvm *info = data;
+   unsigned long vaddr = (unsigned long)gfn_to_virt(gfn);
+
+   info->phys_addrs[info->idx] = vaddr;
+   info->addrs[info->idx] = vaddr;
+
+   info->idx++;
+}
+
 static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
  grant_ref_t *gnt_ref,
  unsigned int nr_grefs,
  void **vaddr)
 {
struct xenbus_map_node *node;
-   int i;
int err;
void *addr;
bool leaked = false;
-   /* Why do we need two arrays? See comment of __xenbus_map_ring */
-   phys_addr_t phys_addrs[XENBUS_MAX_RING_GRANTS];
-   unsigned long addrs[XENBUS_MAX_RING_GRANTS];
+   struct map_ring_valloc_hvm info = {
+   .idx = 0,
+   };
+   unsigned int nr_pages = XENBUS_PAGES(nr_grefs);
 
if (nr_grefs > XENBUS_MAX_RING_GRANTS)
return -EINVAL;
@@ -614,24 +642,22 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
if (!node)
return -ENOMEM;
 
-   err = alloc_xenballooned_pages(nr_grefs, node->hvm.pages);
+   err = alloc_xenballooned_pages(nr_pages, node->hvm.pages);
if (err)
goto out_err;
 
-   for (i = 0; i < nr_grefs; i++) {
-   unsigned long pfn = page_to_pfn(node->hvm.pages[i]);
-   phys_addrs[i] = (unsigned long)pfn_to_kaddr(pfn);
-   addrs[i] = (unsigned long)pfn_to_kaddr(pfn);
-   }
+   gnttab_foreach_grant(node->hvm.pages, nr_grefs,
+xenbus_map_ring_setup_grant_hvm,
+&info);
 
err = __xenbus_map_ring(dev, gnt_ref, nr_grefs, node->handles,
-   phys_addrs, GNTMAP_host_map, &leaked);
+   info.phys_addrs, GNTMAP_host_map, &leaked);
node->nr_handles = nr_grefs;
 
if (err)
goto out_free_ballooned_pages;
 
-   addr = vmap(node->hvm.pages, nr_grefs, VM_MAP | VM_IOREMAP,
+   addr = vmap(node->hvm.pages, nr_pages, VM_MAP | VM_IOREMAP,
PAGE_KERNEL);
if (!addr) {
err = -ENOMEM;
@@ -649,14 +675,13 @@ static int xenbus_map_ring_valloc_hvm(struct xenbus_device *dev,
 
  out_xenbus_unmap_ring:
if (!leaked)
-   xenbus_unmap_ring(dev, node->handles, node->nr_handles,
- addrs);
+   xenbus_unmap_ring(dev, node->handles, nr_grefs, info.addrs);
else
pr_alert("leaking %p size %u page(s)",
-addr, nr_grefs);
+addr, nr_pages);
  

[PATCH 0/3] xen/xenbus: Support multiple grants ring with 64KB page

2015-10-13 Thread Julien Grall
Hi all,

Support for multiple-grant rings was left aside for 64KB pages. This series
aims to fix it.

It's based on xentip/for-linus-4.4.

Sincerely yours,

Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Stefano Stabellini 

Julien Grall (3):
  xen/xenbus: Rename *RING_PAGE* to *RING_GRANT*
  xen/grant-table: Add an helper to iterate over a specific number of
grants
  xenbus: Support multiple grants ring with 64KB

 drivers/block/xen-blkback/blkback.c |   8 +--
 drivers/block/xen-blkback/xenbus.c  |   2 +-
 drivers/block/xen-blkfront.c|  10 +--
 drivers/xen/grant-table.c   |  22 +++
 drivers/xen/xenbus/xenbus_client.c  | 121 +---
 include/xen/grant_table.h   |   6 ++
 include/xen/xenbus.h|   4 +-
 7 files changed, 124 insertions(+), 49 deletions(-)

-- 
2.1.4



[PATCH 1/3] xen/xenbus: Rename *RING_PAGE* to *RING_GRANT*

2015-10-13 Thread Julien Grall
Linux may use a different page size than the size of a grant, so make it
clear that the order is actually expressed in number of grants.

Signed-off-by: Julien Grall 

---
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Stefano Stabellini 
---
 drivers/block/xen-blkback/blkback.c |  8 
 drivers/block/xen-blkback/xenbus.c  |  2 +-
 drivers/block/xen-blkfront.c| 10 +-
 drivers/xen/xenbus/xenbus_client.c  | 34 +-
 include/xen/xenbus.h|  4 ++--
 5 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 809634c..f909994 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -87,7 +87,7 @@ MODULE_PARM_DESC(max_persistent_grants,
  * Maximum order of pages to be used for the shared ring between front and
  * backend, 4KB page granularity is used.
  */
-unsigned int xen_blkif_max_ring_order = XENBUS_MAX_RING_PAGE_ORDER;
+unsigned int xen_blkif_max_ring_order = XENBUS_MAX_RING_GRANT_ORDER;
 module_param_named(max_ring_page_order, xen_blkif_max_ring_order, int, S_IRUGO);
 MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages to be used for the shared ring");
 /*
@@ -1446,10 +1446,10 @@ static int __init xen_blkif_init(void)
if (!xen_domain())
return -ENODEV;
 
-   if (xen_blkif_max_ring_order > XENBUS_MAX_RING_PAGE_ORDER) {
+   if (xen_blkif_max_ring_order > XENBUS_MAX_RING_GRANT_ORDER) {
pr_info("Invalid max_ring_order (%d), will use default max: %d.\n",
-   xen_blkif_max_ring_order, XENBUS_MAX_RING_PAGE_ORDER);
-   xen_blkif_max_ring_order = XENBUS_MAX_RING_PAGE_ORDER;
+   xen_blkif_max_ring_order, XENBUS_MAX_RING_GRANT_ORDER);
+   xen_blkif_max_ring_order = XENBUS_MAX_RING_GRANT_ORDER;
}
 
rc = xen_blkif_interface_init();
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index edd27e4..cde8ccd 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -827,7 +827,7 @@ again:
 static int connect_ring(struct backend_info *be)
 {
struct xenbus_device *dev = be->dev;
-   unsigned int ring_ref[XENBUS_MAX_RING_PAGES];
+   unsigned int ring_ref[XENBUS_MAX_RING_GRANTS];
unsigned int evtchn, nr_grefs, ring_page_order;
unsigned int pers_grants;
char protocol[64] = "";
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 43cda94..3ea948c 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -111,7 +111,7 @@ MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages to be used for the
__CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * (info)->nr_ring_pages)
 
 #define BLK_MAX_RING_SIZE  \
-   __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * XENBUS_MAX_RING_PAGES)
+   __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * XENBUS_MAX_RING_GRANTS)
 
 /*
  * ring-ref%i i=(-1UL) would take 11 characters + 'ring-ref' is 8, so 19
@@ -133,7 +133,7 @@ struct blkfront_info
int vdevice;
blkif_vdev_t handle;
enum blkif_state connected;
-   int ring_ref[XENBUS_MAX_RING_PAGES];
+   int ring_ref[XENBUS_MAX_RING_GRANTS];
unsigned int nr_ring_pages;
struct blkif_front_ring ring;
unsigned int evtchn, irq;
@@ -1412,7 +1412,7 @@ static int setup_blkring(struct xenbus_device *dev,
struct blkif_sring *sring;
int err, i;
unsigned long ring_size = info->nr_ring_pages * XEN_PAGE_SIZE;
-   grant_ref_t gref[XENBUS_MAX_RING_PAGES];
+   grant_ref_t gref[XENBUS_MAX_RING_GRANTS];
 
for (i = 0; i < info->nr_ring_pages; i++)
info->ring_ref[i] = GRANT_INVALID_REF;
@@ -2283,9 +2283,9 @@ static int __init xlblk_init(void)
if (!xen_domain())
return -ENODEV;
 
-   if (xen_blkif_max_ring_order > XENBUS_MAX_RING_PAGE_ORDER) {
+   if (xen_blkif_max_ring_order > XENBUS_MAX_RING_GRANT_ORDER) {
pr_info("Invalid max_ring_order (%d), will use default max: %d.\n",
-   xen_blkif_max_ring_order, XENBUS_MAX_RING_PAGE_ORDER);
+   xen_blkif_max_ring_order, XENBUS_MAX_RING_GRANT_ORDER);
xen_blkif_max_ring_order = 0;
}
 
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 42abee3..b776433 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -56,11 +56,11 @@ struct xenbus_map_node {
struct vm_struct *area;
} pv;
struct {
-   struct page *pages[XENBUS_MAX_RING_PAGES];
+   struct

[PATCH 2/3] xen/grant-table: Add an helper to iterate over a specific number of grants

2015-10-13 Thread Julien Grall
With the 64KB page granularity support on ARM64, a Linux page may be
split across multiple grants.

Currently we have the helper gnttab_foreach_grant_in_range to break a
Linux page based on an offset and a length, but it doesn't fit when we only
have a number of grants in hand.

Introduce a new helper which takes an array of Linux pages and a number of
grants, and figures out the address of each grant.
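The iteration described here can be tried out in userspace with the sketch below. It follows the same loop structure as the helper in this patch, but works on plain Linux PFN numbers instead of struct page, leaves out the PFN-to-GFN translation, and takes the number of Xen PFNs per Linux page as a parameter. All names prefixed with sketch_ are hypothetical.

```c
/* Assumption for this sketch: a grant always covers 4KB. */
#define SKETCH_XEN_PAGE_SIZE 4096u
#define SKETCH_MAX_RECORDS   32u

typedef void (*sketch_grant_fn)(unsigned long xen_pfn, unsigned int goffset,
                                unsigned int len, void *data);

/*
 * Visit nr_grefs grants laid out over an array of Linux pages, given
 * here by their Linux PFNs. Each Linux page holds xen_pfn_per_page
 * consecutive Xen PFNs.
 */
static void sketch_foreach_grant(const unsigned long *linux_pfns,
                                 unsigned int nr_grefs,
                                 unsigned int xen_pfn_per_page,
                                 sketch_grant_fn fn, void *data)
{
    unsigned int goffset = 0;
    unsigned long xen_pfn = 0;
    unsigned int i;

    for (i = 0; i < nr_grefs; i++) {
        /* Moving to a new Linux page: restart from its first Xen PFN. */
        if ((i % xen_pfn_per_page) == 0) {
            xen_pfn = linux_pfns[i / xen_pfn_per_page] * xen_pfn_per_page;
            goffset = 0;
        }

        fn(xen_pfn, goffset, SKETCH_XEN_PAGE_SIZE, data);

        goffset += SKETCH_XEN_PAGE_SIZE;
        xen_pfn++;
    }
}

/* Example callback state: remembers what was visited. */
struct sketch_record {
    unsigned long xen_pfn[SKETCH_MAX_RECORDS];
    unsigned int goffset[SKETCH_MAX_RECORDS];
    unsigned int n;
};

static void sketch_record_grant(unsigned long xen_pfn, unsigned int goffset,
                                unsigned int len, void *data)
{
    struct sketch_record *rec = data;

    (void)len; /* always SKETCH_XEN_PAGE_SIZE in this sketch */
    if (rec->n < SKETCH_MAX_RECORDS) {
        rec->xen_pfn[rec->n] = xen_pfn;
        rec->goffset[rec->n] = goffset;
        rec->n++;
    }
}
```

For example, with two 64KB pages at Linux PFNs 10 and 20 and 18 grants, the first 16 grants come from the first page and the remaining 2 from the start of the second.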

Signed-off-by: Julien Grall 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Stefano Stabellini 
---
 drivers/xen/grant-table.c | 22 ++
 include/xen/grant_table.h |  6 ++
 2 files changed, 28 insertions(+)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 72d6339..c49f79ed 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -802,6 +802,28 @@ void gnttab_foreach_grant_in_range(struct page *page,
 }
 EXPORT_SYMBOL_GPL(gnttab_foreach_grant_in_range);
 
+void gnttab_foreach_grant(struct page **pages,
+ unsigned int nr_grefs,
+ xen_grant_fn_t fn,
+ void *data)
+{
+   unsigned int goffset = 0;
+   unsigned long xen_pfn = 0;
+   unsigned int i;
+
+   for (i = 0; i < nr_grefs; i++) {
+   if ((i % XEN_PFN_PER_PAGE) == 0) {
+   xen_pfn = page_to_xen_pfn(pages[i / XEN_PFN_PER_PAGE]);
+   goffset = 0;
+   }
+
+   fn(pfn_to_gfn(xen_pfn), goffset, XEN_PAGE_SIZE, data);
+
+   goffset += XEN_PAGE_SIZE;
+   xen_pfn++;
+   }
+}
+
 int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
struct gnttab_map_grant_ref *kmap_ops,
struct page **pages, unsigned int count)
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index e17a4b3..34b1379 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -264,6 +264,12 @@ static inline void gnttab_for_one_grant(struct page *page, unsigned int offset,
gnttab_foreach_grant_in_range(page, offset, len, fn, data);
 }
 
+/* Get @nr_grefs grants from an array of page and call fn for each grant */
+void gnttab_foreach_grant(struct page **pages,
+ unsigned int nr_grefs,
+ xen_grant_fn_t fn,
+ void *data);
+
 /* Get the number of grant in a specified region
  *
  * start: Offset from the beginning of the first page
-- 
2.1.4



Re: [Xen-devel] [PATCH] xen-blkback: free requests on disconnection

2015-09-07 Thread Julien Grall
On 07/09/15 07:07, Bob Liu wrote:
> Hi Julien,

Hi Bob,

> On 09/04/2015 09:51 PM, Julien Grall wrote:
>> Hi Roger,
>>
>> On 04/09/15 11:08, Roger Pau Monne wrote:
>>> Request allocation has been moved to connect_ring, which is called every
>>> time blkback connects to the frontend (this can happen multiple times during
>>> a blkback instance life cycle). On the other hand, request freeing has not
>>> been moved, so it's only called when destroying the backend instance. Due to
>>> this mismatch, blkback can allocate the request pool multiple times, without
>>> freeing it.
>>>
>>> In order to fix it, move the freeing of requests to xen_blkif_disconnect to
>>> restore the symmetry between request allocation and freeing.
>>>
>>> Reported-by: Julien Grall 
>>> Signed-off-by: Roger Pau Monné 
>>> Cc: Julien Grall 
>>> Cc: Konrad Rzeszutek Wilk 
>>> Cc: Boris Ostrovsky 
>>> Cc: David Vrabel 
>>> Cc: xen-de...@lists.xenproject.org
>>
>> The patch is fixing my problem when using UEFI in the guest. Thank you!
>>
> 
> Could you please explain the problem you met a bit more?
> So that I can know back port this patch if met similar issue.

This is related to commit 86839c56dee28c315a4c19b7bfee450ccd84cd25
"xen/block: add multi-page ring support" (Roger, it may be worth
indicating the offending commit in your commit message).

When starting a guest using UEFI, I get the following warning from
blkback after the domain is destroyed:


------------[ cut here ]------------
WARNING: CPU: 2 PID: 95 at
/home/julien/works/linux/drivers/block/xen-blkback/xenbus.c:274
xen_blkif_deferred_free+0x1f4/0x1f8()
Modules linked in:
CPU: 2 PID: 95 Comm: kworker/2:1 Tainted: GW   4.2.0 #85
Hardware name: APM X-Gene Mustang board (DT)
Workqueue: events xen_blkif_deferred_free
Call trace:
[] dump_backtrace+0x0/0x124
[] show_stack+0x10/0x1c
[] dump_stack+0x78/0x98
[] warn_slowpath_common+0x9c/0xd4
[] warn_slowpath_null+0x14/0x20
[] xen_blkif_deferred_free+0x1f0/0x1f8
[] process_one_work+0x160/0x3b4
[] worker_thread+0x140/0x494
[] kthread+0xd8/0xf0
---[ end trace 6f859b7883c88cdd ]---

This is because the requests are allocated during the
connection but only freed when the domain is destroyed. Therefore,
if the domain re-initializes the connection (because UEFI or PV Grub
is used), the earlier requests won't be freed and are kept until the end.
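The symmetry that the fix restores can be shown with a small userspace model. This is not blkback code: struct sketch_backend and the two functions are hypothetical stand-ins for the connect and disconnect paths, with counters added only to make the pairing visible.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a backend instance. */
struct sketch_backend {
    void *requests;      /* request pool, NULL when not allocated */
    unsigned int allocs; /* bookkeeping for this example only */
    unsigned int frees;
};

/* Called each time the frontend (re)connects: allocates the pool. */
static int sketch_connect(struct sketch_backend *be, size_t pool_bytes)
{
    be->requests = malloc(pool_bytes);
    if (!be->requests)
        return -1;
    be->allocs++;
    return 0;
}

/*
 * Called each time the frontend disconnects: frees the pool so that a
 * later reconnect does not leave the previous allocation behind.
 */
static void sketch_disconnect(struct sketch_backend *be)
{
    if (be->requests) {
        free(be->requests);
        be->requests = NULL;
        be->frees++;
    }
}
```

If the free happened only at destruction time, two connects (for instance firmware followed by the OS) would leave one pool unfreed; pairing it with disconnect keeps the counts equal.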

So I think this should be backported to Linux 4.2, where the offending
patch was introduced.

Regards,

-- 
Julien Grall


[PATCH v4 02/20] arm/xen: Drop pte_mfn and mfn_pte

2015-09-07 Thread Julien Grall
They are not used in common code except in one place in balloon.c, which is
only compiled when Linux is using PV MMU. That is not the case on ARM.

Rather than worrying about how to handle the 64KB case, drop them.

Signed-off-by: Julien Grall 
Reviewed-by: Stefano Stabellini 

---
Cc: Russell King 

Changes in v4:
- Add Stefano's reviewed

Changes in v3:
- Patch added
---
 arch/arm/include/asm/xen/page.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 1279563..98c9fc3 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -13,9 +13,6 @@
 
 #define phys_to_machine_mapping_valid(pfn) (1)
 
-#define pte_mfn pte_pfn
-#define mfn_pte pfn_pte
-
 /* Xen machine address */
 typedef struct xmaddr {
phys_addr_t maddr;
-- 
2.1.4



[PATCH v4 08/20] block/xen-blkfront: split get_grant in 2

2015-09-07 Thread Julien Grall
Prepare the code to support 64KB page granularity. The first
implementation will use a full Linux page per indirect and persistent
grant. When non-persistent grants are used, each page of a bio request
may be split into multiple grants.

Furthermore, the page field of the grant structure is only used to copy
data from persistent or indirect grants. Avoid setting it for other
use cases, as it will have no meaning given that the page will be split
into multiple grants.

Provide 2 functions: one to set up indirect grants, the other for bio pages.

Signed-off-by: Julien Grall 
Acked-by: Roger Pau Monné 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Changes in v4:
- Add Roger's acked-by

Changes in v3:
- Fix errors reported by checkpatch.pl
- gnttab_page_grant_foreign_access_ref has been renamed to
gnttab_page_grant_foreign_access_ref_one
- Fix compilation by using get_indirect_grant rather than
get_grant (the changes was in a later patch...).
- Make grant_foreign_access static inline
- s/mfn/gfn/ based on the new naming

Changes in v2:
- Patch added
---
 drivers/block/xen-blkfront.c | 88 +---
 1 file changed, 59 insertions(+), 29 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 556475d..4232cbd 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -245,34 +245,77 @@ out_of_memory:
return -ENOMEM;
 }
 
-static struct grant *get_grant(grant_ref_t *gref_head,
-  struct page *page,
-  struct blkfront_info *info)
+static struct grant *get_free_grant(struct blkfront_info *info)
 {
struct grant *gnt_list_entry;
-   unsigned long buffer_gfn;
 
BUG_ON(list_empty(&info->grants));
gnt_list_entry = list_first_entry(&info->grants, struct grant,
- node);
+ node);
list_del(&gnt_list_entry->node);
 
-   if (gnt_list_entry->gref != GRANT_INVALID_REF) {
+   if (gnt_list_entry->gref != GRANT_INVALID_REF)
info->persistent_gnts_c--;
+
+   return gnt_list_entry;
+}
+
+static inline void grant_foreign_access(const struct grant *gnt_list_entry,
+   const struct blkfront_info *info)
+{
+   gnttab_page_grant_foreign_access_ref_one(gnt_list_entry->gref,
+info->xbdev->otherend_id,
+gnt_list_entry->page,
+0);
+}
+
+static struct grant *get_grant(grant_ref_t *gref_head,
+  unsigned long gfn,
+  struct blkfront_info *info)
+{
+   struct grant *gnt_list_entry = get_free_grant(info);
+
+   if (gnt_list_entry->gref != GRANT_INVALID_REF)
return gnt_list_entry;
+
+   /* Assign a gref to this page */
+   gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head);
+   BUG_ON(gnt_list_entry->gref == -ENOSPC);
+   if (info->feature_persistent)
+   grant_foreign_access(gnt_list_entry, info);
+   else {
+   /* Grant access to the GFN passed by the caller */
+   gnttab_grant_foreign_access_ref(gnt_list_entry->gref,
+   info->xbdev->otherend_id,
+   gfn, 0);
}
 
+   return gnt_list_entry;
+}
+
+static struct grant *get_indirect_grant(grant_ref_t *gref_head,
+   struct blkfront_info *info)
+{
+   struct grant *gnt_list_entry = get_free_grant(info);
+
+   if (gnt_list_entry->gref != GRANT_INVALID_REF)
+   return gnt_list_entry;
+
/* Assign a gref to this page */
gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head);
BUG_ON(gnt_list_entry->gref == -ENOSPC);
if (!info->feature_persistent) {
-   BUG_ON(!page);
-   gnt_list_entry->page = page;
+   struct page *indirect_page;
+
+   /* Fetch a pre-allocated page to use for indirect grefs */
+   BUG_ON(list_empty(&info->indirect_pages));
+   indirect_page = list_first_entry(&info->indirect_pages,
+struct page, lru);
+   list_del(&indirect_page->lru);
+   gnt_list_entry->page = indirect_page;
}
-   buffer_gfn = xen_page_to_gfn(gnt_list_entry->page);
-   gnttab_grant_foreign_access_ref(gnt_list_entry->gref,
-   info->xbdev->otherend_id,
-   buffer_gfn, 0);

[PATCH v4 03/20] xen: Add Xen specific page definition

2015-09-07 Thread Julien Grall
The Xen hypercall interface always uses 4K page granularity on the ARM
and x86 architectures.

With the upcoming support for 64K page granularity in ARM64 guests, it
won't be possible to re-use the Linux page definitions in Xen drivers.

Introduce Xen page definition helpers based on the Linux page
definitions. They have exactly the same names but with a
XEN_/xen_ prefix.

Also modify xen_page_to_gfn to use the new Xen page definitions.
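The conversions these helpers perform come down to shift arithmetic. The sketch below reproduces that arithmetic in userspace, with the Linux page shift passed as a parameter so both the 4KB and 64KB cases can be tried. The sketch_ names are hypothetical, and only the fixed Xen page shift of 12 is taken from the patch.

```c
/* From the patch: the hypercall interface uses 4KB pages (shift of 12). */
#define SKETCH_XEN_PAGE_SHIFT 12

/* First Xen PFN covered by the Linux page with the given Linux PFN. */
static unsigned long sketch_page_to_xen_pfn(unsigned long linux_pfn,
                                            unsigned int page_shift)
{
    return (linux_pfn << page_shift) >> SKETCH_XEN_PAGE_SHIFT;
}

/* Linux PFN of the Linux page that contains the given Xen PFN. */
static unsigned long sketch_xen_pfn_to_linux_pfn(unsigned long xen_pfn,
                                                 unsigned int page_shift)
{
    return (xen_pfn << SKETCH_XEN_PAGE_SHIFT) >> page_shift;
}

/* Number of Xen pages in one Linux page. */
static unsigned long sketch_xen_pfn_per_page(unsigned int page_shift)
{
    return (1UL << page_shift) >> SKETCH_XEN_PAGE_SHIFT;
}
```

With a page shift of 16 (64KB), Linux PFN 3 starts at Xen PFN 48, and Xen PFNs 48 through 63 all map back to Linux PFN 3. With a page shift of 12 the two numbering schemes are identical.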

Signed-off-by: Julien Grall 
Reviewed-by: Stefano Stabellini 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Changes in v4:
- Typos
- Rename xen_page_to_pfn to page_to_xen_pfn

Changes in v3:
- Fix errors reported by checkpatch.pl
- Rename pfn to xen_pfn in xen_pfn_to_page
- Add a comment that we assume PAGE_SIZE to be a multiple of
XEN_PAGE_SIZE
- s/MFN/GFN/ according to new naming
- Add Stefano's reviewed-by

Changes in v2:
- Add XEN_PFN_UP
- Add a comment describing the behavior of page_to_pfn
---
 include/xen/page.h | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/include/xen/page.h b/include/xen/page.h
index 1daae48..96294ac 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -1,11 +1,36 @@
 #ifndef _XEN_PAGE_H
 #define _XEN_PAGE_H
 
+#include 
+
+/* The hypercall interface supports only 4KB page */
+#define XEN_PAGE_SHIFT 12
+#define XEN_PAGE_SIZE  (_AC(1, UL) << XEN_PAGE_SHIFT)
+#define XEN_PAGE_MASK  (~(XEN_PAGE_SIZE-1))
+#define xen_offset_in_page(p)  ((unsigned long)(p) & ~XEN_PAGE_MASK)
+
+/*
+ * We assume that PAGE_SIZE is a multiple of XEN_PAGE_SIZE
+ * XXX: Add a BUILD_BUG_ON?
+ */
+
+#define xen_pfn_to_page(xen_pfn)   \
+   ((pfn_to_page(((unsigned long)(xen_pfn) << XEN_PAGE_SHIFT) >> PAGE_SHIFT)))
+#define page_to_xen_pfn(page)  \
+   (((page_to_pfn(page)) << PAGE_SHIFT) >> XEN_PAGE_SHIFT)
+
+#define XEN_PFN_PER_PAGE   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define XEN_PFN_DOWN(x)((x) >> XEN_PAGE_SHIFT)
+#define XEN_PFN_UP(x)  (((x) + XEN_PAGE_SIZE-1) >> XEN_PAGE_SHIFT)
+#define XEN_PFN_PHYS(x)((phys_addr_t)(x) << XEN_PAGE_SHIFT)
+
 #include 
 
+/* Return the GFN associated to the first 4KB of the page */
 static inline unsigned long xen_page_to_gfn(struct page *page)
 {
-   return pfn_to_gfn(page_to_pfn(page));
+   return pfn_to_gfn(page_to_xen_pfn(page));
 }
 
 struct xen_memory_region {
-- 
2.1.4



[PATCH v4 01/20] net/xen-netback: xenvif_gop_frag_copy: move GSO check out of the loop

2015-09-07 Thread Julien Grall
The skb doesn't change within the function. Therefore it's only
necessary to check if we need GSO once at the beginning.

Signed-off-by: Julien Grall 
Acked-by: Wei Liu 

---
Cc: Ian Campbell 
Cc: net...@vger.kernel.org

Changes in v4:
- Add Wei's acked

Changes in v2:
- Patch added
---
 drivers/net/xen-netback/netback.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 7c64c74..d4c1bc7 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -277,6 +277,13 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
unsigned long bytes;
int gso_type = XEN_NETIF_GSO_TYPE_NONE;
 
+   if (skb_is_gso(skb)) {
+   if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+   gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+   else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+   gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+   }
+
/* Data must not cross a page boundary. */
BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
[...]
-   if (skb_is_gso(skb)) {
-   if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
-   gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
-   else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
-   gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
-   }
-
if (*head && ((1 << gso_type) & queue->vif->gso_mask))
queue->rx.req_cons++;
 
-- 
2.1.4



[PATCH v4 04/20] xen/grant: Introduce helpers to split a page into grant

2015-09-07 Thread Julien Grall
Currently, a grant is always based on the Xen page granularity (i.e.
4KB). When Linux is using a different page granularity, a single page
will be split between multiple grants.

The new helpers are in charge of splitting the Linux page into grants
and calling a function given by the caller on each grant.

Also provide a helper to count the number of grants within a given
contiguous region.
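
For illustration only, here is a userspace sketch of the splitting and counting logic described above. It is not the kernel code: the names foreach_grant_in_range, count_grants, tally_fn and struct tally are hypothetical, the page size is passed as a parameter rather than taken from PAGE_SIZE, and the callback receives a raw 4KB frame number instead of a GFN.

```c
#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (1U << XEN_PAGE_SHIFT)

/* Hypothetical callback type: one call per 4KB grant touched. */
typedef void (*grant_fn_t)(unsigned long xen_pfn, unsigned int offset,
                           unsigned int len, void *data);

/*
 * Walk [offset, offset + len) inside one Linux page whose first 4KB
 * frame is first_xen_pfn, and call fn once per 4KB chunk. The range is
 * clamped to the end of the page.
 */
static void foreach_grant_in_range(unsigned long first_xen_pfn,
                                   unsigned int page_size,
                                   unsigned int offset, unsigned int len,
                                   grant_fn_t fn, void *data)
{
    unsigned int goffset;
    unsigned long xen_pfn;

    if (offset >= page_size)
        return;
    if (len > page_size - offset)
        len = page_size - offset;

    /* Offset inside the first 4KB chunk, and the frame holding it. */
    goffset = offset & (XEN_PAGE_SIZE - 1);
    xen_pfn = first_xen_pfn + (offset >> XEN_PAGE_SHIFT);

    while (len) {
        unsigned int glen = XEN_PAGE_SIZE - goffset;

        if (glen > len)
            glen = len;
        fn(xen_pfn, goffset, glen, data);

        /* Every chunk after the first starts at offset 0. */
        goffset = 0;
        xen_pfn++;
        len -= glen;
    }
}

/* Number of 4KB grants touched by a region (0 for an empty region). */
static unsigned int count_grants(unsigned int start, unsigned int len)
{
    if (len == 0)
        return 0;
    return ((start & (XEN_PAGE_SIZE - 1)) + len + XEN_PAGE_SIZE - 1)
            >> XEN_PAGE_SHIFT;
}

/* Example callback: count the calls and sum the lengths. */
struct tally {
    unsigned int calls;
    unsigned int bytes;
};

static void tally_fn(unsigned long xen_pfn, unsigned int offset,
                     unsigned int len, void *data)
{
    struct tally *t = data;

    (void)xen_pfn;
    (void)offset;
    t->calls++;
    t->bytes += len;
}
```

With a 64KB page, a 5000 byte region starting at offset 4000 touches three grants: 96 bytes in the first, 4096 in the second and 808 in the third.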

Note that x86/include/asm/xen/page.h now includes
xen/interface/grant_table.h rather than xen/grant_table.h. This is
necessary because xen/grant_table.h depends on asm/xen/page.h and would
otherwise break the compilation. Furthermore, only the definitions in
interface/grant_table.h are required.

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 
Reviewed-by: Stefano Stabellini 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org

Changes in v4:
- Typos
- Rename gnttab_one_grant into gnttab_for_one_grant
- Add Stefano and David's reviewed-by
- s/xen_page_to_pfn/page_to_xen_pfn/ based on the new naming

Changes in v3:
- Fix error reported by checkpatch.pl
- Typos
- s/pfn/xen_pfn/ in gnttab_foreach_grant
- Drop the possibility to use less data. The complexity is moved
in netback which is the only user
- Rename gnttab_foreach_grant into gnttab_foreach_grant_in_range
- s/offset/start/ in gnttab_count_grant and update the
description of the parameter
- s/mfn/gfn based on the new terminology
- Add EXPORT_SYMBOL_GPL for gnttab_foreach_grant_in_range
- Use xen_offset_in_page and XEN_PFN_DOWN whenever it's possible
- Fix compilation on x86.

Changes in v2:
- Patch added
---
 arch/x86/include/asm/xen/page.h |  2 +-
 drivers/xen/grant-table.c   | 26 +
 include/xen/grant_table.h   | 42 +
 3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index 0b762f6..501479e 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -12,7 +12,7 @@
 #include 
 
 #include 
-#include 
+#include 
 #include 
 
 /* Xen machine address */
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 62f591f..7b4e1cf 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -776,6 +776,32 @@ void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count)
 }
 EXPORT_SYMBOL_GPL(gnttab_batch_copy);
 
+void gnttab_foreach_grant_in_range(struct page *page,
+  unsigned int offset,
+  unsigned int len,
+  xen_grant_fn_t fn,
+  void *data)
+{
+   unsigned int goffset;
+   unsigned int glen;
+   unsigned long xen_pfn;
+
+   len = min_t(unsigned int, PAGE_SIZE - offset, len);
+   goffset = xen_offset_in_page(offset);
+
+   xen_pfn = page_to_xen_pfn(page) + XEN_PFN_DOWN(offset);
+
+   while (len) {
+   glen = min_t(unsigned int, XEN_PAGE_SIZE - goffset, len);
+   fn(pfn_to_gfn(xen_pfn), goffset, glen, data);
+
+   goffset = 0;
+   xen_pfn++;
+   len -= glen;
+   }
+}
+EXPORT_SYMBOL_GPL(gnttab_foreach_grant_in_range);
+
 int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
struct gnttab_map_grant_ref *kmap_ops,
struct page **pages, unsigned int count)
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 4478f4b..05b5b08 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -45,8 +45,10 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
+#include 
 
 #define GNTTAB_RESERVED_XENSTORE 1
 
@@ -224,4 +226,44 @@ static inline struct xen_page_foreign *xen_page_foreign(struct page *page)
 #endif
 }
 
+/* Split Linux page in chunk of the size of the grant and call fn
+ *
+ * Parameters of fn:
+ * gfn: guest frame number
+ * offset: offset in the grant
+ * len: length of the data in the grant.
+ * data: internal information
+ */
+typedef void (*xen_grant_fn_t)(unsigned long gfn, unsigned int offset,
+  unsigned int len, void *data);
+
+void gnttab_foreach_grant_in_range(struct page *page,
+  unsigned int offset,
+  unsigned int len,
+  xen_grant_fn_t fn,
+  void *data);
+
+/* Helper to get to call fn only on the first "grant chunk" */
+static inline void gnttab_for_one_grant(struct page *page, unsigned int offset,
+   unsigned len, xen_grant_fn_t fn,
+   void *data)

[PATCH v4 05/20] xen/grant: Add helper gnttab_page_grant_foreign_access_ref_one

2015-09-07 Thread Julien Grall
Many PV drivers contain the idiom:

pfn = page_to_gfn(...) /* Or similar */
gnttab_grant_foreign_access_ref

Replace it by a new helper. Note that when Linux is using a different
page granularity than Xen, the helper only gives access to the first 4KB
grant.

This is useful where drivers are allocating a full Linux page for each
grant.
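
As a rough illustration of why only the first 4KB is covered, the sketch below shows the frame arithmetic, assuming a 64KB Linux page. LINUX_PAGE_SHIFT and both function names are hypothetical stand-ins for this example; the kernel uses PAGE_SHIFT and the page_to_xen_pfn macro.

```c
#define XEN_PAGE_SHIFT   12
/* Assumption for this example: Linux built with 64KB pages. */
#define LINUX_PAGE_SHIFT 16

/* First 4KB Xen frame backing a given Linux page frame. */
static unsigned long linux_pfn_to_first_xen_pfn(unsigned long pfn)
{
    return pfn << (LINUX_PAGE_SHIFT - XEN_PAGE_SHIFT);
}

/* Number of 4KB Xen frames in one Linux page. */
static unsigned long xen_pfn_per_page(void)
{
    return 1UL << (LINUX_PAGE_SHIFT - XEN_PAGE_SHIFT);
}
```

Linux page frame 3 starts at Xen frame 48 and spans 16 Xen frames. A single grant covers only frame 48, so the other 15 frames of the page are not shared.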

Also include xen/interface/grant_table.h rather than xen/grant_table.h in
asm/page.h for x86 to fix a compilation issue [1]. Only the former is
useful in order to get the structure definition.

[1] Interdependency between asm/page.h and xen/grant_table.h, which results
in page_mfn not being defined when necessary.

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 
Reviewed-by: Stefano Stabellini 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 

Changes in v3:
- Rename gnttab_page_grant_foreign_access_ref into
gnttab_page_grant_foreign_access_ref_one
- Fix typo in the commit message
- s/mfn/gfn based on the new naming
- Add David and Stefano's reviewed-by

Changes in v2:
- Patch added
---
 include/xen/grant_table.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 05b5b08..e17a4b3 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -131,6 +131,15 @@ void gnttab_cancel_free_callback(struct gnttab_free_callback *callback);
 void gnttab_grant_foreign_access_ref(grant_ref_t ref, domid_t domid,
 unsigned long frame, int readonly);
 
+/* Give access to the first 4K of the page */
+static inline void gnttab_page_grant_foreign_access_ref_one(
+   grant_ref_t ref, domid_t domid,
+   struct page *page, int readonly)
+{
+   gnttab_grant_foreign_access_ref(ref, domid, xen_page_to_gfn(page),
+   readonly);
+}
+
 void gnttab_grant_foreign_transfer_ref(grant_ref_t, domid_t domid,
   unsigned long pfn);
 
-- 
2.1.4



[PATCH v4 09/20] xen/biomerge: Don't allow biovec's to be merged when Linux is not using 4KB pages

2015-09-07 Thread Julien Grall
On ARM, not all DMA-capable devices on a given platform may be protected
by an IOMMU. The DMA requests have to use the BFN (i.e. MFN on ARM) in
order to use the device correctly.

While the DOM0 memory is allocated in a 1:1 fashion (PFN == MFN), grant
mapping will break this contiguous mapping.

When Linux is using 64KB page granularity, the page may be split
across multiple non-contiguous MFNs (Xen is using 4KB page
granularity). Therefore a DMA request will likely fail.

Checking that a 64KB page is using contiguous MFNs is tedious. For
now, always say that biovecs are not mergeable.
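
To give an idea of what such a check would involve, here is a hypothetical sketch and not something this patch implements. It assumes the caller has already looked up the bus frame of each of the 16 4KB chunks of one 64KB page and stored them in an array; the array and the function name are made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 64KB Linux page / 4KB Xen page. */
#define XEN_PFN_PER_PAGE 16

/*
 * bfns[i] is the bus frame backing the i-th 4KB chunk of one Linux
 * page. The page is safe for a single DMA request only if the frames
 * are strictly consecutive.
 */
static bool page_is_bus_contiguous(const unsigned long bfns[XEN_PFN_PER_PAGE])
{
    size_t i;

    for (i = 1; i < XEN_PFN_PER_PAGE; i++) {
        if (bfns[i] != bfns[0] + i)
            return false;
    }
    return true;
}
```

A real implementation would need one lookup per 4KB chunk for every page involved in the merge decision, which is where the cost mentioned above comes from.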

Signed-off-by: Julien Grall 
Reviewed-by: Stefano Stabellini 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

There are some ideas to check whether two biovecs could be merged
(see [1]) but it's not critical and can be considered as a performance
improvement.

Changes in v4:
- Fix typos in the subject
- Add Stefano's reviewed-by

Changes in v3:
- Update commit message
- s/mfn/bfn/ based on the new renaming
- Update TODO

Changes in v2:
- Remove the workaround and check if the Linux page granularity
is the same as Xen or not

[1] https://lkml.org/lkml/2015/7/17/418
---
 drivers/xen/biomerge.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
index 8ae2fc90..4da69db 100644
--- a/drivers/xen/biomerge.c
+++ b/drivers/xen/biomerge.c
@@ -6,10 +6,18 @@
 bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
   const struct bio_vec *vec2)
 {
+#if XEN_PAGE_SIZE == PAGE_SIZE
unsigned long bfn1 = pfn_to_bfn(page_to_pfn(vec1->bv_page));
unsigned long bfn2 = pfn_to_bfn(page_to_pfn(vec2->bv_page));
 
return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
((bfn1 == bfn2) || ((bfn1+1) == bfn2));
+#else
+   /*
+* XXX: Add support for merging bio_vec when using different page
+* size in Xen and Linux.
+*/
+   return 0;
+#endif
 }
 EXPORT_SYMBOL(xen_biovec_phys_mergeable);
-- 
2.1.4



[PATCH v4 00/20] xen/arm64: Add support for 64KB page in Linux

2015-09-07 Thread Julien Grall
Hi all,

ARM64 Linux supports both 4KB and 64KB page granularity. However, the Xen
hypercall interface and PV protocol are always based on 4KB page granularity.

Any attempt to boot a Linux guest with 64KB pages enabled will result in a
guest crash.

This series is a first attempt to allow such Linux guests to run with the
current hypercall interface and PV protocol.

This solution has been chosen because we want to run Linux 64KB in released
Xen ARM versions and/or platforms using an old version of Linux DOM0.

There is room for improvement, such as support of 64KB grants, modification
of the PV protocol to support different page sizes... They will be explored in
a separate patch series later.

TODO list:
- Convert swiotlb to 64KB
- Convert xenfb to 64KB
- Support for multiple page rings
- Support for 64KB in gntdev
- Support of non-indirect grant with 64KB frontend
- It may be possible to move some common defines between
netback/netfront and blkfront/blkback into a header

I've got most of the patches for the TODO items. I'm planning to send them as
a follow-up as they are not a requirement for a basic guest.

All patches have been build tested for ARM32, ARM64 and x86. But I haven't
tested running them on x86 as I don't have a box with Xen x86 running. I would
be happy if someone could give it a try and look for possible regressions on
x86.

I know that Konrad has a test suite for x86. Konrad, would it be possible to
give this series a run?

A branch based on the latest xentip/for-linus-4.3 can be found here:

git://xenbits.xen.org/people/julieng/linux-arm.git branch xen-64k-v4

Comments and suggestions are welcome.

Sincerely yours,

Cc: david.vra...@citrix.com
Cc: konrad.w...@oracle.com
Cc: boris.ostrov...@oracle.com
Cc: wei.l...@citrix.com
Cc: roger@citrix.com

Status of each patch:

A: Reviewed-by - Acked-by
M: Patch modified in this series
m: Minor changes in this series (i.e. renaming due to previous patches, typos)
L: Missing Acked-by from a Linux maintainer (Boris, David, Konrad)
N: Missing Acked-by from a Netback maintainer (Ian or Wei)

Julien Grall (20):
A   net/xen-netback: xenvif_gop_frag_copy: move GSO check out of the loop
A   arm/xen: Drop pte_mfn and mfn_pte
A M L   xen: Add Xen specific page definition
A M xen/grant: Introduce helpers to split a page into grant
A   xen/grant: Add helper gnttab_page_grant_foreign_access_ref_one
A   block/xen-blkfront: Split blkif_queue_request in 2
A m block/xen-blkfront: Store a page rather a pfn in the grant structure
A   block/xen-blkfront: split get_grant in 2
A m L   xen/biomerge: Don't allow biovec's to be merged when Linux is not
  using 4KB pages
A   xen/xenbus: Use Xen page definition
A m L   tty/hvc: xen: Use xen page definition
  M L   xen/balloon: Don't rely on the page granularity is the same for Xen
  and Linux
A   xen/events: fifo: Make it running on 64KB granularity
A   xen/grant-table: Make it running on 64KB granularity
A m block/xen-blkfront: Make it running on 64KB page granularity
A   block/xen-blkback: Make it running on 64KB page granularity
A m net/xen-netfront: Make it running on 64KB page granularity
  m  N  net/xen-netback: Make it running on 64KB page granularity
A m xen/privcmd: Add support for Linux 64KB page granularity
A   arm/xen: Add support for 64KB page granularity

 arch/arm/include/asm/xen/page.h |  18 +-
 arch/arm/xen/enlighten.c|   6 +-
 arch/arm/xen/p2m.c  |   6 +-
 arch/x86/include/asm/xen/page.h |   2 +-
 drivers/block/xen-blkback/blkback.c |   5 +-
 drivers/block/xen-blkback/common.h  |  17 +-
 drivers/block/xen-blkback/xenbus.c  |   9 +-
 drivers/block/xen-blkfront.c| 552 +++-
 drivers/net/xen-netback/common.h|  18 +-
 drivers/net/xen-netback/netback.c   | 163 +++
 drivers/net/xen-netfront.c  | 122 +---
 drivers/tty/hvc/hvc_xen.c   |   4 +-
 drivers/xen/balloon.c   |  59 +++-
 drivers/xen/biomerge.c  |   8 +
 drivers/xen/events/events_base.c|   2 +-
 drivers/xen/events/events_fifo.c|   2 +-
 drivers/xen/grant-table.c   |  32 ++-
 drivers/xen/privcmd.c   |   8 +-
 drivers/xen/xenbus/xenbus_client.c  |   6 +-
 drivers/xen/xenbus/xenbus_probe.c   |   3 +-
 drivers/xen/xlate_mmu.c | 124 +---
 include/xen/grant_table.h   |  51 
 include/xen/page.h  |  27 +-
 23 files changed, 855 insertions(+), 389 deletions(-)

-- 
2.1.4



[PATCH v4 07/20] block/xen-blkfront: Store a page rather a pfn in the grant structure

2015-09-07 Thread Julien Grall
All the uses of the field pfn follow the same idiom:

pfn_to_page(grant->pfn)

This will always return the same page. Store the page directly in the
grant to clean up the code.

Signed-off-by: Julien Grall 
Acked-by: Roger Pau Monné 
Reviewed-by: Stefano Stabellini 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Roger, Stefano, I kept your Acked-by/Reviewed-by because the rebase was
minor. Let me know if you disagree.

Changes in v4:
- rebase after 7adf12b87f45a77d364464018fb8e9e1ac875152
"xen-blkfront: don't add indirect pages to list when
!feature_persistent"

Changes in v3:
- Use the correct indentation in get_grant. The current
indentation (i.e without this patch) was wrong because it was
using space rather than tabulation.
- Add Roger's acked and Stefano's reviewed
- s/mfn/gfn based on the new naming

Changes in v2:
- Patch added
---
 drivers/block/xen-blkfront.c | 39 +++
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index b11f084..556475d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -68,7 +68,7 @@ enum blkif_state {
 
 struct grant {
grant_ref_t gref;
-   unsigned long pfn;
+   struct page *page;
struct list_head node;
 };
 
@@ -222,7 +222,7 @@ static int fill_grant_buffer(struct blkfront_info *info, int num)
kfree(gnt_list_entry);
goto out_of_memory;
}
-   gnt_list_entry->pfn = page_to_pfn(granted_page);
+   gnt_list_entry->page = granted_page;
}
 
gnt_list_entry->gref = GRANT_INVALID_REF;
@@ -237,7 +237,7 @@ out_of_memory:
 &info->grants, node) {
list_del(&gnt_list_entry->node);
if (info->feature_persistent)
-   __free_page(pfn_to_page(gnt_list_entry->pfn));
+   __free_page(gnt_list_entry->page);
kfree(gnt_list_entry);
i--;
}
@@ -246,8 +246,8 @@ out_of_memory:
 }
 
 static struct grant *get_grant(grant_ref_t *gref_head,
-   unsigned long pfn,
-   struct blkfront_info *info)
+  struct page *page,
+  struct blkfront_info *info)
 {
struct grant *gnt_list_entry;
unsigned long buffer_gfn;
@@ -266,10 +266,10 @@ static struct grant *get_grant(grant_ref_t *gref_head,
gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head);
BUG_ON(gnt_list_entry->gref == -ENOSPC);
if (!info->feature_persistent) {
-   BUG_ON(!pfn);
-   gnt_list_entry->pfn = pfn;
+   BUG_ON(!page);
+   gnt_list_entry->page = page;
}
-   buffer_gfn = pfn_to_gfn(gnt_list_entry->pfn);
+   buffer_gfn = xen_page_to_gfn(gnt_list_entry->page);
gnttab_grant_foreign_access_ref(gnt_list_entry->gref,
info->xbdev->otherend_id,
buffer_gfn, 0);
@@ -525,7 +525,7 @@ static int blkif_queue_rw_req(struct request *req)
 
if ((ring_req->operation == BLKIF_OP_INDIRECT) &&
(i % SEGS_PER_INDIRECT_FRAME == 0)) {
-   unsigned long uninitialized_var(pfn);
+   struct page *uninitialized_var(page);
 
if (segments)
kunmap_atomic(segments);
@@ -542,15 +542,15 @@ static int blkif_queue_rw_req(struct request *req)
indirect_page = 
list_first_entry(&info->indirect_pages,
 struct page, 
lru);
list_del(&indirect_page->lru);
-   pfn = page_to_pfn(indirect_page);
+   page = indirect_page;
}
-   gnt_list_entry = get_grant(&gref_head, pfn, info);
+   gnt_list_entry = get_grant(&gref_head, page, info);
info->shadow[id].indirect_grants[n] = gnt_list_entry;
-   segments = 
kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
+   segments = kmap_atomic(gnt_list_entry->page);
ring_req->u.indirect.indirect_grefs[n] = 
gnt_list_entry->gref;
}
 
-   gnt_list_entry = get_grant(&gref_head, 
page_to_pfn(sg_page(sg)), info);
+   gnt_list_entry = get_grant(&gref_h

[PATCH v4 06/20] block/xen-blkfront: Split blkif_queue_request in 2

2015-09-07 Thread Julien Grall
Currently, blkif_queue_request has 2 distinct execution paths:
- Send a discard request
- Send a read/write request

The function is also allocating grants to use for generating the
request, although this is only used for read/write requests.

Rather than having a function with 2 distinct execution paths, split
the function in 2. This will also remove one level of indentation.

Signed-off-by: Julien Grall 
Reviewed-by: Roger Pau Monné 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Roger, if you really want I can drop the else clause in
blkif_queue_request; IMHO it's clearer here. I've kept your
Reviewed-by nonetheless. Let me know if it's not fine.

Changes in v3:
- Fix errors reported by checkpatch.pl
- Add Roger's Reviewed-by

Changes in v2:
- Patch added
---
 drivers/block/xen-blkfront.c | 277 ---
 1 file changed, 153 insertions(+), 124 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 432e105..b11f084 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -395,13 +395,35 @@ static int blkif_ioctl(struct block_device *bdev, fmode_t mode,
return 0;
 }
 
-/*
- * Generate a Xen blkfront IO request from a blk layer request.  Reads
- * and writes are handled as expected.
- *
- * @req: a request struct
- */
-static int blkif_queue_request(struct request *req)
+static int blkif_queue_discard_req(struct request *req)
+{
+   struct blkfront_info *info = req->rq_disk->private_data;
+   struct blkif_request *ring_req;
+   unsigned long id;
+
+   /* Fill out a communications ring structure. */
+   ring_req = RING_GET_REQUEST(&info->ring, info->ring.req_prod_pvt);
+   id = get_id_from_freelist(info);
+   info->shadow[id].request = req;
+
+   ring_req->operation = BLKIF_OP_DISCARD;
+   ring_req->u.discard.nr_sectors = blk_rq_sectors(req);
+   ring_req->u.discard.id = id;
+   ring_req->u.discard.sector_number = (blkif_sector_t)blk_rq_pos(req);
+   if ((req->cmd_flags & REQ_SECURE) && info->feature_secdiscard)
+   ring_req->u.discard.flag = BLKIF_DISCARD_SECURE;
+   else
+   ring_req->u.discard.flag = 0;
+
+   info->ring.req_prod_pvt++;
+
+   /* Keep a private copy so we can reissue requests when recovering. */
+   info->shadow[id].req = *ring_req;
+
+   return 0;
+}
+
+static int blkif_queue_rw_req(struct request *req)
 {
struct blkfront_info *info = req->rq_disk->private_data;
struct blkif_request *ring_req;
@@ -421,9 +443,6 @@ static int blkif_queue_request(struct request *req)
struct scatterlist *sg;
int nseg, max_grefs;
 
-   if (unlikely(info->connected != BLKIF_STATE_CONNECTED))
-   return 1;
-
max_grefs = req->nr_phys_segments;
if (max_grefs > BLKIF_MAX_SEGMENTS_PER_REQUEST)
/*
@@ -453,139 +472,131 @@ static int blkif_queue_request(struct request *req)
id = get_id_from_freelist(info);
info->shadow[id].request = req;
 
-   if (unlikely(req->cmd_flags & (REQ_DISCARD | REQ_SECURE))) {
-   ring_req->operation = BLKIF_OP_DISCARD;
-   ring_req->u.discard.nr_sectors = blk_rq_sectors(req);
-   ring_req->u.discard.id = id;
-   ring_req->u.discard.sector_number = 
(blkif_sector_t)blk_rq_pos(req);
-   if ((req->cmd_flags & REQ_SECURE) && info->feature_secdiscard)
-   ring_req->u.discard.flag = BLKIF_DISCARD_SECURE;
-   else
-   ring_req->u.discard.flag = 0;
+   BUG_ON(info->max_indirect_segments == 0 &&
+  req->nr_phys_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);
+   BUG_ON(info->max_indirect_segments &&
+  req->nr_phys_segments > info->max_indirect_segments);
+   nseg = blk_rq_map_sg(req->q, req, info->shadow[id].sg);
+   ring_req->u.rw.id = id;
+   if (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST) {
+   /*
+* The indirect operation can only be a BLKIF_OP_READ or
+* BLKIF_OP_WRITE
+*/
+   BUG_ON(req->cmd_flags & (REQ_FLUSH | REQ_FUA));
+   ring_req->operation = BLKIF_OP_INDIRECT;
+   ring_req->u.indirect.indirect_op = rq_data_dir(req) ?
+   BLKIF_OP_WRITE : BLKIF_OP_READ;
+   ring_req->u.indirect.sector_number = 
(blkif_sector_t)blk_rq_pos(req);
+   ring_req->u.indirect.handle = info->handle;
+   ring_req->u.indirect.nr_segments = nseg;
} else {
-   BUG_ON(info->max_indirect_segments == 0 &&
-   

[PATCH v4 19/20] xen/privcmd: Add support for Linux 64KB page granularity

2015-09-07 Thread Julien Grall
The hypercall interface (as well as the toolstack) is always using 4KB
page granularity. When the toolstack is asking for mapping a series of
guest PFNs in a batch, it expects to have the pages mapped contiguously in
its virtual memory.

When Linux is using 64KB page granularity, the privcmd driver will have
to map multiple Xen PFNs in a single Linux page.

Note that this solution works for any page granularity which is a multiple
of 4KB.
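
A small sketch of the resulting arithmetic, assuming 64KB Linux pages. The function names and LINUX_PAGE_SHIFT are hypothetical; the patch itself uses DIV_ROUND_UP(m.num, XEN_PFN_PER_PAGE).

```c
#define XEN_PAGE_SHIFT   12
/* Assumption for this example: Linux built with 64KB pages. */
#define LINUX_PAGE_SHIFT 16
#define XEN_PFN_PER_PAGE (1U << (LINUX_PAGE_SHIFT - XEN_PAGE_SHIFT))

/* Linux pages needed to hold num_gfns 4KB mappings (rounded up). */
static unsigned int linux_pages_for_gfns(unsigned int num_gfns)
{
    return (num_gfns + XEN_PFN_PER_PAGE - 1) / XEN_PFN_PER_PAGE;
}

/* Which Linux page the i-th GFN of the batch lands in. */
static unsigned int page_index_for_gfn(unsigned int i)
{
    return i / XEN_PFN_PER_PAGE;
}

/* Which 4KB slot of that Linux page the i-th GFN occupies. */
static unsigned int slot_in_page_for_gfn(unsigned int i)
{
    return i % XEN_PFN_PER_PAGE;
}
```

A batch of 17 GFNs therefore needs two Linux pages: the first 16 GFNs fill page 0 and the last one takes slot 0 of page 1.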

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 

I kept the hypercall arguments in remap_data to avoid allocating them on
the stack every time that remap_pte_fn is called.
I will keep it like that unless someone strongly disagrees.

Changes in v4:
- s/xen_page_to_pfn/page_to_xen_pfn/ based on the new naming
- Add David's reviewed-by

Changes in v3:
- The function to split a Linux page in multiple Xen pages has
been moved internally. It was the only use (not used anymore in
the balloon) and it's not quite clear what should be the common
interface. Defer the question until someone needs to use it.
- s/nr_pfn/numgfns/ to make clear that we are dealing with GFN
- Use DIV_ROUND_UP rather than round_up and fix the usage in
xen_xlate_unmap_gfn_range

Changes in v2:
- Use xen_apply_to_page
---
 drivers/xen/privcmd.c   |   8 ++--
 drivers/xen/xlate_mmu.c | 124 
 2 files changed, 89 insertions(+), 43 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index c6deb87..c8798ee 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -446,7 +446,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
return -EINVAL;
}
 
-   nr_pages = m.num;
+   nr_pages = DIV_ROUND_UP(m.num, XEN_PFN_PER_PAGE);
if ((m.num <= 0) || (nr_pages > (LONG_MAX >> PAGE_SHIFT)))
return -EINVAL;
 
@@ -494,7 +494,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
goto out_unlock;
}
if (xen_feature(XENFEAT_auto_translated_physmap)) {
-   ret = alloc_empty_pages(vma, m.num);
+   ret = alloc_empty_pages(vma, nr_pages);
if (ret < 0)
goto out_unlock;
} else
@@ -518,6 +518,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
state.global_error  = 0;
state.version   = version;
 
+   BUILD_BUG_ON(((PAGE_SIZE / sizeof(xen_pfn_t)) % XEN_PFN_PER_PAGE) != 0);
/* mmap_batch_fn guarantees ret == 0 */
BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
&pagelist, mmap_batch_fn, &state));
@@ -582,12 +583,13 @@ static void privcmd_close(struct vm_area_struct *vma)
 {
struct page **pages = vma->vm_private_data;
int numpgs = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+   int numgfns = (vma->vm_end - vma->vm_start) >> XEN_PAGE_SHIFT;
int rc;
 
if (!xen_feature(XENFEAT_auto_translated_physmap) || !numpgs || !pages)
return;
 
-   rc = xen_unmap_domain_gfn_range(vma, numpgs, pages);
+   rc = xen_unmap_domain_gfn_range(vma, numgfns, pages);
if (rc == 0)
free_xenballooned_pages(numpgs, pages);
else
diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
index cff2387..5063c5e 100644
--- a/drivers/xen/xlate_mmu.c
+++ b/drivers/xen/xlate_mmu.c
@@ -38,31 +38,28 @@
 #include 
 #include 
 
-/* map fgfn of domid to lpfn in the current domain */
-static int map_foreign_page(unsigned long lpfn, unsigned long fgfn,
-   unsigned int domid)
-{
-   int rc;
-   struct xen_add_to_physmap_range xatp = {
-   .domid = DOMID_SELF,
-   .foreign_domid = domid,
-   .size = 1,
-   .space = XENMAPSPACE_gmfn_foreign,
-   };
-   xen_ulong_t idx = fgfn;
-   xen_pfn_t gpfn = lpfn;
-   int err = 0;
+typedef void (*xen_gfn_fn_t)(unsigned long gfn, void *data);
 
-   set_xen_guest_handle(xatp.idxs, &idx);
-   set_xen_guest_handle(xatp.gpfns, &gpfn);
-   set_xen_guest_handle(xatp.errs, &err);
+/* Break down the pages in 4KB chunk and call fn for each gfn */
+static void xen_for_each_gfn(struct page **pages, unsigned nr_gfn,
+xen_gfn_fn_t fn, void *data)
+{
+   unsigned long xen_pfn = 0;
+   struct page *page;
+   int i;
 
-   rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatp);
-   return rc < 0 ? rc : err;
+   for (i = 0; i < nr_gfn; i++) {
+   if ((i % XEN_PFN_PER_PAGE) == 0) {
+   page = p

[PATCH v4 11/20] tty/hvc: xen: Use xen page definition

2015-09-07 Thread Julien Grall
The console ring is always based on the page granularity of Xen.

Signed-off-by: Julien Grall 
Reviewed-by: Stefano Stabellini 

---
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: David Vrabel 
Cc: Boris Ostrovsky 
Cc: linuxppc-...@lists.ozlabs.org

Changes in v4:
- The ring is always 4K (i.e XEN_PAGE_SIZE), so no need to
map with PAGE_SIZE. This was correctly done in v2 but lost with
the rebase to the "s/mfn/gfn/" series

Changes in v3:
- Some changes have been moved into the series "Use correctly the
Xen memory terminologies in Linux".
- Add Stefano's reviewed-by
---
 drivers/tty/hvc/hvc_xen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index 10beb15..fa816b7 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -230,7 +230,7 @@ static int xen_hvm_console_init(void)
if (r < 0 || v == 0)
goto err;
gfn = v;
-   info->intf = xen_remap(gfn << PAGE_SHIFT, PAGE_SIZE);
+   info->intf = xen_remap(gfn << XEN_PAGE_SHIFT, XEN_PAGE_SIZE);
if (info->intf == NULL)
goto err;
info->vtermno = HVC_COOKIE;
@@ -472,7 +472,7 @@ static int xencons_resume(struct xenbus_device *dev)
struct xencons_info *info = dev_get_drvdata(&dev->dev);
 
xencons_disconnect_backend(info);
-   memset(info->intf, 0, PAGE_SIZE);
+   memset(info->intf, 0, XEN_PAGE_SIZE);
return xencons_connect_backend(dev, info);
 }
 
-- 
2.1.4



[PATCH v4 13/20] xen/events: fifo: Make it running on 64KB granularity

2015-09-07 Thread Julien Grall
Only use the first 4KB of the page to store the event channel info. It
means that we will waste 60KB every time we allocate a page for:
 * control block: a page is allocated per CPU
 * event array: a page is allocated every time we need to expand it

I think we can reduce the memory waste for the 2 areas by:

* control block: sharing between multiple vCPUs. Although it will
require some bookkeeping in order to not free the page when the CPU
goes offline and the other CPUs sharing the page are still there

* event array: always extend the event array by 64K (i.e. 16 4K
chunks). That would require more care when we fail to expand the
event channel.
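
The numbers above can be checked with the short sketch below. It assumes event_word_t is a 32-bit word, and the two function names are hypothetical helpers for this example rather than kernel symbols.

```c
#include <stdint.h>

/* Assumption: an event word is 32 bits wide. */
typedef uint32_t event_word_t;

#define XEN_PAGE_SIZE 4096U

/* Event words that fit in the 4KB actually shared with Xen. */
static unsigned int event_words_per_page(void)
{
    return XEN_PAGE_SIZE / sizeof(event_word_t);
}

/* Bytes of a Linux page left unused when only the first 4KB is used. */
static unsigned int wasted_bytes_per_page(unsigned int linux_page_size)
{
    return linux_page_size - XEN_PAGE_SIZE;
}
```

With 64KB Linux pages, each allocation holds 1024 event words and leaves 61440 bytes (60KB) unused. With 4KB pages nothing is wasted.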

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 
Reviewed-by: Stefano Stabellini 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 

Note I haven't updated the suggestion to reduce the memory waste
after David's email [1]. I can do it if necessary.

Changes in v3:
- Add David and Stefano's reviewed-by

[1] http://lists.xen.org/archives/html/xen-devel/2015-07/msg04596.html
---
 drivers/xen/events/events_base.c | 2 +-
 drivers/xen/events/events_fifo.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index c49bb7a..00dd923 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -40,11 +40,11 @@
 #include 
 #include 
 #include 
-#include 
 #endif
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 1d4baf5..e3e9e3d 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -54,7 +54,7 @@
 
 #include "events_internal.h"
 
-#define EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t))
+#define EVENT_WORDS_PER_PAGE (XEN_PAGE_SIZE / sizeof(event_word_t))
 #define MAX_EVENT_ARRAY_PAGES (EVTCHN_FIFO_NR_CHANNELS / EVENT_WORDS_PER_PAGE)
 
 struct evtchn_fifo_queue {
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 10/20] xen/xenbus: Use Xen page definition

2015-09-07 Thread Julien Grall
All the rings (xenstore and PV rings) are always based on the page
granularity of Xen.
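
As a sketch of what this means for address computation, assuming nothing beyond the 4KB Xen page size: a frame number from Xen is shifted by XEN_PAGE_SHIFT, and consecutive ring grants are XEN_PAGE_SIZE apart, whatever the Linux page size is. The function names below are hypothetical.

```c
#include <stdint.h>

#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (UINT64_C(1) << XEN_PAGE_SHIFT)

/* Guest physical address of a frame number handed out by Xen. */
static uint64_t gfn_to_addr(uint64_t gfn)
{
    return gfn << XEN_PAGE_SHIFT;
}

/* Byte offset of the i-th grant inside a multi-grant ring buffer. */
static uint64_t ring_chunk_offset(unsigned int i)
{
    return (uint64_t)i * XEN_PAGE_SIZE;
}
```

Using PAGE_SHIFT here on a 64KB kernel would shift by 16 instead of 12 and yield an address 16 times too large.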

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 
Reviewed-by: Stefano Stabellini 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 

Changes in v3:
- Fix errors reported by checkpatch.pl
- s/MFN/GFN base on the new naming
- Add David and Stefano's reviewed-by

Changes in v2:
- Also update the ring mapping function
---
 drivers/xen/xenbus/xenbus_client.c | 6 +++---
 drivers/xen/xenbus/xenbus_probe.c  | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 2ba09c1..359e654 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -388,7 +388,7 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
}
grefs[i] = err;
 
-   vaddr = vaddr + PAGE_SIZE;
+   vaddr = vaddr + XEN_PAGE_SIZE;
}
 
return 0;
@@ -555,7 +555,7 @@ static int xenbus_map_ring_valloc_pv(struct xenbus_device *dev,
if (!node)
return -ENOMEM;
 
-   area = alloc_vm_area(PAGE_SIZE * nr_grefs, ptes);
+   area = alloc_vm_area(XEN_PAGE_SIZE * nr_grefs, ptes);
if (!area) {
kfree(node);
return -ENOMEM;
@@ -750,7 +750,7 @@ static int xenbus_unmap_ring_vfree_pv(struct xenbus_device *dev, void *vaddr)
unsigned long addr;
 
memset(&unmap[i], 0, sizeof(unmap[i]));
-   addr = (unsigned long)vaddr + (PAGE_SIZE * i);
+   addr = (unsigned long)vaddr + (XEN_PAGE_SIZE * i);
unmap[i].host_addr = arbitrary_virt_to_machine(
lookup_address(addr, &level)).maddr;
unmap[i].dev_bus_addr = 0;
diff --git a/drivers/xen/xenbus/xenbus_probe.c 
b/drivers/xen/xenbus/xenbus_probe.c
index 3cbe055..33a31cf 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -802,7 +802,8 @@ static int __init xenbus_init(void)
goto out_error;
xen_store_gfn = (unsigned long)v;
xen_store_interface =
-   xen_remap(xen_store_gfn << PAGE_SHIFT, PAGE_SIZE);
+   xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
+ XEN_PAGE_SIZE);
break;
default:
pr_warn("Xenstore state unknown\n");
-- 
2.1.4



[PATCH v4 14/20] xen/grant-table: Make it running on 64KB granularity

2015-09-07 Thread Julien Grall
The Xen interface is using 4KB page granularity. This means that each
grant is 4KB.

The current implementation allocates a Linux page per grant. On Linux
using 64KB page granularity, only the first 4KB of the page will be
used.

We could decrease the memory wasted by sharing the page between multiple
grants. It will require some care with the {Set,Clear}ForeignPage macros.

Note that no changes have been made in the x86 code because both Linux
and Xen will only use 4KB page granularity.
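
As a rough illustration of the arithmetic behind this change (not part of
the patch), the sketch below computes the number of entries in one grant
frame, the bytes left unused when a whole Linux page backs a single grant,
and the Xen frame number of an address. The struct is a hypothetical
stand-in for struct grant_entry_v1 and is assumed to be 8 bytes; the helper
names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Xen's ABI granularity: 4KB, independent of the guest page size. */
#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  ((size_t)1 << XEN_PAGE_SHIFT)

/* Hypothetical stand-in for struct grant_entry_v1 (assumed 8 bytes). */
struct grant_entry_v1_sketch {
	uint16_t flags;
	uint16_t domid;
	uint32_t frame;
};

/* Entries in one grant frame: depends on the Xen page size only. */
static inline size_t grefs_per_grant_frame(void)
{
	return XEN_PAGE_SIZE / sizeof(struct grant_entry_v1_sketch);
}

/* Bytes left unused when one Linux page backs a single 4KB grant. */
static inline size_t wasted_bytes_per_grant(size_t linux_page_size)
{
	assert(linux_page_size >= XEN_PAGE_SIZE);
	return linux_page_size - XEN_PAGE_SIZE;
}

/* Xen frame number of an address, in the spirit of XEN_PFN_DOWN(). */
static inline uint64_t xen_pfn_down(uint64_t addr)
{
	return addr >> XEN_PAGE_SHIFT;
}
```

With 64KB Linux pages, wasted_bytes_per_grant(65536) gives 61440 bytes,
i.e. 60KB per grant, while the number of entries per grant frame stays the
same whatever the Linux page size is.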

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 
Reviewed-by: Stefano Stabellini 

---
Cc: Stefano Stabellini 
Cc: Russell King 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 

Changes in v3:
- Add Stefano's reviewed-by

Changes in v2
- Add David's reviewed-by
---
 arch/arm/xen/p2m.c| 6 +++---
 drivers/xen/grant-table.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index 887596c..0ed01f2 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -93,8 +93,8 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref 
*map_ops,
for (i = 0; i < count; i++) {
if (map_ops[i].status)
continue;
-   set_phys_to_machine(map_ops[i].host_addr >> PAGE_SHIFT,
-   map_ops[i].dev_bus_addr >> PAGE_SHIFT);
+   set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
+   map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT);
}
 
return 0;
@@ -108,7 +108,7 @@ int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref 
*unmap_ops,
int i;
 
for (i = 0; i < count; i++) {
-   set_phys_to_machine(unmap_ops[i].host_addr >> PAGE_SHIFT,
+   set_phys_to_machine(unmap_ops[i].host_addr >> XEN_PAGE_SHIFT,
INVALID_P2M_ENTRY);
}
 
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 7b4e1cf..99ed9c2 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -642,7 +642,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
if (xen_auto_xlat_grant_frames.count)
return -EINVAL;
 
-   vaddr = xen_remap(addr, PAGE_SIZE * max_nr_gframes);
+   vaddr = xen_remap(addr, XEN_PAGE_SIZE * max_nr_gframes);
if (vaddr == NULL) {
pr_warn("Failed to ioremap gnttab share frames (addr=%pa)!\n",
&addr);
@@ -654,7 +654,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
return -ENOMEM;
}
for (i = 0; i < max_nr_gframes; i++)
-   pfn[i] = PFN_DOWN(addr) + i;
+   pfn[i] = XEN_PFN_DOWN(addr) + i;
 
xen_auto_xlat_grant_frames.vaddr = vaddr;
xen_auto_xlat_grant_frames.pfn = pfn;
@@ -1004,7 +1004,7 @@ static void gnttab_request_version(void)
 {
/* Only version 1 is used, which will always be available. */
grant_table_version = 1;
-   grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
+   grefs_per_grant_frame = XEN_PAGE_SIZE / sizeof(struct grant_entry_v1);
gnttab_interface = &gnttab_v1_ops;
 
pr_info("Grant tables using version %d layout\n", grant_table_version);
-- 
2.1.4



[PATCH v4 16/20] block/xen-blkback: Make it running on 64KB page granularity

2015-09-07 Thread Julien Grall
The PV block protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to behave as a
block backend on a non-modified Xen.

It's only necessary to adapt the ring size and the number of requests per
indirect frame. The rest of the code is relying on the grant table
code.

Note that the grant table code is allocating a Linux page per grant,
which wastes 60KB for every grant when Linux is using 64KB
page granularity. This could be improved by sharing the page between
multiple grants.
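
To make the indirect frame arithmetic concrete, here is a small standalone
sketch (not part of the patch). The struct is a hypothetical mirror of
struct blkif_request_segment and is assumed to be 8 bytes; the function
names are invented for the example and only loosely follow the macros
introduced in common.h.

```c
#include <assert.h>
#include <stdint.h>

#define XEN_PAGE_SIZE 4096u
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical mirror of struct blkif_request_segment (assumed 8 bytes). */
struct blkif_segment_sketch {
	uint32_t gref;
	uint8_t first_sect;
	uint8_t last_sect;
};
_Static_assert(sizeof(struct blkif_segment_sketch) == 8,
	       "sketch assumes an 8-byte segment descriptor");

/* Number of 4KB Xen pages covered by one Linux-page-sized segment. */
static inline unsigned int xen_pages_per_segment(unsigned int linux_page_size)
{
	return linux_page_size / XEN_PAGE_SIZE;
}

/* Segment descriptors that fit in one 4KB indirect frame. */
static inline unsigned int entries_per_indirect_frame(void)
{
	return (unsigned int)(XEN_PAGE_SIZE / sizeof(struct blkif_segment_sketch));
}

/* Linux-sized segments that one indirect frame can describe. */
static inline unsigned int segs_per_indirect_frame(unsigned int linux_page_size)
{
	return entries_per_indirect_frame() / xen_pages_per_segment(linux_page_size);
}

/* Indirect frames needed for a given number of 4KB descriptors. */
static inline unsigned int indirect_pages(unsigned int descriptors)
{
	return DIV_ROUND_UP(descriptors, entries_per_indirect_frame());
}

/* Sector bounds check against the 4KB grant (512-byte sectors). */
static inline int segment_is_valid(unsigned int first_sect, unsigned int last_sect)
{
	return last_sect < (XEN_PAGE_SIZE >> 9) && last_sect >= first_sect;
}
```

Under these assumptions an indirect frame holds 512 descriptors, which
cover 512 segments with 4KB Linux pages but only 32 segments with 64KB
Linux pages.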

Signed-off-by: Julien Grall 
Acked-by: "Roger Pau Monné" 
---

Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on a non-modified Xen.

This has been tested only with a loop device. I plan to test passing
a hard drive partition but I haven't converted the swiotlb code yet.

Changes in v4:
- Add Roger's acked-by

Changes in v3:
- Use DIV_ROUND_UP in INDIRECT_PAGES to avoid a line over 80
characters
---
 drivers/block/xen-blkback/blkback.c |  5 +++--
 drivers/block/xen-blkback/common.h  | 17 +
 drivers/block/xen-blkback/xenbus.c  |  9 ++---
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c 
b/drivers/block/xen-blkback/blkback.c
index 954c002..802319a 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -961,7 +961,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request 
*req,
seg[n].nsec = segments[i].last_sect -
segments[i].first_sect + 1;
seg[n].offset = (segments[i].first_sect << 9);
-   if ((segments[i].last_sect >= (PAGE_SIZE >> 9)) ||
+   if ((segments[i].last_sect >= (XEN_PAGE_SIZE >> 9)) ||
(segments[i].last_sect < segments[i].first_sect)) {
rc = -EINVAL;
goto unmap;
@@ -1210,6 +1210,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 
req_operation = req->operation == BLKIF_OP_INDIRECT ?
req->u.indirect.indirect_op : req->operation;
+
if ((req->operation == BLKIF_OP_INDIRECT) &&
(req_operation != BLKIF_OP_READ) &&
(req_operation != BLKIF_OP_WRITE)) {
@@ -1268,7 +1269,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
seg[i].nsec = req->u.rw.seg[i].last_sect -
req->u.rw.seg[i].first_sect + 1;
seg[i].offset = (req->u.rw.seg[i].first_sect << 9);
-   if ((req->u.rw.seg[i].last_sect >= (PAGE_SIZE >> 9)) ||
+   if ((req->u.rw.seg[i].last_sect >= (XEN_PAGE_SIZE >> 
9)) ||
(req->u.rw.seg[i].last_sect <
 req->u.rw.seg[i].first_sect))
goto fail_response;
diff --git a/drivers/block/xen-blkback/common.h 
b/drivers/block/xen-blkback/common.h
index 45a044a..68e87a0 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -51,12 +52,20 @@ extern unsigned int xen_blkif_max_ring_order;
  */
 #define MAX_INDIRECT_SEGMENTS 256
 
-#define SEGS_PER_INDIRECT_FRAME \
-   (PAGE_SIZE/sizeof(struct blkif_request_segment))
+/*
+ * Xen use 4K pages. The guest may use different page size (4K or 64K)
+ * Number of Xen pages per segment
+ */
+#define XEN_PAGES_PER_SEGMENT   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define XEN_PAGES_PER_INDIRECT_FRAME \
+   (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
+#define SEGS_PER_INDIRECT_FRAME\
+   (XEN_PAGES_PER_INDIRECT_FRAME / XEN_PAGES_PER_SEGMENT)
+
 #define MAX_INDIRECT_PAGES \
((MAX_INDIRECT_SEGMENTS + SEGS_PER_INDIRECT_FRAME - 
1)/SEGS_PER_INDIRECT_FRAME)
-#define INDIRECT_PAGES(_segs) \
-   ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+#define INDIRECT_PAGES(_segs) DIV_ROUND_UP(_segs, XEN_PAGES_PER_INDIRECT_FRAME)
 
 /* Not a real protocol.  Used to generate ring structs which contain
  * the elements common to all protocols only.  This way we get a
diff --git a/drivers/block/xen-blkback/xenbus.c 
b/drivers/block/xen-blkback/xenbus.c
index deb3f00..edd27e4 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -176,21 +176,24 @@ static int xen_blkif_map(struct xen_blkif *blkif, 
grant_ref_t *gref,
{
struct blkif_sring *sring;
sring = (struct blkif_sring *)blkif->blk_ring;
-   BAC

[PATCH v4 12/20] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux

2015-09-07 Thread Julien Grall
For ARM64 guests, Linux is able to support either 64K or 4K page
granularity. However, the hypercall interface is always based on 4K
page granularity.

With 64K page granularity, a single page will be spread over multiple
Xen frames.

To avoid splitting the page into 4K frames, take advantage of the
extent_order field to directly allocate/free chunks of the Linux page
size.
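
The extent order used for this can be derived from the ratio between the
two page sizes. The sketch below (not part of the patch) reimplements
fls() in plain C for illustration; fls_sketch() and extent_order() are
hypothetical names, and the Linux page size is passed as a parameter
rather than taken from PAGE_SIZE.

```c
#include <assert.h>

#define XEN_PAGE_SIZE 4096u

/* Position of the most significant set bit, 1-based; 0 for x == 0. */
static inline unsigned int fls_sketch(unsigned int x)
{
	unsigned int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/*
 * Order of the extent (in 4KB Xen frames) that covers exactly one Linux
 * page. Assumes linux_page_size is a power-of-two multiple of 4KB.
 */
static inline unsigned int extent_order(unsigned int linux_page_size)
{
	assert(linux_page_size >= XEN_PAGE_SIZE);
	return fls_sketch(linux_page_size / XEN_PAGE_SIZE) - 1;
}
```

With 4KB Linux pages the order is 0, as before. With 64KB pages it is 4,
so each extent spans 16 Xen frames, which is one Linux page.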

Note that PVMMU is only used for PV guests (which are x86 only) and the
page granularity is always 4KB. Some BUILD_BUG_ONs have been added to
ensure this, because that code has not been modified.

Signed-off-by: Julien Grall 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Wei Liu 

Note that two BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE) in code built
for the PV MMU are kept in order to have at least one left even if we
ever decide to drop one of the code sections.

Changes in v4:
- s/xen_page_to_pfn/page_to_xen_pfn/ based on the new naming
- Use the field lru in the page to get a list of pages when
decreasing the memory reservation. This avoids using a static
array to store the pages (see v3).
- Update comment for EXTENT_ORDER.

Changes in v3:
- Fix errors reported by checkpatch.pl
- s/mfn/gfn/ based on the new naming
- Rather than splitting the page into 4KB chunks, use the
extent_order field to directly allocate a Linux page size. This
avoids lots of code for no benefit.

Changes in v2:
- Use xen_apply_to_page to split a page in 4K chunk
- It's not necessary to have a smaller frame list. Re-use
PAGE_SIZE
- Convert reserve_additional_memory to use XEN_... macro
---
 drivers/xen/balloon.c | 59 ++-
 1 file changed, 44 insertions(+), 15 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index c79329f..3babf13 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -70,6 +70,11 @@
 #include 
 #include 
 
+/* Use one extent per PAGE_SIZE to avoid to break down the page into
+ * multiple frame.
+ */
+#define EXTENT_ORDER (fls(XEN_PFN_PER_PAGE) - 1)
+
 /*
  * balloon_process() state:
  *
@@ -230,6 +235,11 @@ static enum bp_state reserve_additional_memory(long credit)
nid = memory_add_physaddr_to_nid(hotplug_start_paddr);
 
 #ifdef CONFIG_XEN_HAVE_PVMMU
+   /* We don't support PV MMU when Linux and Xen is using
+* different page granularity.
+*/
+   BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
+
 /*
  * add_memory() will build page tables for the new memory so
  * the p2m must contain invalid entries so the correct
@@ -326,11 +336,11 @@ static enum bp_state reserve_additional_memory(long 
credit)
 static enum bp_state increase_reservation(unsigned long nr_pages)
 {
int rc;
-   unsigned long  pfn, i;
+   unsigned long i;
struct page   *page;
struct xen_memory_reservation reservation = {
.address_bits = 0,
-   .extent_order = 0,
+   .extent_order = EXTENT_ORDER,
.domid= DOMID_SELF
};
 
@@ -352,7 +362,11 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
nr_pages = i;
break;
}
-   frame_list[i] = page_to_pfn(page);
+
+   /* XENMEM_populate_physmap requires a PFN based on Xen
+* granularity.
+*/
+   frame_list[i] = page_to_xen_pfn(page);
page = balloon_next_page(page);
}
 
@@ -366,10 +380,15 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
page = balloon_retrieve(false);
BUG_ON(page == NULL);
 
-   pfn = page_to_pfn(page);
-
 #ifdef CONFIG_XEN_HAVE_PVMMU
+   /* We don't support PV MMU when Linux and Xen is using
+* different page granularity.
+*/
+   BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
+
if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+   unsigned long pfn = page_to_pfn(page);
+
set_phys_to_machine(pfn, frame_list[i]);
 
/* Link back into the page tables if not highmem. */
@@ -396,14 +415,15 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
 static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 {
enum bp_state state = BP_DONE;
-   unsigned long  pfn, i;
-   struct page   *page;
+   unsigned long i;
+   struct page *page, *tmp;
int ret;
struct xen_memory_reservation reservation = {
.address_bits = 0,
-   .extent_order = 0,
+   .extent_order = EXTENT_ORDER,
.domid= DOMID_SELF
};
+   LIST_HEAD(pages);
 
 #ifdef CONFIG_XEN_BALLOON

[PATCH v4 18/20] net/xen-netback: Make it running on 64KB page granularity

2015-09-07 Thread Julien Grall
The PV network protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity working as a
network backend on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data into small
chunks of 4KB. The rest of the code is relying on the grant table code.
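
The idea of breaking a buffer into 4KB chunks can be sketched as follows.
This is a simplified, standalone illustration and not the kernel helper
itself: for_each_xen_chunk(), chunk_fn, struct chunk_stats and
count_chunk() are hypothetical names, and the callback signature is an
assumption made for the example.

```c
#include <assert.h>

#define XEN_PAGE_SIZE 4096u

/* Called once per 4KB chunk; 'offset' is relative to that chunk's frame. */
typedef void (*chunk_fn)(unsigned long xen_pfn, unsigned int offset,
			 unsigned int len, void *data);

/*
 * Walk [offset, offset + len) in a buffer whose first Xen frame is
 * first_xen_pfn and invoke fn for every 4KB frame the range touches.
 */
static inline void for_each_xen_chunk(unsigned long first_xen_pfn,
				      unsigned int offset, unsigned int len,
				      chunk_fn fn, void *data)
{
	unsigned long pfn = first_xen_pfn + offset / XEN_PAGE_SIZE;
	unsigned int off = offset % XEN_PAGE_SIZE;

	while (len) {
		unsigned int chunk = XEN_PAGE_SIZE - off;

		if (chunk > len)
			chunk = len;
		fn(pfn, off, chunk, data);
		pfn++;
		off = 0;	/* only the first chunk can start mid-frame */
		len -= chunk;
	}
}

/* Example callback state: count chunks and bytes seen. */
struct chunk_stats {
	unsigned int count;
	unsigned int bytes;
	unsigned int last_offset;
};

static inline void count_chunk(unsigned long xen_pfn, unsigned int offset,
			       unsigned int len, void *data)
{
	struct chunk_stats *s = data;

	(void)xen_pfn;
	s->count++;
	s->bytes += len;
	s->last_offset = offset;
}
```

For a range starting at offset 1000 with a length of 10000 bytes, the
callback runs three times (3096, 4096 and 2808 bytes), and every chunk
after the first one starts at offset 0 within its frame.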

Signed-off-by: Julien Grall 

---
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: net...@vger.kernel.org

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on a non-modified Xen.

Note that I haven't added a comment explaining why the offset is 0 after
the first iteration. See [1] for more details.

[1] https://lkml.org/lkml/2015/8/10/456

Changes in v4:
- Add a comment to explain how we compute MAX_XEN_SKB_FRAGS

Changes in v3:
- Fix errors reported by checkpatch.pl
- s/mfn/gfn/ based on the new naming
- gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range
- The grant callback no longer allows using less data. A
helper has been added in netback to handle this.

Changes in v2:
- Correctly set MAX_GRANT_COPY_OPS and XEN_NETBK_RX_SLOTS_MAX
- Don't use XEN_PAGE_SIZE in handle_frag_list as we coalesce
fragment into a new skb
- Use gnttab_foreach_grant to split a Linux page into grants
---
 drivers/net/xen-netback/common.h  |  18 +++--
 drivers/net/xen-netback/netback.c | 153 --
 2 files changed, 110 insertions(+), 61 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 8a495b3..24cb365 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 typedef unsigned int pending_ring_idx_t;
@@ -64,8 +65,8 @@ struct pending_tx_info {
struct ubuf_info callback_struct;
 };
 
-#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 struct xenvif_rx_meta {
int id;
@@ -80,16 +81,21 @@ struct xenvif_rx_meta {
 /* Discriminate from any valid pending_idx value. */
 #define INVALID_PENDING_IDX 0x
 
-#define MAX_BUFFER_OFFSET PAGE_SIZE
+#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
 
 #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
 
+/* The maximum number of frags is derived from the size of a grant (same
+ * as a Xen page size for now).
+ */
+#define MAX_XEN_SKB_FRAGS (65536 / XEN_PAGE_SIZE + 1)
+
 /* It's possible for an skb to have a maximal number of frags
  * but still be less than MAX_BUFFER_OFFSET in size. Thus the
- * worst-case number of copy operations is MAX_SKB_FRAGS per
+ * worst-case number of copy operations is MAX_XEN_SKB_FRAGS per
  * ring slot.
  */
-#define MAX_GRANT_COPY_OPS (MAX_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)
+#define MAX_GRANT_COPY_OPS (MAX_XEN_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)
 
 #define NETBACK_INVALID_HANDLE -1
 
@@ -203,7 +209,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 /* Maximum number of Rx slots a to-guest packet may use, including the
  * slot needed for GSO meta-data.
  */
-#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
+#define XEN_NETBK_RX_SLOTS_MAX ((MAX_XEN_SKB_FRAGS + 1))
 
 enum state_bit_shift {
/* This bit marks that the vif is connected */
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index d4c1bc7..b1649aa 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -263,6 +263,80 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct 
xenvif_queue *queue,
return meta;
 }
 
+struct gop_frag_copy {
+   struct xenvif_queue *queue;
+   struct netrx_pending_operations *npo;
+   struct xenvif_rx_meta *meta;
+   int head;
+   int gso_type;
+
+   struct page *page;
+};
+
+static void xenvif_setup_copy_gop(unsigned long gfn,
+ unsigned int offset,
+ unsigned int *len,
+ struct gop_frag_copy *info)
+{
+   struct gnttab_copy *copy_gop;
+   struct xen_page_foreign *foreign;
+   /* Convenient aliases */
+   struct xenvif_queue *queue = info->queue;
+   struct netrx_pending_operations *npo = info->npo;
+   struct page *page = info->page;
+
+   BUG_ON(npo->copy_off > MAX_BUFFER_OFFSET);
+
+   if (npo->copy_off == MAX_BUFFER_OFFSET)
+   info->meta = get_next_rx_buffer(queue, npo);
+
+   if (npo->copy_off + *len > MAX_BUFFER_OFFSET)
+   *len = MAX_BUFFER_OFFSET - npo->copy_off;
+
+   copy_gop 

[PATCH v4 17/20] net/xen-netfront: Make it running on 64KB page granularity

2015-09-07 Thread Julien Grall
The PV network protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to use a network
device on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data into small
chunks of 4KB. The rest of the code is relying on the grant table code.
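
To see why the ring size has to be derived from the Xen page size, the
sketch below (not part of the patch) mimics the way a ring size macro
rounds the number of entries down to a power of two. ring_entries() and
round_down_pow2() are hypothetical helpers, and the 64-byte header and
12-byte entry used in the usage note are assumed sizes for illustration,
not values taken from the headers.

```c
#include <assert.h>

/* Largest power of two that is <= x; returns 0 for x == 0. */
static inline unsigned int round_down_pow2(unsigned int x)
{
	unsigned int p = 1;

	if (x == 0)
		return 0;
	while (p <= x / 2)
		p *= 2;
	return p;
}

/*
 * Number of ring entries for a shared ring of ring_bytes, given the size
 * of the ring header and of one request/response slot.
 */
static inline unsigned int ring_entries(unsigned int ring_bytes,
					unsigned int header_bytes,
					unsigned int entry_bytes)
{
	assert(entry_bytes > 0 && ring_bytes > header_bytes);
	return round_down_pow2((ring_bytes - header_bytes) / entry_bytes);
}
```

With the assumed sizes, ring_entries(4096, 64, 12) yields 256 entries
while ring_entries(65536, 64, 12) yields 4096. A frontend computing the
ring size from a 64KB PAGE_SIZE would therefore disagree with a backend
that expects a ring living in a single 4KB grant.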

Note that we allocate a Linux page for each rx skb but only the first
4KB is used. We may improve the memory usage by extending the size of
the rx skb.

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: net...@vger.kernel.org

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a Linux
using 64KB pages on a non-modified Xen.

Tested with workloads such as ping, ssh, wget, git... I would be happy if
someone could give details on how to test all the paths.

Changes in v4:
- s/gnttab_one_grant/gnttab_for_one_grant/ based on the new naming
- Add David's reviewed-by

Changes in v3:
- Fix errors reported by checkpatch.pl
- s/mfn/gfn/ based on the new naming
- xennet_tx_setup_grant was calling itself, resulting in a
guest stall when using iperf.
- The grant callback no longer allows changing the len
(wasn't used here)
- gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range
- gnttab_page_grant_foreign_ref has been renamed to
gnttab_foreach_grant_foreign_ref_one

Changes in v2:
- Use gnttab_foreach_grant to split a Linux page in grant
- Fix count slots
---
 drivers/net/xen-netfront.c | 122 -
 1 file changed, 86 insertions(+), 36 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 47f791e..17b1013 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -74,8 +74,8 @@ struct netfront_cb {
 
 #define GRANT_INVALID_REF  0
 
-#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 /* Minimum number of Rx slots (includes slot for GSO metadata). */
 #define NET_RX_SLOTS_MIN (XEN_NETIF_NR_SLOTS_MIN + 1)
@@ -291,7 +291,7 @@ static void xennet_alloc_rx_buffers(struct netfront_queue 
*queue)
struct sk_buff *skb;
unsigned short id;
grant_ref_t ref;
-   unsigned long gfn;
+   struct page *page;
struct xen_netif_rx_request *req;
 
skb = xennet_alloc_one_rx_buffer(queue);
@@ -307,14 +307,13 @@ static void xennet_alloc_rx_buffers(struct netfront_queue 
*queue)
BUG_ON((signed short)ref < 0);
queue->grant_rx_ref[id] = ref;
 
-   gfn = 
xen_page_to_gfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+   page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
 
req = RING_GET_REQUEST(&queue->rx, req_prod);
-   gnttab_grant_foreign_access_ref(ref,
-   queue->info->xbdev->otherend_id,
-   gfn,
-   0);
-
+   gnttab_page_grant_foreign_access_ref_one(ref,
+
queue->info->xbdev->otherend_id,
+page,
+0);
req->id = id;
req->gref = ref;
}
@@ -415,25 +414,33 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
xennet_maybe_wake_tx(queue);
 }
 
-static struct xen_netif_tx_request *xennet_make_one_txreq(
-   struct netfront_queue *queue, struct sk_buff *skb,
-   struct page *page, unsigned int offset, unsigned int len)
+struct xennet_gnttab_make_txreq {
+   struct netfront_queue *queue;
+   struct sk_buff *skb;
+   struct page *page;
+   struct xen_netif_tx_request *tx; /* Last request */
+   unsigned int size;
+};
+
+static void xennet_tx_setup_grant(unsigned long gfn, unsigned int offset,
+ unsigned int len, void *data)
 {
+   struct xennet_gnttab_make_txreq *info = data;
unsigned int id;
struct xen_netif_tx_request *tx;
grant_ref_t ref;
-
-   len = min_t(unsigned int, PAGE_SIZE - offset, len);
+   /* convenient aliases */
+   struct page *page = info->page;
+   struct netfront_queue *queue = info->queue;
+   struct sk_buff *skb = info->skb;
 
id = get_id_from_freelist(&queue->tx_skb_freelist, queue->

[PATCH v4 15/20] block/xen-blkfront: Make it running on 64KB page granularity

2015-09-07 Thread Julien Grall
The PV block protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to use a block
device on a non-modified Xen.

The block API is using segments, which should be at least the size of a
Linux page. Therefore, the driver will have to break the page into chunks
of 4K before giving the page to the backend.

When breaking a 64KB segment into 4KB chunks, it is possible that some
chunks are empty. As the PV protocol always requires data in the
chunk, we have to count the number of Xen pages which will be in use and
avoid sending empty chunks.

Note that a pre-defined number of grants are reserved before preparing
the request. This pre-defined number is based on the number and the
maximum size of the segments. If each segment contains a very small
amount of data, the driver may reserve too many grants (16 grants are
reserved per segment with 64KB page granularity).
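
The difference between the grants actually needed and the grants reserved
up front can be sketched like this (standalone illustration, not part of
the patch). grants_in_range() and grants_reserved() are hypothetical
names; the first one counts only the 4KB frames that contain data, the
second one models the worst-case reservation described above.

```c
#include <assert.h>

#define XEN_PAGE_SIZE 4096u

/*
 * Number of 4KB grants touched by len bytes starting at offset.
 * Frames without data are not counted; an empty range needs no grant.
 */
static inline unsigned int grants_in_range(unsigned int offset,
					   unsigned int len)
{
	if (len == 0)
		return 0;
	return (offset % XEN_PAGE_SIZE + len + XEN_PAGE_SIZE - 1) /
	       XEN_PAGE_SIZE;
}

/* Worst case: every segment is assumed to fill a whole Linux page. */
static inline unsigned int grants_reserved(unsigned int nr_segments,
					   unsigned int linux_page_size)
{
	return nr_segments * (linux_page_size / XEN_PAGE_SIZE);
}
```

For example, 11 segments of 512 bytes each need 11 grants in total, but
with 64KB page granularity grants_reserved(11, 65536) sets aside 176.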

Furthermore, in the case of persistent grants we allocate one Linux page
per grant although only the first 4KB of the page will be effectively
in use. This could be improved by sharing the page with multiple grants.

Signed-off-by: Julien Grall 
Acked-by: Roger Pau Monné 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a Linux
using 64KB pages on a non-modified Xen.

Changes in v4:
- Rebase after d50babbe300eedf33ea5b00a12c5df3a05bd96c7 "
xen-blkfront: introduce blkfront_gather_backend_features()"
- Fix typos
- Add Roger's acked-by

Changes in v3:
- Use DIV_ROUND_UP in INDIRECT_GREFS
- Split lines over 80 characters whenever it's possible
- s/mfn/gfn/ based on the new naming
- The grant callback doesn't allow anymore to change the len
(wasn't used here).
- gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range
- Use gnttab_count_grant to get the number of grants in a sg
- Do some renaming to use the correct variable every time

Changes in v2:
- Use gnttab_foreach_grant to split a Linux page into grant
---
 drivers/block/xen-blkfront.c | 324 ---
 1 file changed, 213 insertions(+), 111 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 4232cbd..f2cdc73 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -78,6 +78,7 @@ struct blk_shadow {
struct grant **grants_used;
struct grant **indirect_grants;
struct scatterlist *sg;
+   unsigned int num_sg;
 };
 
 struct split_bio {
@@ -107,8 +108,12 @@ static unsigned int xen_blkif_max_ring_order;
 module_param_named(max_ring_page_order, xen_blkif_max_ring_order, int, 
S_IRUGO);
 MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages to be used for 
the shared ring");
 
-#define BLK_RING_SIZE(info) __CONST_RING_SIZE(blkif, PAGE_SIZE * 
(info)->nr_ring_pages)
-#define BLK_MAX_RING_SIZE __CONST_RING_SIZE(blkif, PAGE_SIZE * 
XENBUS_MAX_RING_PAGES)
+#define BLK_RING_SIZE(info)\
+   __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * (info)->nr_ring_pages)
+
+#define BLK_MAX_RING_SIZE  \
+   __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * XENBUS_MAX_RING_PAGES)
+
 /*
  * ring-ref%i i=(-1UL) would take 11 characters + 'ring-ref' is 8, so 19
  * characters are enough. Define to 20 to keep consist with backend.
@@ -147,6 +152,7 @@ struct blkfront_info
unsigned int discard_granularity;
unsigned int discard_alignment;
unsigned int feature_persistent:1;
+   /* Number of 4KB segments handled */
unsigned int max_indirect_segments;
int is_ready;
struct blk_mq_tag_set tag_set;
@@ -175,10 +181,23 @@ static DEFINE_SPINLOCK(minor_lock);
 
 #define DEV_NAME   "xvd"   /* name in /dev */
 
-#define SEGS_PER_INDIRECT_FRAME \
-   (PAGE_SIZE/sizeof(struct blkif_request_segment))
-#define INDIRECT_GREFS(_segs) \
-   ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+/*
+ * Grants are always the same size as a Xen page (i.e 4KB).
+ * A physical segment is always the same size as a Linux page.
+ * Number of grants per physical segment
+ */
+#define GRANTS_PER_PSEG (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define GRANTS_PER_INDIRECT_FRAME \
+   (XEN_PAGE_SIZE / sizeof(struct blkif_request_segment))
+
+#define PSEGS_PER_INDIRECT_FRAME   \
+   (GRANTS_PER_INDIRECT_FRAME / GRANTS_PER_PSEG)
+
+#define INDIRECT_GREFS(_grants)\
+   DIV_ROUND_UP(_grants, GRANTS_PER_INDIRECT_FRAME)
+
+#define GREFS(_psegs)  ((_psegs) * GRANTS_PER_PSEG)
 
 static int blkfront_setup_indirect(struct blkfront_info *info);
 static int blkfront_gather_backend_features(struct blkfront_info *info);
@@ -466,14 +485,100 @@ static int blkif_que

[PATCH v4 20/20] arm/xen: Add support for 64KB page granularity

2015-09-07 Thread Julien Grall
The hypercall interface is always using 4KB page granularity. This
requires using the Xen page definition macros whenever we deal with
hypercalls.

Note that pfn_to_gfn is working with a Xen pfn (i.e 4KB). We may want to
rename pfn_to_gfn to make this explicit.

We also allocate a 64KB page for the shared page even though only the
first 4KB is used. I don't think this is really important for now as it
helps to have the pointer 4KB aligned (XENMEM_add_to_physmap is taking a
Xen PFN).
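
The difference between a Linux frame number and a Xen frame number for
the same address can be shown with a few lines of plain C (illustration
only, not part of the patch). The helpers below are hypothetical and
operate on a raw physical address instead of a kernel pointer.

```c
#include <assert.h>
#include <stdint.h>

#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (UINT64_C(1) << XEN_PAGE_SHIFT)

/* Frame number in Xen's 4KB granularity, as expected by hypercalls. */
static inline uint64_t phys_to_xen_pfn(uint64_t phys)
{
	return phys >> XEN_PAGE_SHIFT;
}

/* Offset inside the 4KB Xen frame. */
static inline uint64_t phys_xen_offset(uint64_t phys)
{
	return phys & (XEN_PAGE_SIZE - 1);
}

/* Frame number in the guest's own granularity (e.g. shift 16 for 64KB). */
static inline uint64_t phys_to_linux_pfn(uint64_t phys, unsigned int page_shift)
{
	return phys >> page_shift;
}
```

For the address 0x40013450 and 64KB Linux pages, the Linux frame number
is 0x4001 whereas the Xen frame number is 0x40013 with an offset of 0x450.
Passing the former to a hypercall would point Xen at the wrong frame.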

Signed-off-by: Julien Grall 
Reviewed-by: Stefano Stabellini 

---
Cc: Russell King 

Stefano, I've dropped your reviewed-by given that I've updated the doc and
made changes to avoid the usage of XEN_PAGE_SHIFT

Changes in v4:
- Add Stefano's Reviewed-by

Changes in v3:
- s/MFN/GFN/ based on the new naming
- Use virt_to_gfn to avoid using XEN_PAGE_SHIFT
- Drop Stefano's reviewed-by
- Add some docs in arch/arm/asm/xen/page.h

Changes in v2
- Add Stefano's reviewed-by
---
 arch/arm/include/asm/xen/page.h | 15 +--
 arch/arm/xen/enlighten.c|  6 +++---
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 98c9fc3..e3d94cf 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -28,6 +28,17 @@ typedef struct xpaddr {
 
 #define INVALID_P2M_ENTRY  (~0UL)
 
+/*
+ * The pseudo-physical frame (pfn) used in all the helpers is always based
+ * on Xen page granularity (i.e 4KB).
+ *
+ * A Linux page may be split across multiple non-contiguous Xen page so we
+ * have to keep track with frame based on 4KB page granularity.
+ *
+ * PV drivers should never make a direct usage of those helpers (particularly
+ * pfn_to_gfn and gfn_to_pfn).
+ */
+
 unsigned long __pfn_to_mfn(unsigned long pfn);
 extern struct rb_root phys_to_mach;
 
@@ -64,8 +75,8 @@ static inline unsigned long bfn_to_pfn(unsigned long bfn)
 #define bfn_to_local_pfn(bfn)  bfn_to_pfn(bfn)
 
 /* VIRT <-> GUEST conversion */
-#define virt_to_gfn(v) (pfn_to_gfn(virt_to_pfn(v)))
-#define gfn_to_virt(m) (__va(gfn_to_pfn(m) << PAGE_SHIFT))
+#define virt_to_gfn(v) (pfn_to_gfn(virt_to_phys(v) >> XEN_PAGE_SHIFT))
+#define gfn_to_virt(m) (__va(gfn_to_pfn(m) << XEN_PAGE_SHIFT))
 
 /* Only used in PV code. But ARM guests are always HVM. */
 static inline xmaddr_t arbitrary_virt_to_machine(void *vaddr)
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index eeeab07..50b4769 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -89,8 +89,8 @@ static void xen_percpu_init(void)
pr_info("Xen: initializing cpu%d\n", cpu);
vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
 
-   info.mfn = __pa(vcpup) >> PAGE_SHIFT;
-   info.offset = offset_in_page(vcpup);
+   info.mfn = virt_to_gfn(vcpup);
+   info.offset = xen_offset_in_page(vcpup);
 
err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
BUG_ON(err);
@@ -213,7 +213,7 @@ static int __init xen_guest_init(void)
xatp.domid = DOMID_SELF;
xatp.idx = 0;
xatp.space = XENMAPSPACE_shared_info;
-   xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
+   xatp.gpfn = virt_to_gfn(shared_info_page);
if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
BUG();
 
-- 
2.1.4



Re: [Xen-devel] [PATCH v4 18/20] net/xen-netback: Make it running on 64KB page granularity

2015-09-08 Thread Julien Grall
Hi Wei,

On 07/09/15 17:57, Wei Liu wrote:
> You might need to rebase your patch. A patch to netback went in recently.

Do you mean 210c34dcd8d912dcc740f1f17625a7293af5cb56 "xen-netback: add
support for multicast control"?

If so, I didn't see any specific issue while rebasing on the latest
Linus' master.

> On Mon, Sep 07, 2015 at 04:33:56PM +0100, Julien Grall wrote:
>> The PV network protocol is using 4KB page granularity. The goal of this
>> patch is to allow a Linux using 64KB page granularity working as a
>> network backend on a non-modified Xen.
>>
>> It's only necessary to adapt the ring size and break skb data in small
>> chunk of 4KB. The rest of the code is relying on the grant table code.
>>
>> Signed-off-by: Julien Grall 
>>
> 
> Reviewed-by: Wei Liu 

Thank you!

Regards,

-- 
Julien Grall


Re: [PATCH v4 3/4] irqchip: GIC: Convert to EOImode == 1

2015-09-09 Thread Julien Grall
> + .irq_eoi        = gic_eoimode1_eoi_irq,
> + .irq_set_type   = gic_set_type,
> +#ifdef CONFIG_SMP
> + .irq_set_affinity   = gic_set_affinity,
> +#endif
> + .irq_get_irqchip_state  = gic_irq_get_irqchip_state,
> + .irq_set_irqchip_state  = gic_irq_set_irqchip_state,
> + .flags  = IRQCHIP_SET_TYPE_MASKED,
> +};
> +
>  void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
>  {
>   if (gic_nr >= MAX_GIC_NR)
> @@ -359,6 +390,10 @@ static void gic_cpu_if_up(void)
>  {
>   void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
>   u32 bypass = 0;
> + u32 mode = 0;
> +
> + if (static_key_true(&supports_deactivate))
> + mode = GIC_CPU_CTRL_EOImodeNS;
>  
>   /*
>   * Preserve bypass disable bits to be written back later
> @@ -366,7 +401,7 @@ static void gic_cpu_if_up(void)
>   bypass = readl(cpu_base + GIC_CPU_CTRL);
>   bypass &= GICC_DIS_BYPASS_MASK;
>  
> - writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
> + writel_relaxed(bypass | mode | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
>  }
>  
>  
> @@ -789,13 +824,20 @@ void __init gic_init_physaddr(struct device_node *node)
>  static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
>   irq_hw_number_t hw)
>  {
> + struct irq_chip *chip = &gic_chip;
> +
> + if (static_key_true(&supports_deactivate)) {
> + if (d->host_data == (void *)&gic_data[0])
> + chip = &gic_eoimode1_chip;
> + }
> +
>   if (hw < 32) {
>   irq_set_percpu_devid(irq);
> - irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
> + irq_domain_set_info(d, irq, hw, chip, d->host_data,
>   handle_percpu_devid_irq, NULL, NULL);
>   set_irq_flags(irq, IRQF_VALID | IRQF_NOAUTOEN);
>   } else {
> - irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
> + irq_domain_set_info(d, irq, hw, chip, d->host_data,
>   handle_fasteoi_irq, NULL, NULL);
>   set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
>   }
> @@ -986,6 +1028,8 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
>   register_cpu_notifier(&gic_cpu_notifier);
>  #endif
>   set_handle_irq(gic_handle_irq);
> + if (static_key_true(&supports_deactivate))
> + pr_info("GIC: Using split EOI/Deactivate mode\n");
>   }
>  
>   gic_dist_init(gic);
> @@ -1001,6 +1045,7 @@ gic_of_init(struct device_node *node, struct device_node *parent)
>  {
>   void __iomem *cpu_base;
>   void __iomem *dist_base;
> + struct resource cpu_res;
>   u32 percpu_offset;
>   int irq;
>  
> @@ -1013,6 +1058,16 @@ gic_of_init(struct device_node *node, struct device_node *parent)
>   cpu_base = of_iomap(node, 1);
>   WARN(!cpu_base, "unable to map gic cpu registers\n");
>  
> + of_address_to_resource(node, 1, &cpu_res);
> +
> + /*
> +  * Disable split EOI/Deactivate if either HYP is not available
> +  * or the CPU interface is too small.
> +  */
> + if (gic_cnt == 0 && (!is_hyp_mode_available() ||
> +  resource_size(&cpu_res) < SZ_8K))
> + static_key_slow_dec(&supports_deactivate);
> +
>   if (of_property_read_u32(node, "cpu-offset", &percpu_offset))
>   percpu_offset = 0;
>  
> @@ -1132,6 +1187,14 @@ gic_v2_acpi_init(struct acpi_table_header *table)
>   }
>  
>   /*
> +  * Disable split EOI/Deactivate if HYP is not available. ACPI
> +  * guarantees that we'll always have a GICv2, so the CPU
> +  * interface will always be the right size.
> +  */
> + if (!is_hyp_mode_available())
> + static_key_slow_dec(&supports_deactivate);
> +
> + /*
>* Initialize zero GIC instance (no multi-GIC support). Also, set GIC
>* as default IRQ domain to allow for GSI registration and GSI to IRQ
>* number translation (see acpi_register_gsi() and acpi_gsi_to_irq()).
> diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
> index 9de976b..b1533c0 100644
> --- a/include/linux/irqchip/arm-gic.h
> +++ b/include/linux/irqchip/arm-gic.h
> @@ -20,9 +20,13 @@
>  #define GIC_CPU_ALIAS_BINPOINT   0x1c
>  #define GIC_CPU_ACTIVEPRIO   0xd0
>  #define GIC_CPU_IDENT0xfc
> +#define GIC_CPU_DEACTIVATE   0x1000
>  
>  #define GICC_ENABLE  0x1
>  #define GICC_INT_PRI_THRESHOLD   0xf0
> +
> +#define GIC_CPU_CTRL_EOImodeNS   (1 << 9)
> +
>  #define GICC_IAR_INT_ID_MASK 0x3ff
>  #define GICC_INT_SPURIOUS1023
>  #define GICC_DIS_BYPASS_MASK 0x1e0
> 


-- 
Julien Grall


Re: [PATCH] xen: check return value of xenbus_printf

2015-10-15 Thread Julien Grall
Hi Insu,

On 15/10/15 19:12, Insu Yun wrote:
> 
> 
> On Thu, Oct 15, 2015 at 12:40 PM, David Vrabel <david.vra...@citrix.com> wrote:
> 
> On 15/10/15 17:25, Insu Yun wrote:
> > Internally, xenbus_printf uses memory allocation, so it can fail under
> > memory pressure. Therefore, xenbus_printf's return value should be checked
> > and properly handled.
> [...]
> > --- a/drivers/input/misc/xen-kbdfront.c
> > +++ b/drivers/input/misc/xen-kbdfront.c
> > @@ -129,8 +129,11 @@ static int xenkbd_probe(struct xenbus_device *dev,
> >
> >   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-abs-pointer", "%d", &abs) < 0)
> >   abs = 0;
> > - if (abs)
> > - xenbus_printf(XBT_NIL, dev->nodename, "request-abs-pointer", "1");
> > + if (abs) {
> > + ret = xenbus_printf(XBT_NIL, dev->nodename, "request-abs-pointer", "1");
> > + if (ret)
> 
> > + pr_warning("xenkbd: can't request abs-pointer");
> 
> 
> This error handling is copied from other code.
> I am not sure that it is the right error handling.
>  
> 
> 
> I think you want abs = 0 here, or the input device will be configured as
> absolute but the backend will supply relative coordinates.
> 
> 
> I don't understand.

If the frontend is not able to write the "request-abs-pointer" node to
xenstore, the backend will always supply relative coordinates.

However, as abs = 1, the frontend will still be configured to handle
absolute coordinates. So the backend and frontend won't be able to
understand each other.

So you have to set abs to 0 if xenbus_printf fails.
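
The mismatch described above can be modelled in a few lines of plain C.
This is only a sketch: struct negotiation, negotiate() and the
reset_on_error flag are hypothetical names invented for the
illustration, and it assumes the backend switches to absolute
coordinates only when the "request-abs-pointer" node was written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what each side believes after the probe. */
struct negotiation {
	bool frontend_abs;	/* frontend configured for absolute coordinates */
	bool backend_abs;	/* backend will send absolute coordinates */
};

/*
 * feature_abs:    backend advertised "feature-abs-pointer"
 * write_ret:      return value of the xenstore write (0 on success, <0 on error)
 * reset_on_error: whether the frontend sets abs = 0 when the write fails
 */
static struct negotiation negotiate(bool feature_abs, int write_ret,
				    bool reset_on_error)
{
	struct negotiation n;
	int abs = feature_abs ? 1 : 0;

	assert(write_ret <= 0);

	n.backend_abs = false;
	if (abs) {
		if (write_ret) {
			/* Node not written: the backend stays in relative mode. */
			if (reset_on_error)
				abs = 0;
		} else {
			n.backend_abs = true;
		}
	}
	n.frontend_abs = (abs != 0);

	return n;
}
```

Without the reset, a failed write leaves frontend_abs true and
backend_abs false; with the reset both ends fall back to relative
coordinates.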

Regards,

-- 
Julien Grall


[PATCH] device property: Don't overwrite addr when failing in device_get_mac_address

2015-09-03 Thread Julien Grall
The function device_get_mac_address tries different property names
in order to get the MAC address. To check the return value, the variable
addr (which contains the buffer passed by the caller) is re-used. This
means that if the previous property is not found, the next property will
be read using a NULL buffer.

Therefore it's only possible to retrieve the MAC address if the node
contains a property "mac-address". Fix it by using a temporary variable
for the return value.

This has been introduced by commit 4c96b7dc0d393f12c17e0d81db15aa4a820a6ab3
"Add a matching set of device_ functions for determining mac/phy"

Signed-off-by: Julien Grall 
Cc: Jeremy Linton 
Cc: David S. Miller 

---
Cc: Greg Kroah-Hartman 
Cc: net...@vger.kernel.org
---
 drivers/base/property.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/base/property.c b/drivers/base/property.c
index ff03f23..2d75366 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -611,13 +611,15 @@ static void *device_get_mac_addr(struct device *dev,
 */
 void *device_get_mac_address(struct device *dev, char *addr, int alen)
 {
-   addr = device_get_mac_addr(dev, "mac-address", addr, alen);
-   if (addr)
-   return addr;
+   char *res;
 
-   addr = device_get_mac_addr(dev, "local-mac-address", addr, alen);
-   if (addr)
-   return addr;
+   res = device_get_mac_addr(dev, "mac-address", addr, alen);
+   if (res)
+   return res;
+
+   res = device_get_mac_addr(dev, "local-mac-address", addr, alen);
+   if (res)
+   return res;
 
return device_get_mac_addr(dev, "address", addr, alen);
 }
-- 
2.1.4
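
For readers who want to see the failure mode of the patch above outside
the kernel, here is a minimal user-space sketch. mock_get_mac_addr() is
a hypothetical stand-in for device_get_mac_addr(): it pretends that
only "local-mac-address" exists and, as a simplifying assumption, it
fails when handed a NULL buffer. The real helper's behaviour with a
NULL buffer may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: only "local-mac-address" is present. */
static void *mock_get_mac_addr(const char *name, char *addr, int alen)
{
	static const unsigned char mac[6] = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 };

	assert(name != NULL);

	if (strcmp(name, "local-mac-address") != 0)
		return NULL;		/* property not found */
	if (addr == NULL || alen != 6)
		return NULL;		/* no usable buffer to fill */

	memcpy(addr, mac, sizeof(mac));
	return addr;
}

/* Old pattern: the caller's buffer pointer is overwritten by the result. */
static void *get_mac_buggy(char *addr, int alen)
{
	addr = mock_get_mac_addr("mac-address", addr, alen);
	if (addr)
		return addr;

	/* addr is NULL at this point, so the buffer is lost. */
	addr = mock_get_mac_addr("local-mac-address", addr, alen);
	if (addr)
		return addr;

	return mock_get_mac_addr("address", addr, alen);
}

/* New pattern: a separate variable holds the result. */
static void *get_mac_fixed(char *addr, int alen)
{
	char *res;

	res = mock_get_mac_addr("mac-address", addr, alen);
	if (res)
		return res;

	res = mock_get_mac_addr("local-mac-address", addr, alen);
	if (res)
		return res;

	return mock_get_mac_addr("address", addr, alen);
}
```

With a 6-byte buffer, get_mac_buggy() returns NULL while
get_mac_fixed() returns the buffer filled from "local-mac-address".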



Re: [Xen-devel] [PATCH] xen-blkback: free requests on disconnection

2015-09-04 Thread Julien Grall
Hi Roger,

On 04/09/15 11:08, Roger Pau Monne wrote:
> Request allocation has been moved to connect_ring, which is called every
> time blkback connects to the frontend (this can happen multiple times during
> a blkback instance life cycle). On the other hand, request freeing has not
> been moved, so it's only called when destroying the backend instance. Due to
> this mismatch, blkback can allocate the request pool multiple times, without
> freeing it.
> 
> In order to fix it, move the freeing of requests to xen_blkif_disconnect to
> restore the symmetry between request allocation and freeing.
> 
> Reported-by: Julien Grall 
> Signed-off-by: Roger Pau Monné 
> Cc: Julien Grall 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Boris Ostrovsky 
> Cc: David Vrabel 
> Cc: xen-de...@lists.xenproject.org

The patch is fixing my problem when using UEFI in the guest. Thank you!

Tested-by: Julien Grall 

Regards,

-- 
Julien Grall


Re: [Xen-devel] [PATCH 2/2] block/xen-blkfront: Handle non-indirect grant with 64KB pages

2015-10-19 Thread Julien Grall
On 19/10/15 12:16, Roger Pau Monné wrote:
> At this point I don't think it's worth implementing it, if you feel like
> doing that later in order to improve performance that would be fine, but
> I don't think it should be required in order to get this merged.

I would rather avoid improving performance in the frontend and instead
encourage people to implement indirect descriptors in their backend.

It would be more beneficial for any block frontend (x86, arm, 4K pages,
64K pages...).

> I think you had to resend the patch anyway to fix some comments, but
> apart from that:

Note that I'm planning to update the commit message to summarize my
previous mail.

> Acked-by: Roger Pau Monné 

Thank you, I will try to resend a new version today.

Regards,

-- 
Julien Grall


[PATCH v2 0/2] block/xen-blkfront: Support non-indirect grant with 64KB page granularity

2015-10-19 Thread Julien Grall
Hi all,

This is a follow-up on the previous discussion [1] related to guests using
64KB page granularity, which don't boot when the backend isn't using indirect
descriptors.

This has been successfully tested on ARM64 with both 64KB and 4KB page
granularity guests and QEMU as the backend. Indeed, QEMU doesn't support
indirect descriptors.

This series is based on xentip/for-linus-4.4, which includes the support for
64KB Linux guests.

For the list of changes, see each patch.

Sincerely yours,

[1] http://lists.xen.org/archives/html/xen-devel/2015-08/msg01659.html

Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Julien Grall (2):
  block/xen-blkfront: Introduce blkif_ring_get_request
  block/xen-blkfront: Handle non-indirect grant with 64KB pages

 drivers/block/xen-blkfront.c | 230 ++-
 1 file changed, 203 insertions(+), 27 deletions(-)

-- 
2.1.4



[PATCH v2 1/2] block/xen-blkfront: Introduce blkif_ring_get_request

2015-10-19 Thread Julien Grall
The code to get a request is always the same. Therefore we can factor
it out into a single function.

Signed-off-by: Julien Grall 
Acked-by: Roger Pau Monné 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Changes in v2:
- Add Royger's acked-by
---
 drivers/block/xen-blkfront.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 3ea948c..5982768 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -456,6 +456,23 @@ static int blkif_ioctl(struct block_device *bdev, fmode_t mode,
return 0;
 }
 
+static unsigned long blkif_ring_get_request(struct blkfront_info *info,
+   struct request *req,
+   struct blkif_request **ring_req)
+{
+   unsigned long id;
+
+   *ring_req = RING_GET_REQUEST(&info->ring, info->ring.req_prod_pvt);
+   info->ring.req_prod_pvt++;
+
+   id = get_id_from_freelist(info);
+   info->shadow[id].request = req;
+
+   (*ring_req)->u.rw.id = id;
+
+   return id;
+}
+
 static int blkif_queue_discard_req(struct request *req)
 {
struct blkfront_info *info = req->rq_disk->private_data;
@@ -463,9 +480,7 @@ static int blkif_queue_discard_req(struct request *req)
unsigned long id;
 
/* Fill out a communications ring structure. */
-   ring_req = RING_GET_REQUEST(&info->ring, info->ring.req_prod_pvt);
-   id = get_id_from_freelist(info);
-   info->shadow[id].request = req;
+   id = blkif_ring_get_request(info, req, &ring_req);
 
ring_req->operation = BLKIF_OP_DISCARD;
ring_req->u.discard.nr_sectors = blk_rq_sectors(req);
@@ -476,8 +491,6 @@ static int blkif_queue_discard_req(struct request *req)
else
ring_req->u.discard.flag = 0;
 
-   info->ring.req_prod_pvt++;
-
/* Keep a private copy so we can reissue requests when recovering. */
info->shadow[id].req = *ring_req;
 
@@ -613,9 +626,7 @@ static int blkif_queue_rw_req(struct request *req)
new_persistent_gnts = 0;
 
/* Fill out a communications ring structure. */
-   ring_req = RING_GET_REQUEST(&info->ring, info->ring.req_prod_pvt);
-   id = get_id_from_freelist(info);
-   info->shadow[id].request = req;
+   id = blkif_ring_get_request(info, req, &ring_req);
 
BUG_ON(info->max_indirect_segments == 0 &&
   GREFS(req->nr_phys_segments) > BLKIF_MAX_SEGMENTS_PER_REQUEST);
@@ -628,7 +639,6 @@ static int blkif_queue_rw_req(struct request *req)
for_each_sg(info->shadow[id].sg, sg, num_sg, i)
   num_grant += gnttab_count_grant(sg->offset, sg->length);
 
-   ring_req->u.rw.id = id;
info->shadow[id].num_sg = num_sg;
if (num_grant > BLKIF_MAX_SEGMENTS_PER_REQUEST) {
/*
@@ -694,8 +704,6 @@ static int blkif_queue_rw_req(struct request *req)
if (setup.segments)
kunmap_atomic(setup.segments);
 
-   info->ring.req_prod_pvt++;
-
/* Keep a private copy so we can reissue requests when recovering. */
info->shadow[id].req = *ring_req;
 
-- 
2.1.4



[PATCH v2 2/2] block/xen-blkfront: Handle non-indirect grant with 64KB pages

2015-10-19 Thread Julien Grall
The minimal size of a request in the block framework is always PAGE_SIZE.
This means that when a 64KB guest is supported, a request will be at
least 64KB.

However, if the backend doesn't support indirect descriptors (such as QDISK
in QEMU), a ring request is only able to accommodate 11 segments of 4KB
(i.e. 44KB).

The current frontend assumes that an I/O request will always fit in
a ring request. This is no longer true when using 64KB page
granularity, and the guest will therefore crash during boot.

On ARM64, the ABI is completely neutral to the page granularity used by
the domU. The guest can choose between the different page granularities
supported by the processor (for instance on ARM64: 4KB, 16KB, 64KB).
This can't be enforced by the hypervisor, and therefore it's possible to
run guests using different page granularities.

So we can't mandate that the block backend support indirect descriptors
when the frontend is using 64KB page granularity, and we have to fix it
properly in the frontend.

The solution described below is based on directly modifying the frontend
rather than asking the block framework to support a smaller size
(i.e. < PAGE_SIZE). This is because the changes in the block framework are
not trivial, as everything seems to rely on a struct *page (see [1]).
However, someone may succeed in doing it in the future, and we would
then be able to use it.

Given that a block request may not fit in a single ring request, a
second request is introduced for the data that cannot fit in the first
one. This means that the second ring request should never be used on
Linux if the page size is smaller than 44KB.
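
The arithmetic behind the 44KB limit can be checked with a small
stand-alone sketch. The constants mirror the values quoted in this
message (4KB Xen grants, 11 segments per ring request); the helper
names are made up for the illustration and are not part of the driver.

```c
#include <assert.h>

#define XEN_PAGE_SIZE			4096u	/* grant granularity */
#define BLKIF_MAX_SEGMENTS_PER_REQUEST	11u	/* without indirect descriptors */

/* Largest payload a single ring request can carry, in bytes. */
static unsigned int max_bytes_per_ring_request(void)
{
	return BLKIF_MAX_SEGMENTS_PER_REQUEST * XEN_PAGE_SIZE;
}

/* Number of 4KB grants needed to cover one Linux page. */
static unsigned int grants_per_page(unsigned int linux_page_size)
{
	assert(linux_page_size % XEN_PAGE_SIZE == 0);
	return linux_page_size / XEN_PAGE_SIZE;
}

/* Ring requests needed for the smallest block request (one Linux page). */
static unsigned int ring_requests_needed(unsigned int linux_page_size)
{
	unsigned int grants = grants_per_page(linux_page_size);

	return (grants + BLKIF_MAX_SEGMENTS_PER_REQUEST - 1) /
	       BLKIF_MAX_SEGMENTS_PER_REQUEST;
}
```

For 4KB and 16KB pages a single ring request is enough (1 and 4
grants); a 64KB page needs 16 grants and therefore 2 ring requests.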

To support the extra ring request, the block queue size
is divided by two. Therefore, the ring will always contain enough space
to accommodate 2 ring requests. While this will reduce the overall
performance, it will make the implementation more contained. The way
forward to get better performance is to implement in the backend either
indirect descriptors or a multiple-grants ring.

Note that the parameters of the blk_queue_max_* helpers haven't been updated.
The block code will set the minimum size supported, and we may be able
to directly support any change in the block framework that lowers
the minimal size of a request.

[1] http://lists.xen.org/archives/html/xen-devel/2015-08/msg02200.html

Signed-off-by: Julien Grall 

---
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

Changes in v2:
- Update the commit message
- Typos
- Rename ring_req2/id2 to extra_ring_req/extra_id
---
 drivers/block/xen-blkfront.c | 200 +++
 1 file changed, 184 insertions(+), 16 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5982768..8ea2c97 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -60,6 +60,20 @@
 
 #include 
 
+/*
+ * The minimal size of segment supported by the block framework is PAGE_SIZE.
+ * When Linux is using a different page size than xen, it may not be possible
+ * to put all the data in a single segment.
+ * This can happen when the backend doesn't support indirect descriptor and
+ * therefore the maximum amount of data that a request can carry is
+ * BLKIF_MAX_SEGMENTS_PER_REQUEST * XEN_PAGE_SIZE = 44KB
+ *
+ * Note that we only support one extra request. So the Linux page size
+ * should be <= ( 2 * BLKIF_MAX_SEGMENTS_PER_REQUEST * XEN_PAGE_SIZE) =
+ * 88KB.
+ */
+#define HAS_EXTRA_REQ (BLKIF_MAX_SEGMENTS_PER_REQUEST < XEN_PFN_PER_PAGE)
+
 enum blkif_state {
BLKIF_STATE_DISCONNECTED,
BLKIF_STATE_CONNECTED,
@@ -79,6 +93,18 @@ struct blk_shadow {
struct grant **indirect_grants;
struct scatterlist *sg;
unsigned int num_sg;
+   enum {
+   REQ_WAITING,
+   REQ_DONE,
+   REQ_FAIL
+   } status;
+
+   #define NO_ASSOCIATED_ID ~0UL
+   /*
+* Id of the sibling if we ever need 2 requests when handling a
+* block I/O request
+*/
+   unsigned long associated_id;
 };
 
 struct split_bio {
@@ -467,6 +493,8 @@ static unsigned long blkif_ring_get_request(struct blkfront_info *info,
 
id = get_id_from_freelist(info);
info->shadow[id].request = req;
+   info->shadow[id].status = REQ_WAITING;
+   info->shadow[id].associated_id = NO_ASSOCIATED_ID;
 
(*ring_req)->u.rw.id = id;
 
@@ -508,6 +536,9 @@ struct setup_rw_req {
bool need_copy;
unsigned int bvec_off;
char *bvec_data;
+
+   bool require_extra_req;
+   struct blkif_request *extra_ring_req;
 };
 
 static void blkif_setup_rw_req_grant(unsigned long gfn, unsigned int offset,
@@ -521,8 +552,24 @@ static void blkif_setup_rw_req_grant(unsigned long gfn, unsigned int offset,
unsigned int grant_idx = setup->gra

Re: [PATCH] xen: check return value of xenbus_printf

2015-10-19 Thread Julien Grall
Hi,

On 19/10/15 15:10, Insu Yun wrote:
> Internally, xenbus_printf uses memory allocation, so it can fail under
> memory pressure. Therefore, xenbus_printf's return value should be checked
> and properly handled.
> 
> Signed-off-by: Insu Yun 
> ---
>  drivers/input/misc/xen-kbdfront.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/input/misc/xen-kbdfront.c b/drivers/input/misc/xen-kbdfront.c
> index 23d0549..9d465d7 100644
> --- a/drivers/input/misc/xen-kbdfront.c
> +++ b/drivers/input/misc/xen-kbdfront.c
> @@ -129,8 +129,14 @@ static int xenkbd_probe(struct xenbus_device *dev,
>  
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-abs-pointer", "%d", &abs) < 0)
>   abs = 0;
> - if (abs)
> - xenbus_printf(XBT_NIL, dev->nodename, "request-abs-pointer", "1");
> + if (abs) {
> + ret = xenbus_printf(XBT_NIL, dev->nodename,
> + "request-abs-pointer", "1");

The second line of arguments should be aligned with the first argument, i.e.:

xenbus_printf(XBT_NIL, dev->nodename,
  "request-abs-pointer", "1");

See an example in xenkbd_backend_changed.

With that fixed:

Reviewed-by: Julien Grall 

> + if (ret) {
> + pr_warning("xenkbd: can't request abs-pointer");

Note that checkpatch.pl will print a warning here:

WARNING: Prefer pr_warn(... to pr_warning(...
#27: FILE: drivers/input/misc/xen-kbdfront.c:136:
+   pr_warning("xenkbd: can't request abs-pointer");


That said, I'm fine if you don't fix this one.

> + abs = 0;
> + }
> + }
>  
>   /* keyboard */
>   kbd = input_allocate_device();
> 

Regards,

-- 
Julien Grall


[PATCH] mm: hotplug: Don't release twice the resource on error

2015-10-23 Thread Julien Grall
The function add_memory_resource takes as a parameter a resource allocated
by the caller. On error, both add_memory_resource and the caller will
release the resource via release_memory_resource.

This will result in Linux crashing when the caller tries to release
the resource:

CPU: 1 PID: 45 Comm: xenwatch Not tainted 4.3.0-rc6-00043-g5e1d6ca-dirty #170
Hardware name: XENVM-4.7 (DT)
task: ffc1fb2421c0 ti: ffc1fb27 task.ti:
ffc1fb27
PC is at __release_resource+0x28/0x8c
LR is at __release_resource+0x24/0x8c

[...]

Call trace:
[] __release_resource+0x28/0x8c
[] release_resource+0x24/0x44
[] reserve_additional_memory+0x114/0x128
[] alloc_xenballooned_pages+0x98/0x16c
[] blkfront_gather_backend_features+0x14c/0xd68
[] blkback_changed+0x678/0x150c
[] xenbus_otherend_changed+0x9c/0xa4
[] backend_changed+0xc/0x18
[] xenwatch_thread+0xa0/0x13c
[] kthread+0xdc/0xf4

As the caller allocates the resource, let it handle the release.
This was introduced by commit b75351f "mm: memory hotplug with
an existing resource".

Signed-off-by: Julien Grall 

---
Cc: David Vrabel 
Cc: linux...@kvack.org

The patch that introduced this issue is in xentip/for-linus-4.4, so
this patch is a good candidate for Linux 4.4.
---
 mm/memory_hotplug.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 5f394e7..0780d11 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1298,7 +1298,6 @@ error:
/* rollback pgdat allocation and others */
if (new_pgdat)
rollback_node_hotadd(nid, pgdat);
-   release_memory_resource(res);
memblock_remove(start, size);
 
 out:
-- 
2.1.4
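
The ownership rule the patch above restores (whoever allocates the
resource releases it) can be illustrated with a small model. All names
below (struct resource_model, add_memory_old, add_memory_new,
caller_releases) are hypothetical; the release counter stands in for
the real release call so that the double release is observable instead
of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource; counts how often it is released. */
struct resource_model {
	int released;
};

static void release_model(struct resource_model *res)
{
	assert(res != NULL);
	res->released++;
}

/* Old behaviour: the callee releases the caller's resource on error. */
static int add_memory_old(struct resource_model *res, int fail)
{
	if (fail) {
		release_model(res);
		return -1;
	}
	return 0;
}

/* New behaviour: the callee leaves the resource alone on error. */
static int add_memory_new(struct resource_model *res, int fail)
{
	(void)res;
	return fail ? -1 : 0;
}

/* The caller owns the resource and releases it when the callee fails. */
static int caller_releases(int (*add)(struct resource_model *, int), int fail)
{
	struct resource_model res = { 0 };

	if (add(&res, fail) < 0)
		release_model(&res);

	return res.released;	/* number of times the resource was released */
}
```

On the error path the old callee leads to two releases, the new one to
exactly one.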



Re: [Xen-devel] [PATCH 0/2] block/xen-blkfront: Support non-indirect with 64KB page granularity

2015-10-02 Thread Julien Grall
Hi,

Ping, any comment on this series?

Regards,

On 11/09/15 20:31, Julien Grall wrote:
> Hi all,
> 
> This is a follow-up on the previous discussion [1] related to guest using 64KB
> page granularity not booting with backend using non-indirect grant.
> 
> This has been successfully tested on ARM64 with both 64KB and 4KB page
> granularity guests and QEMU as the backend. Indeed, QEMU does not support
> indirect descriptors.
> 
> For a summary of the previous discussion see patch #2.
> 
> This series is based on top of my 64KB page granularity support [2].
> 
> Comments are welcomed.
> 
> Sincerely yours,
> 
> [1] http://lists.xen.org/archives/html/xen-devel/2015-08/msg01659.html
> [2] https://lwn.net/Articles/656797/
> 
> Cc: Konrad Rzeszutek Wilk 
> Cc: "Roger Pau Monné" 
> Cc: Boris Ostrovsky 
> Cc: David Vrabel 
> 
> Julien Grall (2):
>   block/xen-blkfront: Introduce blkif_ring_get_request
>   block/xen-blkfront: Handle non-indirect grant with 64KB pages
> 
>  drivers/block/xen-blkfront.c | 229 
> ++++++-
>  1 file changed, 202 insertions(+), 27 deletions(-)
> 


-- 
Julien Grall


Re: [PATCH v5 12/22] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux

2015-10-02 Thread Julien Grall
Hi David,

On 02/10/15 15:09, David Vrabel wrote:
> On 30/09/15 11:45, Julien Grall wrote:
>> For ARM64 guests, Linux is able to support either 64K or 4K page
>> granularity. However, the hypercall interface is always based on 4K
>> page granularity.
>>
>> With 64K page granularity, a single page will be spread over multiple
>> Xen frames.
>>
>> To avoid splitting the page into 4K frames, take advantage of the
>> extent_order field to directly allocate/free chunks of the Linux page
>> size.
>>
>> Note that PVMMU is only used for PV guest (which is x86) and the page
>> granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
>> that because the code has not been modified.
> 
> This causes a BUG() in x86 PV guests when decreasing the reservation.
> 
> Xen says:
> 
> (XEN) d0v2 Error pfn 0: rd=0 od=32753 caf=8001
> taf=7401
> (XEN) memory.c:250:d0v2 Bad page free for domain 0
> 
> And Linux BUGs with:
> 
> [   82.032654] kernel BUG at
> /anfs/drall/scratch/davidvr/x86/linux/drivers/xen/balloon.c:540!
> 
> Which is a non-zero return value from the decrease_reservation hypercall.
> 
> The frame_list[] has been incorrectly populated.  The below patch fixes
> it for me.  Please test as well.

Sorry for the breakage. I think I didn't spot the bug on my board
because most of the PV drivers allocate one balloon page at a time
by default.

This patch looks valid to me. The variable i was reset and incremented for
each loop in an early version, but I dropped that by mistake when I switched
to a different way of decreasing the reservation.

Regards,

-- 
Julien Grall


Re: [PATCH v5 12/22] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux

2015-10-02 Thread Julien Grall
On 02/10/15 15:31, Julien Grall wrote:
> Hi David,
> 
> On 02/10/15 15:09, David Vrabel wrote:
>> On 30/09/15 11:45, Julien Grall wrote:
>>> For ARM64 guests, Linux is able to support either 64K or 4K page
>>> granularity. However, the hypercall interface is always based on 4K
>>> page granularity.
>>>
>>> With 64K page granularity, a single page will be spread over multiple
>>> Xen frames.
>>>
>>> To avoid splitting the page into 4K frames, take advantage of the
>>> extent_order field to directly allocate/free chunks of the Linux page
>>> size.
>>>
>>> Note that PVMMU is only used for PV guest (which is x86) and the page
>>> granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
>>> that because the code has not been modified.
>>
>> This causes a BUG() in x86 PV guests when decreasing the reservation.
>>
>> Xen says:
>>
>> (XEN) d0v2 Error pfn 0: rd=0 od=32753 caf=8001
>> taf=7401
>> (XEN) memory.c:250:d0v2 Bad page free for domain 0
>>
>> And Linux BUGs with:
>>
>> [   82.032654] kernel BUG at
>> /anfs/drall/scratch/davidvr/x86/linux/drivers/xen/balloon.c:540!
>>
>> Which is a non-zero return value from the decrease_reservation hypercall.
>>
>> The frame_list[] has been incorrectly populated.  The below patch fixes
>> it for me.  Please test as well.

FYI, I've just tested with the patch on ARM64 and I haven't seen any issue.

> Sorry for the breakage. I think I didn't spot the bug on my board
> because most of the PV drivers allocate one balloon page at a time
> by default.
> 
> This patch looks valid to me. The variable i was reset and incremented for
> each loop in an early version, but I dropped that by mistake when I switched
> to a different way of decreasing the reservation.




-- 
Julien Grall


Re: [PATCH v4] xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE

2018-04-11 Thread Julien Grall

Hi,

On 10/04/18 08:58, Paul Durrant wrote:

+static long privcmd_ioctl_mmap_resource(struct file *file, void __user *udata)
+{
+   struct privcmd_data *data = file->private_data;
+   struct mm_struct *mm = current->mm;
+   struct vm_area_struct *vma;
+   struct privcmd_mmap_resource kdata;
+   xen_pfn_t *pfns = NULL;
+   struct xen_mem_acquire_resource xdata;
+   int rc;
+
+   if (copy_from_user(&kdata, udata, sizeof(kdata)))
+   return -EFAULT;
+
+   /* If restriction is in place, check the domid matches */
+   if (data->domid != DOMID_INVALID && data->domid != kdata.dom)
+   return -EPERM;
+
+   down_write(&mm->mmap_sem);
+
+   vma = find_vma(mm, kdata.addr);
+   if (!vma || vma->vm_ops != &privcmd_vm_ops) {
+   rc = -EINVAL;
+   goto out;
+   }
+
+   pfns = kcalloc(kdata.num, sizeof(*pfns), GFP_KERNEL);
+   if (!pfns) {
+   rc = -ENOMEM;
+   goto out;
+   }
+
+   if (xen_feature(XENFEAT_auto_translated_physmap)) {
+   struct page **pages;
+   unsigned int i;
+
+   rc = alloc_empty_pages(vma, kdata.num);
+   if (rc < 0)
+   goto out;
+
+   pages = vma->vm_private_data;
+   for (i = 0; i < kdata.num; i++)
+   pfns[i] = page_to_pfn(pages[i]);


I don't think this is going to work well if the hypervisor is using a 
different page granularity from the kernel.


Imagine Xen is using 4K but the kernel 64K. You would end up with the 
resource not mapped contiguously in memory.
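
To make the concern concrete, here is a small arithmetic sketch. It
assumes a 64K kernel on a 4K Xen and that one Linux page is consumed
per resource frame, which is my reading of the loop quoted above. The
shift values and helper names are chosen for the illustration only.

```c
#include <assert.h>

#define XEN_PAGE_SHIFT		12u	/* Xen frames are 4K */
#define LINUX_PAGE_SHIFT	16u	/* assumption: kernel built with 64K pages */
#define XEN_PFN_PER_PAGE	(1u << (LINUX_PAGE_SHIFT - XEN_PAGE_SHIFT))

/* First 4K Xen frame number backing a given Linux page frame number. */
static unsigned long linux_pfn_to_xen_pfn(unsigned long linux_pfn)
{
	assert(LINUX_PAGE_SHIFT >= XEN_PAGE_SHIFT);
	return linux_pfn << (LINUX_PAGE_SHIFT - XEN_PAGE_SHIFT);
}

/*
 * Byte offset in the mapping at which resource frame i shows up when
 * each resource frame is placed at the start of its own Linux page.
 */
static unsigned long offset_one_frame_per_page(unsigned long i)
{
	return i << LINUX_PAGE_SHIFT;
}

/* Byte offset of resource frame i in a contiguous mapping. */
static unsigned long offset_contiguous(unsigned long i)
{
	return i << XEN_PAGE_SHIFT;
}
```

Under these assumptions resource frame 1 appears at offset 64K instead
of 4K, leaving 15 unused Xen frames in every Linux page.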


Cheers,

--
Julien Grall


Re: [Xen-devel] [RFC, v2, 1/9] hyper_dmabuf: initial upload of hyper_dmabuf drv core framework

2018-04-10 Thread Julien Grall

Hi,

On 04/10/2018 09:53 AM, Oleksandr Andrushchenko wrote:

On 02/14/2018 03:50 AM, Dongwon Kim wrote:
diff --git a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h 


[...]


+#ifndef __HYPER_DMABUF_ID_H__
+#define __HYPER_DMABUF_ID_H__
+
+#define HYPER_DMABUF_ID_CREATE(domid, cnt) \
+    ((((domid) & 0xFF) << 24) | ((cnt) & 0xFF))

I would define hyper_dmabuf_id_t.id as a union or 2 separate
fields to avoid this magic


I am not sure the union would be right here because the layout will 
differ between big and little endian. Will that value be passed 
to another guest?
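
The layout problem can be shown without a union by writing the same ID
out in both byte orders. This is a sketch only: ID_CREATE copies the
mask widths from the quoted macro, and the store/load helpers are
hypothetical names that mimic what raw memory looks like on each kind
of CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the quoted macro; mask widths as they appear in the quote. */
#define ID_CREATE(domid, cnt) \
	((((uint32_t)(domid) & 0xFF) << 24) | ((uint32_t)(cnt) & 0xFF))

/* Byte image of v as a little-endian CPU would store it. */
static void store_le32(uint32_t v, uint8_t out[4])
{
	assert(out != NULL);
	out[0] = (uint8_t)(v & 0xFF);
	out[1] = (uint8_t)((v >> 8) & 0xFF);
	out[2] = (uint8_t)((v >> 16) & 0xFF);
	out[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Byte image of v as a big-endian CPU would store it. */
static void store_be32(uint32_t v, uint8_t out[4])
{
	assert(out != NULL);
	out[0] = (uint8_t)((v >> 24) & 0xFF);
	out[1] = (uint8_t)((v >> 16) & 0xFF);
	out[2] = (uint8_t)((v >> 8) & 0xFF);
	out[3] = (uint8_t)(v & 0xFF);
}

/* Decode with an explicit byte order, independent of the host CPU. */
static uint32_t load_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```

The domid byte sits at index 3 in the little-endian image and at index
0 in the big-endian one, so a union overlaying byte fields on the
integer would read different fields on each side. Shifts and masks, or
an explicit wire byte order, avoid that.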


Cheers,

--
Julien Grall


Re: [PATCH v2 1/8] xen/events: reset affinity of 2-level event when tearing it down

2021-02-14 Thread Julien Grall

Hi Juergen,

On 11/02/2021 10:16, Juergen Gross wrote:

When creating a new event channel with 2-level events the affinity
needs to be reset initially in order to avoid using an old affinity
from earlier usage of the event channel port. So when tearing an event
channel down reset all affinity bits.

The same applies to the affinity when onlining a vcpu: all old
affinity settings for this vcpu must be reset. As percpu events get
initialized before the percpu event channel hook is called,
resetting of the affinities happens after offlining a vcpu (this is
working, as initial percpu memory is zeroed out).

Cc: sta...@vger.kernel.org
Reported-by: Julien Grall 
Signed-off-by: Juergen Gross 


Reviewed-by: Julien Grall 

Cheers,


---
V2:
- reset affinity when tearing down the event (Julien Grall)
---
  drivers/xen/events/events_2l.c       | 15 +++++++++++++++
  drivers/xen/events/events_base.c     |  1 +
  drivers/xen/events/events_internal.h |  8 ++++++++
  3 files changed, 24 insertions(+)

diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c
index da87f3a1e351..a7f413c5c190 100644
--- a/drivers/xen/events/events_2l.c
+++ b/drivers/xen/events/events_2l.c
@@ -47,6 +47,11 @@ static unsigned evtchn_2l_max_channels(void)
return EVTCHN_2L_NR_CHANNELS;
  }
  
+static void evtchn_2l_remove(evtchn_port_t evtchn, unsigned int cpu)

+{
+   clear_bit(evtchn, BM(per_cpu(cpu_evtchn_mask, cpu)));
+}
+
  static void evtchn_2l_bind_to_cpu(evtchn_port_t evtchn, unsigned int cpu,
  unsigned int old_cpu)
  {
@@ -355,9 +360,18 @@ static void evtchn_2l_resume(void)
EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD);
  }
  
+static int evtchn_2l_percpu_deinit(unsigned int cpu)

+{
+   memset(per_cpu(cpu_evtchn_mask, cpu), 0, sizeof(xen_ulong_t) *
+   EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD);
+
+   return 0;
+}
+
  static const struct evtchn_ops evtchn_ops_2l = {
.max_channels  = evtchn_2l_max_channels,
.nr_channels   = evtchn_2l_max_channels,
+   .remove= evtchn_2l_remove,
.bind_to_cpu   = evtchn_2l_bind_to_cpu,
.clear_pending = evtchn_2l_clear_pending,
.set_pending   = evtchn_2l_set_pending,
@@ -367,6 +381,7 @@ static const struct evtchn_ops evtchn_ops_2l = {
.unmask= evtchn_2l_unmask,
.handle_events = evtchn_2l_handle_events,
.resume= evtchn_2l_resume,
+   .percpu_deinit = evtchn_2l_percpu_deinit,
  };
  
  void __init xen_evtchn_2l_init(void)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index e850f79351cb..6c539db81f8f 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -368,6 +368,7 @@ static int xen_irq_info_pirq_setup(unsigned irq,
  static void xen_irq_info_cleanup(struct irq_info *info)
  {
set_evtchn_to_irq(info->evtchn, -1);
+   xen_evtchn_port_remove(info->evtchn, info->cpu);
info->evtchn = 0;
channels_on_cpu_dec(info);
  }
diff --git a/drivers/xen/events/events_internal.h 
b/drivers/xen/events/events_internal.h
index 0a97c0549db7..18a4090d0709 100644
--- a/drivers/xen/events/events_internal.h
+++ b/drivers/xen/events/events_internal.h
@@ -14,6 +14,7 @@ struct evtchn_ops {
unsigned (*nr_channels)(void);
  
  	int (*setup)(evtchn_port_t port);

+   void (*remove)(evtchn_port_t port, unsigned int cpu);
void (*bind_to_cpu)(evtchn_port_t evtchn, unsigned int cpu,
unsigned int old_cpu);
  
@@ -54,6 +55,13 @@ static inline int xen_evtchn_port_setup(evtchn_port_t evtchn)

return 0;
  }
  
+static inline void xen_evtchn_port_remove(evtchn_port_t evtchn,

+ unsigned int cpu)
+{
+   if (evtchn_ops->remove)
+   evtchn_ops->remove(evtchn, cpu);
+}
+
  static inline void xen_evtchn_port_bind_to_cpu(evtchn_port_t evtchn,
   unsigned int cpu,
   unsigned int old_cpu)



--
Julien Grall


Re: [PATCH v2 3/8] xen/events: avoid handling the same event on two cpus at the same time

2021-02-14 Thread Julien Grall

Hi Juergen,

On 11/02/2021 10:16, Juergen Gross wrote:

When changing the cpu affinity of an event it can happen today that
(with some unlucky timing) the same event will be handled on the old
and the new cpu at the same time.

Avoid that by adding an "event active" flag to the per-event data and
call the handler only if this flag isn't set.

Reported-by: Julien Grall 
Signed-off-by: Juergen Gross 
---
V2:
- new patch
---
  drivers/xen/events/events_base.c | 19 +++
  1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index e157e7506830..f7e22330dcef 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -102,6 +102,7 @@ struct irq_info {
  #define EVT_MASK_REASON_EXPLICIT  0x01
  #define EVT_MASK_REASON_TEMPORARY 0x02
  #define EVT_MASK_REASON_EOI_PENDING   0x04
+   u8 is_active;   /* Is event just being handled? */
unsigned irq;
evtchn_port_t evtchn;   /* event channel */
unsigned short cpu; /* cpu bound */
@@ -622,6 +623,7 @@ static void xen_irq_lateeoi_locked(struct irq_info *info, 
bool spurious)
}
  
  	info->eoi_time = 0;

+   smp_store_release(&info->is_active, 0);
do_unmask(info, EVT_MASK_REASON_EOI_PENDING);
  }
  
@@ -809,13 +811,15 @@ static void pirq_query_unmask(int irq)
  
  static void eoi_pirq(struct irq_data *data)

  {
-   evtchn_port_t evtchn = evtchn_from_irq(data->irq);
+   struct irq_info *info = info_for_irq(data->irq);
+   evtchn_port_t evtchn = info ? info->evtchn : 0;
struct physdev_eoi eoi = { .irq = pirq_from_irq(data->irq) };
int rc = 0;
  
  	if (!VALID_EVTCHN(evtchn))

return;
  
+	smp_store_release(&info->is_active, 0);


Would you mind explaining why you are using release semantics?

It is also not clear to me whether there is any expected ordering between 
clearing is_active and clearing the pending bit.



clear_evtchn(evtchn);



The two lines here seem to be a common pattern in this patch, so I would 
suggest creating a new helper.
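
For illustration only, such a helper could look roughly like the sketch below. It is a user-space analogue written with C11 atomics rather than the kernel primitives; struct irq_info_sketch, event_try_activate() and event_clear_active_and_pending() are hypothetical names, and the pending field merely stands in for the event channel's pending bit. The release store makes writes done by the handler visible to whichever CPU next wins the acquire exchange. Whether the pending bit has to be cleared before or after the flag is the open question above, so the order shown is simply the one used in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-event data touched by the patch. */
struct irq_info_sketch {
    atomic_uchar is_active;   /* non-zero while a handler is running */
    atomic_bool pending;      /* stand-in for the event's pending bit */
};

/*
 * Try to claim the event for handling. Returns true if the caller may run
 * the handler, false if another CPU is already handling the event.
 * User-space counterpart of xchg_acquire(&info->is_active, 1).
 */
static bool event_try_activate(struct irq_info_sketch *info)
{
    return atomic_exchange_explicit(&info->is_active, 1,
                                    memory_order_acquire) == 0;
}

/*
 * Hypothetical helper grouping the two lines repeated in the patch:
 * drop the "active" flag with release semantics, then clear the pending
 * state (same order as in the patch).
 */
static void event_clear_active_and_pending(struct irq_info_sketch *info)
{
    atomic_store_explicit(&info->is_active, 0, memory_order_release);
    atomic_store_explicit(&info->pending, false, memory_order_relaxed);
}
```

Each of eoi_pirq(), ack_dynirq() and xen_clear_irq_pending() would then call a single function instead of repeating the pair.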


  
  	if (pirq_needs_eoi(data->irq)) {

@@ -1640,6 +1644,8 @@ void handle_irq_for_port(evtchn_port_t port, struct 
evtchn_loop_ctrl *ctrl)
}
  
  	info = info_for_irq(irq);

+   if (xchg_acquire(&info->is_active, 1))
+   return;
  
  	if (ctrl->defer_eoi) {

info->eoi_cpu = smp_processor_id();
@@ -1823,11 +1829,13 @@ static void disable_dynirq(struct irq_data *data)
  
  static void ack_dynirq(struct irq_data *data)

  {
-   evtchn_port_t evtchn = evtchn_from_irq(data->irq);
+   struct irq_info *info = info_for_irq(data->irq);
+   evtchn_port_t evtchn = info ? info->evtchn : 0;
  
  	if (!VALID_EVTCHN(evtchn))

return;
  
+	smp_store_release(&info->is_active, 0);

clear_evtchn(evtchn);
  }
  
@@ -1969,10 +1977,13 @@ static void restore_cpu_ipis(unsigned int cpu)

  /* Clear an irq's pending state, in preparation for polling on it */
  void xen_clear_irq_pending(int irq)
  {
-   evtchn_port_t evtchn = evtchn_from_irq(irq);
+   struct irq_info *info = info_for_irq(irq);
+   evtchn_port_t evtchn = info ? info->evtchn : 0;
  
-	if (VALID_EVTCHN(evtchn))

+   if (VALID_EVTCHN(evtchn)) {
+   smp_store_release(&info->is_active, 0);
clear_evtchn(evtchn);
+   }
  }
  EXPORT_SYMBOL(xen_clear_irq_pending);
  void xen_set_irq_pending(int irq)



--
Julien Grall


Re: [Xen-devel] [PATCH v3 2/2] x86, arm64, platform, xen, kconfig: add xen defconfig helper

2014-12-09 Thread Julien Grall

Hello Luis,

On 08/12/2014 23:05, Luis R. Rodriguez wrote:

diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config
new file mode 100644
index 000..0d0eb6d
--- /dev/null
+++ b/kernel/configs/xen.config
+CONFIG_XEN_MCE_LOG=y


MCE is x86 specific.


+CONFIG_XEN_HAVE_PVMMU=y


We don't have PVMMU support on ARM. Shouldn't you move this config into 
architecture-specific code?


Regards

--
Julien Grall
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Xen-devel] [PATCH v3 2/2] x86, arm64, platform, xen, kconfig: add xen defconfig helper

2014-12-09 Thread Julien Grall
On 09/12/14 20:22, Luis R. Rodriguez wrote:
> On Tue, Dec 9, 2014 at 1:06 AM, Julien Grall  wrote:
>> Hello Luis,
>>
>> On 08/12/2014 23:05, Luis R. Rodriguez wrote:
>>>
>>> diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config
>>> new file mode 100644
>>> index 000..0d0eb6d
>>> --- /dev/null
>>> +++ b/kernel/configs/xen.config
>>> +CONFIG_XEN_MCE_LOG=y
>>
>>
>> MCE is x86 specific.
> 
> That's what I thought too but its available for arm64, so should we
> fix that Kconfig to depend on x86?

Are you sure? In Linus's repo I have:

config XEN_MCE_LOG
	bool "Xen platform mcelog"
	depends on XEN_DOM0 && X86_64 && X86_MCE

Anyway, the MCE interface in the hypervisor is implemented in arch/x86
not in common code.

>>> +CONFIG_XEN_HAVE_PVMMU=y
>>
>>
>> We don't have PVMMU support on ARM. Shouldn't you move this config in
>> architecture specific code?
> 
> If you are sure then yes.

I'm 100% sure. MMU is handled by the hardware on ARM.

Thinking a bit more about this option: CONFIG_XEN_HAVE_PVMMU can't be
selected by the user. It's automatically added per platform (for
instance see arch/x86/xen/Kconfig).

So maybe it should not even appear in any of the config fragments?

Regards,

-- 
Julien Grall


Re: [Xen-devel] [RFC v3 2/2] x86/xen: allow privcmd hypercalls to be preempted

2015-01-22 Thread Julien Grall
On 22/01/15 18:56, Luis R. Rodriguez wrote:
> On Thu, Jan 22, 2015 at 01:10:49PM +0000, Julien Grall wrote:
>> Hi Luis,
>>
>> On 22/01/15 02:17, Luis R. Rodriguez wrote:
>>> diff --git a/drivers/xen/events/events_base.c 
>>> b/drivers/xen/events/events_base.c
>>> index b4bca2d..23c526b 100644
>>> --- a/drivers/xen/events/events_base.c
>>> +++ b/drivers/xen/events/events_base.c
>>> @@ -32,6 +32,8 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>> +#include 
>>>  
>>>  #ifdef CONFIG_X86
>>>  #include 
>>> @@ -1243,6 +1245,17 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
>>> set_irq_regs(old_regs);
>>>  }
>>>  
>>> +notrace void xen_end_upcall(struct pt_regs *regs)
>>> +{
>>> +   if (!xen_is_preemptible_hypercall(regs) ||
>>
>> I don't see any definition of xen_is_preemptible_hypercall for ARM32/ARM64.
>>
>> As this function is called from the generic code, you have at least to
>> stub this function for those architectures.
> 
> Will add as:
> 
> diff --git a/arch/arm/include/asm/xen/hypercall.h 
> b/arch/arm/include/asm/xen/hypercall.h
> index 712b50e..4fc8395 100644
> --- a/arch/arm/include/asm/xen/hypercall.h
> +++ b/arch/arm/include/asm/xen/hypercall.h
> @@ -74,4 +74,9 @@ MULTI_mmu_update(struct multicall_entry *mcl, struct 
> mmu_update *req,
>   BUG();
>  }
>  
> +static inline bool xen_is_preemptible_hypercall(struct pt_regs *regs)
> +{
> + return false;
> +}
> +
>  #endif /* _ASM_ARM_XEN_HYPERCALL_H */
> 
> This will cover both arm and arm64 as arm64 includes the arm header.

I'm fine with this solution.

Regards,

-- 
Julien Grall


Re: [Xen-devel] [PATCH v2 2/2] x86, arm, platform, xen, kconfig: add xen defconfig helper

2015-01-13 Thread Julien Grall
Hello Luis,

On 13/01/15 19:03, Luis R. Rodriguez wrote:
>>> diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config
>>> new file mode 100644
>>> index 000..d2ec010
>>> --- /dev/null
>>> +++ b/kernel/configs/xen.config
>>> @@ -0,0 +1,30 @@
>>> +# generic config
>>> +CONFIG_XEN=y
>>> +CONFIG_XEN_DOM0=y
>>> +CONFIG_PCI_XEN=y
>>> +CONFIG_XEN_PCIDEV_FRONTEND=m
>>> +CONFIG_XEN_BLKDEV_FRONTEND=m
>>> +CONFIG_XEN_BLKDEV_BACKEND=m
>>> +CONFIG_XEN_NETDEV_FRONTEND=m
>>> +CONFIG_XEN_NETDEV_BACKEND=m
>>> +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y
>>> +CONFIG_HVC_XEN=y
>>> +CONFIG_HVC_XEN_FRONTEND=y
>>> +CONFIG_TCG_XEN=m
>>> +CONFIG_XEN_WDT=m
>>> +CONFIG_XEN_FBDEV_FRONTEND=y
>>> +CONFIG_XEN_BALLOON=y
>>> +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
>>> +CONFIG_XEN_SCRUB_PAGES=y
>>> +CONFIG_XEN_DEV_EVTCHN=m
>>> +CONFIG_XEN_BACKEND=y
>>> +CONFIG_XENFS=m
>>> +CONFIG_XEN_COMPAT_XENFS=y
>>> +CONFIG_XEN_SYS_HYPERVISOR=y
>>> +CONFIG_XEN_XENBUS_FRONTEND=y
>>> +CONFIG_XEN_GNTDEV=m
>>> +CONFIG_XEN_GRANT_DEV_ALLOC=m
>>> +CONFIG_SWIOTLB_XEN=y
>>> +CONFIG_XEN_PCIDEV_BACKEND=m
>>> +CONFIG_XEN_PRIVCMD=m
>>> +CONFIG_XEN_ACPI_PROCESSOR=m
>>
>> The common fragment config looks good for both ARM32 and ARM64:
>>
>> Acked-by: Julien Grall 
> 
> Can someone apply this? Who should this go through?

Stefano had some comments on this patch. See:

http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg01531.html

Regards,

-- 
Julien Grall


Re: [PATCH v10 00/21] Introduce ACPI for ARM64 based on ACPI 5.1

2015-03-22 Thread Julien Grall

Hello,

On 21/03/2015 12:09, Naresh Bhat wrote:

 From 268dcdafa34a690e2f99c0784ca33a6d2352ecf5 Mon Sep 17 00:00:00 2001
From: Hanjun Guo <hanjun@linaro.org>
Date: Sat, 21 Mar 2015 14:43:54 +0800
Subject: [PATCH] XEN / ACPI: Make XEN ACPI depend on X86

When ACPI is enabled on ARM64, XEN ACPI will also be compiled
into the kernel, but XEN ACPI is x86 dependent, so introduce
CONFIG_XEN_ACPI to make it depend on x86 before XEN ACPI is
functional on ARM64.

Signed-off-by: Hanjun Guo <hanjun@linaro.org>
---
  drivers/xen/Kconfig  | 4 
  drivers/xen/Makefile | 2 +-
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index b812462..a31cd29 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -253,4 +253,8 @@ config XEN_EFI
  def_bool y
  depends on X86_64 && EFI

+config XEN_ACPI
+def_bool y
+depends on X86 && ACPI
+
  endmenu
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 2ccd359..f4622ab 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -13,7 +13,7 @@ CFLAGS_efi.o+= -fshort-wchar

  dom0-$(CONFIG_PCI) += pci.o
  dom0-$(CONFIG_USB_SUPPORT) += dbgp.o
-dom0-$(CONFIG_ACPI) += acpi.o $(xen-pad-y)
+dom0-$(CONFIG_XEN_ACPI) += acpi.o $(xen-pad-y)
  xen-pad-$(CONFIG_X86) += xen-acpi-pad.o
  dom0-$(CONFIG_X86) += pcpu.o
  obj-$(CONFIG_XEN_DOM0)+= $(dom0-y)


[..]



AFAIK, a kernel patch already exists to fix this issue. I think
Julien or Parth is the right person to ask, hence I have CCed Julien
Grall too.


ACPI support for Xen is not ready, so I think not compiling 
drivers/xen/acpi.c on ARM64/ARM is the better solution for now.


However, rather than introducing a new CONFIG option, I would use the 
same trick we use within the Makefile to avoid building hotplug.c on ARM/ARM64.


ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)), )
dom0-$(CONFIG_ACPI) += acpi.o $(xen-pad-y)
endif

Regards,

--
Julien Grall


Re: [PATCH v10 00/21] Introduce ACPI for ARM64 based on ACPI 5.1

2015-03-22 Thread Julien Grall



On 22/03/2015 21:49, Rafael J. Wysocki wrote:

On Sunday, March 22, 2015 09:05:21 PM Julien Grall wrote:

Hello,

On 21/03/2015 12:09, Naresh Bhat wrote:

  From 268dcdafa34a690e2f99c0784ca33a6d2352ecf5 Mon Sep 17 00:00:00 2001
 From: Hanjun Guo <hanjun@linaro.org>
 Date: Sat, 21 Mar 2015 14:43:54 +0800
 Subject: [PATCH] XEN / ACPI: Make XEN ACPI depend on X86

 When ACPI is enabled on ARM64, XEN ACPI will also be compiled
 into the kernel, but XEN ACPI is x86 dependent, so introduce
 CONFIG_XEN_ACPI to make it depend on x86 before XEN ACPI is
 functional on ARM64.

 Signed-off-by: Hanjun Guo <hanjun@linaro.org>
 ---
   drivers/xen/Kconfig  | 4 
   drivers/xen/Makefile | 2 +-
   2 files changed, 5 insertions(+), 1 deletion(-)

 diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
 index b812462..a31cd29 100644
 --- a/drivers/xen/Kconfig
 +++ b/drivers/xen/Kconfig
 @@ -253,4 +253,8 @@ config XEN_EFI
   def_bool y
   depends on X86_64 && EFI

 +config XEN_ACPI
 +def_bool y
 +depends on X86 && ACPI
 +
   endmenu
 diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
 index 2ccd359..f4622ab 100644
 --- a/drivers/xen/Makefile
 +++ b/drivers/xen/Makefile
 @@ -13,7 +13,7 @@ CFLAGS_efi.o+= -fshort-wchar

   dom0-$(CONFIG_PCI) += pci.o
   dom0-$(CONFIG_USB_SUPPORT) += dbgp.o
 -dom0-$(CONFIG_ACPI) += acpi.o $(xen-pad-y)
 +dom0-$(CONFIG_XEN_ACPI) += acpi.o $(xen-pad-y)
   xen-pad-$(CONFIG_X86) += xen-acpi-pad.o
   dom0-$(CONFIG_X86) += pcpu.o
   obj-$(CONFIG_XEN_DOM0)+= $(dom0-y)


[..]



AFAIK, a kernel patch already exists to fix this issue. I think
Julien or Parth is the right person to ask, hence I have CCed Julien
Grall too.


ACPI support for Xen is not ready, so I think not compiling
drivers/xen/acpi.c on ARM64/ARM is the better solution for now.

However, rather than introducing a new CONFIG option, I would use the
same trick we use within the Makefile to avoid building hotplug.c on ARM/ARM64.

ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)), )
dom0-$(CONFIG_ACPI) += acpi.o $(xen-pad-y)
endif


Well, is avoiding an extra CONFIG_ option worth the ugliness of this?


When ACPI support for Xen arrives, the CONFIG_ option will be 
an alias for CONFIG_XEN.


In this case the CONFIG_ option won't bring much improvement to the code 
and will add an extra indirection.


The "ugliness" option has, at least, the advantage of being tiny and 
self-contained.


Regards,

--
Julien Grall


[PATCH v4 0/3] net/xen: Clean up

2015-06-16 Thread Julien Grall
Hi,

The first 2 patches were originally part of the Xen 64KB series [1].
However, I think they can go in, assuming everything is acked/reviewed,
without waiting for the rest of the 64KB series.

The third patch has been added in this version.

Regards,

Julien Grall (3):
  net/xen-netfront: Correct printf format in xennet_get_responses
  net/xen-netback: Remove unused code in xenvif_rx_action
  net/xen-netback: Don't mix hexa and decimal with 0x in the printf
format

 drivers/net/xen-netback/netback.c | 19 +++
 drivers/net/xen-netfront.c|  2 +-
 2 files changed, 8 insertions(+), 13 deletions(-)

-- 
2.1.4



[PATCH v4 2/3] net/xen-netback: Remove unused code in xenvif_rx_action

2015-06-16 Thread Julien Grall
The variables old_req_cons and ring_slots_used are assigned but never
used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
always fully coalesce guest Rx packets".

Signed-off-by: Julien Grall 
Acked-by: Wei Liu 
Cc: Ian Campbell 
Cc: net...@vger.kernel.org

---
Changes in v2:
- Add Wei's Acked-by
---
 drivers/net/xen-netback/netback.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 0d25943..ba3ae30 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 
while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
   && (skb = xenvif_rx_dequeue(queue)) != NULL) {
-   RING_IDX old_req_cons;
-   RING_IDX ring_slots_used;
-
queue->last_rx_time = jiffies;
 
-   old_req_cons = queue->rx.req_cons;
XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, &npo, 
queue);
-   ring_slots_used = queue->rx.req_cons - old_req_cons;
 
__skb_queue_tail(&rxq, skb);
}
-- 
2.1.4



[PATCH v4 1/3] net/xen-netfront: Correct printf format in xennet_get_responses

2015-06-16 Thread Julien Grall
rx->status is an int16_t, print it using %d rather than %u in order to
have a meaningful value when the field is negative.

Also use %u rather than %x for rx->offset.

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: net...@vger.kernel.org

---
Changes in v4:
- Use %u for the rx->offset because offset is unsigned

Changes in v3:
- Use %d for the rx->offset too.

Changes in v2:
- Add David's Reviewed-by
---
 drivers/net/xen-netfront.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index e031c94..281720f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -733,7 +733,7 @@ static int xennet_get_responses(struct netfront_queue 
*queue,
if (unlikely(rx->status < 0 ||
 rx->offset + rx->status > PAGE_SIZE)) {
if (net_ratelimit())
-   dev_warn(dev, "rx->offset: %x, size: %u\n",
+   dev_warn(dev, "rx->offset: %u, size: %d\n",
 rx->offset, rx->status);
xennet_move_rx_slot(queue, skb, ref);
err = -EINVAL;
-- 
2.1.4
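
As a side note on the format change above, the following user-space sketch shows why a negative int16_t status is only readable with %d. format_status() is a hypothetical helper, not part of the driver, and the explicit cast reproduces what a %u conversion would display for the promoted value, assuming a platform with a 32-bit int.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 16-bit status both ways. An int16_t argument is promoted to int
 * when passed to a variadic function, so %d prints the signed value.
 * The cast to unsigned int shows what a %u conversion would display.
 */
static void format_status(int16_t status, char *as_signed, char *as_unsigned,
                          size_t len)
{
    snprintf(as_signed, len, "%d", status);
    snprintf(as_unsigned, len, "%u", (unsigned int)status);
}
```

With status set to -1 the signed form yields "-1", while the unsigned form yields 4294967295, which hides the fact that an error code was returned.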



[PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format

2015-06-16 Thread Julien Grall
Append 0x to all %x in order to avoid while reading when there is other
decimal value in the log.

Also replace some of the hexadecimal print to decimal to uniformize the
format with netfront.

Signed-off-by: Julien Grall 
Cc: Wei Liu 
Cc: Ian Campbell 
Cc: net...@vger.kernel.org

---
Changes in v4:
- Patch added
---
 drivers/net/xen-netback/netback.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index ba3ae30..11bd9d8 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -748,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
slots++;
 
if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
-   netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
+   netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %u, size: %u\n",
 txp->offset, txp->size);
xenvif_fatal_tx_err(queue->vif);
return -EINVAL;
@@ -874,7 +874,7 @@ static inline void xenvif_grant_handle_set(struct 
xenvif_queue *queue,
if (unlikely(queue->grant_tx_handle[pending_idx] !=
 NETBACK_INVALID_HANDLE)) {
netdev_err(queue->vif->dev,
-  "Trying to overwrite active handle! pending_idx: %x\n",
+  "Trying to overwrite active handle! pending_idx: 0x%x\n",
   pending_idx);
BUG();
}
@@ -887,7 +887,7 @@ static inline void xenvif_grant_handle_reset(struct 
xenvif_queue *queue,
if (unlikely(queue->grant_tx_handle[pending_idx] ==
 NETBACK_INVALID_HANDLE)) {
netdev_err(queue->vif->dev,
-  "Trying to unmap invalid handle! pending_idx: %x\n",
+  "Trying to unmap invalid handle! pending_idx: 0x%x\n",
   pending_idx);
BUG();
}
@@ -1243,7 +1243,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
*queue,
/* No crossing a page as the payload mustn't fragment. */
if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
netdev_err(queue->vif->dev,
-  "txreq.offset: %x, size: %u, end: %lu\n",
+  "txreq.offset: %u, size: %u, end: %lu\n",
   txreq.offset, txreq.size,
   (unsigned long)(txreq.offset&~PAGE_MASK) + 
txreq.size);
xenvif_fatal_tx_err(queue->vif);
@@ -1593,12 +1593,12 @@ static inline void xenvif_tx_dealloc_action(struct 
xenvif_queue *queue)
queue->pages_to_unmap,
gop - queue->tx_unmap_ops);
if (ret) {
-   netdev_err(queue->vif->dev, "Unmap fail: nr_ops %tx ret %d\n",
+   netdev_err(queue->vif->dev, "Unmap fail: nr_ops %tu ret %d\n",
   gop - queue->tx_unmap_ops, ret);
for (i = 0; i < gop - queue->tx_unmap_ops; ++i) {
if (gop[i].status != GNTST_okay)
netdev_err(queue->vif->dev,
-  " host_addr: %llx handle: %x status: %d\n",
+  " host_addr: 0x%llx handle: 0x%x status: %d\n",
   gop[i].host_addr,
   gop[i].handle,
   gop[i].status);
@@ -1731,7 +1731,7 @@ void xenvif_idx_unmap(struct xenvif_queue *queue, u16 
pending_idx)
&queue->mmap_pages[pending_idx], 1);
if (ret) {
netdev_err(queue->vif->dev,
-  "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: %x status: %d\n",
+  "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: 0x%x status: %d\n",
   ret,
   pending_idx,
   tx_unmap_op.host_addr,
-- 
2.1.4



Re: [PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format

2015-06-17 Thread Julien Grall

Hi Ian,

On 17/06/2015 10:25, Ian Campbell wrote:

On Tue, 2015-06-16 at 20:10 +0100, Julien Grall wrote:

Append 0x to all %x in order to avoid while reading when there is other
decimal value in the log.

Also replace some of the hexadecimal print to decimal to uniformize the
format with netfront.

Signed-off-by: Julien Grall 
Cc: Wei Liu 
Cc: Ian Campbell 
Cc: net...@vger.kernel.org


You meant s/Append/Prepend/, nonetheless:


I noticed a missing word after "avoid" in the commit message too. I will 
update to:


"Prepend 0x to all %x in order to avoid confusion while reading when 
there is other decimal value in the log.


[...]".



Acked-by: Ian Campbell 


I see different opinions on whether to use 0x% or %#. As I plan to resend 
a version with the updated commit message, shall I use %#?
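
For reference, in standard C the two spellings differ only for a zero value: with the # flag the 0x prefix is added to non-zero results only. A small user-space sketch follows; fmt_explicit_prefix() and fmt_alternate_form() are hypothetical helpers, and this describes the C library's snprintf, so the kernel's own implementation may behave differently for zero.

```c
#include <stdio.h>

/* Always emits the prefix because it is part of the literal text. */
static void fmt_explicit_prefix(char *buf, size_t len, unsigned int value)
{
    snprintf(buf, len, "0x%x", value);
}

/* Uses the alternate form: the prefix is produced by the conversion itself. */
static void fmt_alternate_form(char *buf, size_t len, unsigned int value)
{
    snprintf(buf, len, "%#x", value);
}
```

For the value 255 both produce "0xff"; for 0 the first produces "0x0" and the second "0" with a conforming C library.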


Regards,

--
Julien Grall


[PATCH v5 0/3] net/xen: Clean up

2015-06-17 Thread Julien Grall
Hi,

The first 2 patches were originally part of the Xen 64KB series [1]. However,
I think they can go in without waiting for the rest of the 64KB series.

The third patch was added in v4.

Regards,

Cc: Wei Liu 
Cc: Ian Campbell 
Cc: David Vrabel 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: net...@vger.kernel.org

[1] http://lkml.org/lkml/2015/5/14/533

Julien Grall (3):
  net/xen-netfront: Correct printf format in xennet_get_responses
  net/xen-netback: Remove unused code in xenvif_rx_action
  net/xen-netback: Don't mix hexa and decimal with 0x in the printf
format

 drivers/net/xen-netback/netback.c | 19 +++
 drivers/net/xen-netfront.c|  2 +-
 2 files changed, 8 insertions(+), 13 deletions(-)

-- 
2.1.4



[PATCH v5 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format

2015-06-17 Thread Julien Grall
Prepend 0x to all %x in order to avoid confusion while reading when there is
other decimal value in the log.

Also replace some of the hexadecimal print to decimal to uniformize the
format with netfront.

Signed-off-by: Julien Grall 
Acked-by: Ian Campbell 
Cc: Wei Liu 
Cc: net...@vger.kernel.org

---
Changes in v5:
- Fix commit message
- Add Ian's ack.

Changes in v4:
- Patch added
---
 drivers/net/xen-netback/netback.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index ba3ae30..11bd9d8 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -748,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
slots++;
 
if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
-   netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
+   netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %u, size: %u\n",
 txp->offset, txp->size);
xenvif_fatal_tx_err(queue->vif);
return -EINVAL;
@@ -874,7 +874,7 @@ static inline void xenvif_grant_handle_set(struct 
xenvif_queue *queue,
if (unlikely(queue->grant_tx_handle[pending_idx] !=
 NETBACK_INVALID_HANDLE)) {
netdev_err(queue->vif->dev,
-  "Trying to overwrite active handle! pending_idx: %x\n",
+  "Trying to overwrite active handle! pending_idx: 0x%x\n",
   pending_idx);
BUG();
}
@@ -887,7 +887,7 @@ static inline void xenvif_grant_handle_reset(struct 
xenvif_queue *queue,
if (unlikely(queue->grant_tx_handle[pending_idx] ==
 NETBACK_INVALID_HANDLE)) {
netdev_err(queue->vif->dev,
-  "Trying to unmap invalid handle! pending_idx: %x\n",
+  "Trying to unmap invalid handle! pending_idx: 0x%x\n",
   pending_idx);
BUG();
}
@@ -1243,7 +1243,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
*queue,
/* No crossing a page as the payload mustn't fragment. */
if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
netdev_err(queue->vif->dev,
-  "txreq.offset: %x, size: %u, end: %lu\n",
+  "txreq.offset: %u, size: %u, end: %lu\n",
   txreq.offset, txreq.size,
   (unsigned long)(txreq.offset&~PAGE_MASK) + 
txreq.size);
xenvif_fatal_tx_err(queue->vif);
@@ -1593,12 +1593,12 @@ static inline void xenvif_tx_dealloc_action(struct 
xenvif_queue *queue)
queue->pages_to_unmap,
gop - queue->tx_unmap_ops);
if (ret) {
-   netdev_err(queue->vif->dev, "Unmap fail: nr_ops %tx ret %d\n",
+   netdev_err(queue->vif->dev, "Unmap fail: nr_ops %tu ret %d\n",
   gop - queue->tx_unmap_ops, ret);
for (i = 0; i < gop - queue->tx_unmap_ops; ++i) {
if (gop[i].status != GNTST_okay)
netdev_err(queue->vif->dev,
-  " host_addr: %llx handle: %x status: %d\n",
+  " host_addr: 0x%llx handle: 0x%x status: %d\n",
   gop[i].host_addr,
   gop[i].handle,
   gop[i].status);
@@ -1731,7 +1731,7 @@ void xenvif_idx_unmap(struct xenvif_queue *queue, u16 
pending_idx)
&queue->mmap_pages[pending_idx], 1);
if (ret) {
netdev_err(queue->vif->dev,
-  "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: %x status: %d\n",
+  "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: 0x%x status: %d\n",
   ret,
   pending_idx,
   tx_unmap_op.host_addr,
-- 
2.1.4



[PATCH v5 2/3] net/xen-netback: Remove unused code in xenvif_rx_action

2015-06-17 Thread Julien Grall
The variables old_req_cons and ring_slots_used are assigned but never
used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
always fully coalesce guest Rx packets".

Signed-off-by: Julien Grall 
Acked-by: Wei Liu 
Cc: Ian Campbell 
Cc: net...@vger.kernel.org

---
Changes in v2:
- Add Wei's Acked-by
---
 drivers/net/xen-netback/netback.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 0d25943..ba3ae30 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 
while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
   && (skb = xenvif_rx_dequeue(queue)) != NULL) {
-   RING_IDX old_req_cons;
-   RING_IDX ring_slots_used;
-
queue->last_rx_time = jiffies;
 
-   old_req_cons = queue->rx.req_cons;
XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, &npo, 
queue);
-   ring_slots_used = queue->rx.req_cons - old_req_cons;
 
__skb_queue_tail(&rxq, skb);
}
-- 
2.1.4


