Re: [Xen-devel] [v2][PATCH 2/2] libxl: introduce gfx_passthru_kind

2015-03-18 Thread Chen, Tiejun

I think the _libxl_ message needs to be just "Unable to detect graphics
passthru kind". i.e. it can't/shouldn't reference anything to do with xl
config options etc (which would make no sense if libvirt was being
used).

That's not very user friendly though, so you may want to consider adding
a new specific error code for this case and returning it here, such that
xl or libvirt can then give a more comprehensible error message.


But,
args = libxl__build_device_model_args(gc, "stubdom-dm", guest_domid,
  guest_config, d_state, NULL);
|
	+ return libxl__build_device_model_args_new(gc, dm,guest_domid, 
guest_config, state, dm_state_fd);

|
+  Currently we always return "NULL" inside.

if (!args) {
ret = ERROR_FAIL;
goto out;
}

So I'm not very sure how to pass this new specific error code to xl/libvirt.




So looks the whole policy should be something like this,

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 5c40e84..5518759 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1953,8 +1953,27 @@ skip_vfb:
   xlu_cfg_replace_string (config, "spice_streaming_video",
   &b_info->u.hvm.spice.streaming_video, 0);
   xlu_cfg_get_defbool(config, "nographic",
&b_info->u.hvm.nographic, 0);
-xlu_cfg_get_defbool(config, "gfx_passthru",
-&b_info->u.hvm.gfx_passthru, 0);
+if (!xlu_cfg_get_long(config, "gfx_passthru", &l, 1)) {
+libxl_defbool_set(&b_info->u.hvm.gfx_passthru, l);
+} else if (!xlu_cfg_get_string(config, "gfx_passthru", &buf, 0)) {
+if (libxl_gfx_passthru_kind_from_string(buf,
+
&b_info->u.hvm.gfx_passthru_kind)) {
+fprintf(stderr,
+"ERROR: invalid value \"%s\" for
\"gfx_passthru\"\n",
+buf);
+exit (1);
+}
+libxl_defbool_set(&b_info->u.hvm.gfx_passthru, true);
+}


Up to here is fine.


Thanks but I guess I'd better to paste this whole patch to avoid I'm 
still missing something :)


---
 tools/libxl/libxl.h  |  6 ++
 tools/libxl/libxl_dm.c   | 25 ++---
 tools/libxl/libxl_internal.h |  5 +
 tools/libxl/libxl_types.idl  |  6 ++
 tools/libxl/xl_cmdimpl.c | 14 --
 5 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 6bbc52d..62b3ae5 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -720,6 +720,12 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
libxl_mac *src);

 #define LIBXL_HAVE_PSR_MBM 1
 #endif

+/*
+ * libxl_domain_build_info has the u.hvm.gfx_passthru_kind field and
+ * the libxl_gfx_passthru_kind enumeration is defined.
+*/
+#define LIBXL_HAVE_GFX_PASSTHRU_KIND
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 8599a6a..045d48a 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -710,9 +710,6 @@ static char ** 
libxl__build_device_model_args_new(libxl__gc *gc,

 flexarray_append(dm_args, "-net");
 flexarray_append(dm_args, "none");
 }
-if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
-flexarray_append(dm_args, "-gfx_passthru");
-}
 } else {
 if (!sdl && !vnc) {
 flexarray_append(dm_args, "-nographic");
@@ -757,6 +754,28 @@ static char ** 
libxl__build_device_model_args_new(libxl__gc *gc,

 machinearg, max_ram_below_4g);
 }
 }
+
+if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
+switch (b_info->u.hvm.gfx_passthru_kind) {
+case LIBXL_GFX_PASSTHRU_KIND_DEFAULT:
+if (libxl__is_igd_vga_passthru(gc, guest_config)) {
+machinearg = GCSPRINTF("%s,igd-passthru=on", 
machinearg);

+} else {
+LOG(ERROR, "Unable to detect graphics passthru kind,"
+" please set gfx_passthru_kind. See xl.cfg(5) 
for more"

+" information.\n");
+return NULL;
+}
+break;
+case LIBXL_GFX_PASSTHRU_KIND_IGD:
+machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
+break;
+default:
+LOG(WARN, "gfx_passthru_kind is invalid so ignored.\n");
+break;
+}
+}
+
 flexarray_append(dm_args, machinearg);
 for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; 
i++)

 flexarray_append(dm_args, b_info->extra_hvm[i]);
diff --git a/tools/libxl/libxl_internal.h

Re: [Xen-devel] [PATCH] dpci: Put the dpci back on the list if scheduled from another CPU.

2015-03-18 Thread Jan Beulich
>>> On 17.03.15 at 18:15,  wrote:
> On Tue, Mar 17, 2015 at 04:06:14PM +, Jan Beulich wrote:
>> >>> On 17.03.15 at 16:38,  wrote:
>> > --- a/xen/drivers/passthrough/io.c
>> > +++ b/xen/drivers/passthrough/io.c
>> > @@ -804,7 +804,17 @@ static void dpci_softirq(void)
>> >  d = pirq_dpci->dom;
>> >  smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
>> >  if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
>> > -BUG();
>> > +{
>> > +unsigned long flags;
>> > +
>> > +/* Put back on the list and retry. */
>> > +local_irq_save(flags);
>> > +list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
>> > +local_irq_restore(flags);
>> > +
>> > +raise_softirq(HVM_DPCI_SOFTIRQ);
>> > +continue;
>> > +}
>> 
>> As just said in another mail - unless there are convincing new
>> arguments in favor of this (more of a hack than a real fix), I'm
>> not going to accept it and instead consider reverting the
>> offending commit. Iirc the latest we had come to looked quite a
>> bit better than this one.
> 
> The latest one (please see attached) would cause an dead-lock iff
> on the CPU we are running the softirq and an do_IRQ comes for the
> exact dpci we are in process of executing.

I'm afraid I'm not seeing it - please explain.

As to the code - I think switch() is rather hiding the intentions
here, i.e. the code would be better readable if using two if()s:

+for ( ;; )
+{
+old = cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED);
+/* If the 'state' is 0 (not in use) we can schedule it. */
+if ( !old )
+break;
+if ( !(old & (1 << STATE_RUN)) )
+{
+/* Whenever STATE_SCHED is set we MUST not schedule it. */
+assert(old & (1 << STATE_SCHED));
+return;
+}
+}

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.

2015-03-18 Thread Jan Beulich
>>> On 17.03.15 at 18:44,  wrote:
> As you can see to preserve the existing functionality such as
> being able to schedule N amount of interrupt injections
> for the N interrupts we might get - I modified '->masked'
> to be an atomic counter.

Why would that be? When an earlier interrupt wasn't fully handled,
real hardware wouldn't latch more than one further instance either.

> The end result is that we can still live-lock. Unless we:
>  - Drop on the floor the injection of N interrupts and
>just deliever at max one per VMX_EXIT (and not bother
>with interrupts arriving when we are in the VMX handler).

I'm afraid I again don't see the point here.

>  - Alter the softirq code slightly - to have an variant
>which will only iterate once over the pending softirq
>bits per call. (so save an copy of the bitmap on the
>stack when entering the softirq handler - and use that.
>We could also xor it against the current to catch any
>non-duplicate bits being set that we should deal with).

That's clearly not an option: The solution should be isolated
to DPCI code, i.e. without altering existing behavior in other
(more generic) components.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 2/2] tools/libxl/libxl_qmp.c: Close fds on error path in qmp_open

2015-03-18 Thread PRAMOD DEVENDRA
From: Pramod Devendra 

Signed-off-by: Pramod Devendra 
CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
---
 tools/libxl/libxl_qmp.c |   17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index c7324e6..992c1ee 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -365,14 +365,24 @@ static int qmp_open(libxl__qmp_handler *qmp, const char 
*qmp_socket_path,
 return -1;
 }
 ret = libxl_fd_set_nonblock(qmp->ctx, qmp->qmp_fd, 1);
-if (ret) return -1;
+if (ret) {
+close(qmp->qmp_fd);
+return -1;
+}
 ret = libxl_fd_set_cloexec(qmp->ctx, qmp->qmp_fd, 1);
-if (ret) return -1;
+if (ret) {
+close(qmp->qmp_fd);
+return -1;
+}
 
+if(sizeof (qmp->addr.sun_path) <= strlen(qmp_socket_path)) {
+close(qmp->qmp_fd);
+return -1;
+}
 memset(&qmp->addr, 0, sizeof (qmp->addr));
 qmp->addr.sun_family = AF_UNIX;
 strncpy(qmp->addr.sun_path, qmp_socket_path,
-sizeof (qmp->addr.sun_path));
+sizeof (qmp->addr.sun_path)-1);
 
 do {
 ret = connect(qmp->qmp_fd, (struct sockaddr *) &qmp->addr,
@@ -384,6 +394,7 @@ static int qmp_open(libxl__qmp_handler *qmp, const char 
*qmp_socket_path,
  * ECONNREFUSED : Leftover socket hasn't been removed yet */
 continue;
 }
+close(qmp->qmp_fd);
 return -1;
 } while ((++i / 5 <= timeout) && (usleep(200 * 1000) <= 0));
 
-- 
1.7.10.4



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code

2015-03-18 Thread Jan Beulich
>>> On 17.03.15 at 19:18,  wrote:
> On Mon, 2015-03-16 at 12:56 +, Jan Beulich wrote:
>> >>> On 16.03.15 at 13:51,  wrote:
>> > On 03/16/2015 12:48 PM, Jan Beulich wrote:
>> My preferred solution would be, as said, to leverage system_state.
>> Provided the state to look for is consistent between x86 and ARM.
>> 
> Would something like this make sense?

Yes, albeit ...

> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1936,12 +1936,8 @@ static void init_pcpu(const struct scheduler *ops, int 
> cpu)
>  }
>  
>  /* Figure out which runqueue to put it in */
> -rqi = 0;
> -
> -/* Figure out which runqueue to put it in */
> -/* NB: cpu 0 doesn't get a STARTING callback, so we hard-code it to 
> runqueue 0. */
> -if ( cpu == 0 )
> -rqi = 0;
> +if ( system_state == SYS_STATE_boot )

... I'd suggest using <= SYS_STATE_boot or < SYS_STATE_smp_boot.

> @@ -1986,9 +1982,13 @@ static void init_pcpu(const struct scheduler *ops, int 
> cpu)
>  static void *
>  csched2_alloc_pdata(const struct scheduler *ops, int cpu)
>  {
> -/* Check to see if the cpu is online yet */
> -/* Note: cpu 0 doesn't get a STARTING callback */
> -if ( cpu == 0 || cpu_to_socket(cpu) >= 0 )
> +/*
> + * Actual initialization is deferred to when the pCPU will be
> + * online, via a STARTING callback. The only exception is
> + * the boot cpu, which does not get such a notification, and
> + * hence needs to be taken care of here.
> + */
> +if ( system_state == SYS_STATE_boot )
>  init_pcpu(ops, cpu);

Same here, plus the new condition at the first glance isn't matching
the old one, but perhaps that's intentional.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] VT-d: extend XSA-59 workaround to XeonE5 v3 (Haswell)

2015-03-18 Thread Jan Beulich
>>> On 17.03.15 at 18:41,  wrote:
> Note that the following Haswell chipsets should also be included in this 
> list:
> 
> Haswell - 0xc0f, 0xd00, 0xd04, 0xd08, 0xd0f, 0xa00, 0xa08, 0xa0f

But these are desktop ones afaict, i.e. they'd belong in a separate
patch (which I'll create in a minute).

Jan

> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com] 
> Sent: Friday, December 19, 2014 1:42 AM
> To: xen-devel
> Cc: Dugger, Donald D; Tian, Kevin; Zhang, Yang Z
> Subject: [PATCH 2/2] VT-d: extend XSA-59 workaround to XeonE5 v3 (Haswell)
> 
> Note that the datasheet lacks PCI IDs for Dev 1 Fn 0-1, so their IDs are 
> being added based on what https://pci-ids.ucw.cz/read/PC/8086 says.
> 
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/drivers/passthrough/vtd/quirks.c
> +++ b/xen/drivers/passthrough/vtd/quirks.c
> @@ -431,6 +431,7 @@ void pci_vtd_quirk(const struct pci_dev 
>   *   - Potential security issue if malicious guest trigger VT-d faults.
>   */
>  case 0x0e28: /* Xeon-E5v2 (IvyBridge) */
> +case 0x2f28: /* Xeon-E5v3 (Haswell) */
>  case 0x342e: /* Tylersburg chipset (Nehalem / Westmere systems) */
>  case 0x3728: /* Xeon C5500/C3500 (JasperForest) */
>  case 0x3c28: /* Sandybridge */
> @@ -443,6 +444,9 @@ void pci_vtd_quirk(const struct pci_dev 
>  /* Xeon E5/E7 v2 */
>  case 0x0e00: /* host bridge */
>  case 0x0e01: case 0x0e04 ... 0x0e0b: /* root ports */
> +/* Xeon E5 v3 */
> +case 0x2f00: /* host bridge */
> +case 0x2f01 ... 0x2f0b: /* root ports */
>  /* Tylersburg (EP)/Boxboro (MP) chipsets (NHM-EP/EX, WSM-EP/EX) */
>  case 0x3400 ... 0x3407: /* host bridges */
>  case 0x3408 ... 0x3411: case 0x3420 ... 0x3421: /* root ports */




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/2] VT-d: make XSA-59 workaround fully cover XeonE5/E7 v2

2015-03-18 Thread Jan Beulich
>>> On 17.03.15 at 18:39,  wrote:
> Note that the following Nehalem/Westmere chipsets should be included in this 
> list:
> 
> Nehalem - 0x40, 0x2c01, 0x2c41, 0x313x
> Westmere - 0x2c70, 0x2d81, 0xd15x

0x0040 is already there (as a desktop one). The others being server
ones again would belong into a separate patch, as v2 (see the title)
refers to Ivybridge afaict.

Furthermore simply listing the IDs isn't necessarily enough. I'll see
whether I can hunt down the data sheets for the above, but
considering that we have three different workarounds, of which
I can reasonably exclude only the desktop variant, I wouldn't know
where to put those without having the spec or you telling me.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/6] x86: detect and initialize Intel CAT feature

2015-03-18 Thread Chao Peng
On Tue, Mar 17, 2015 at 09:00:27AM -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Mar 17, 2015 at 04:11:33PM +0800, Chao Peng wrote:
> > On Fri, Mar 13, 2015 at 09:40:13AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Mar 13, 2015 at 06:13:20PM +0800, Chao Peng wrote:
> > > > Detect Intel Cache Allocation Technology(CAT) feature and store the
> > > > cpuid information for later use. Currently only L3 cache allocation is
> > > > supported. The L3 CAT features may vary among sockets so per-socket
> > > > feature information is stored. The initialization can happen either at
> > > > boot time or when CPU(s) is hot plugged after booting.
> > > > 
> > > > Signed-off-by: Chao Peng 
> > > > ---
> > > >  docs/misc/xen-command-line.markdown |  15 +++-
> > > >  xen/arch/x86/psr.c  | 151 
> > > > +---
> > > >  xen/include/asm-x86/cpufeature.h|   1 +
> > > >  3 files changed, 155 insertions(+), 12 deletions(-)
> > > > 
> > > > +cat_cpu_init(smp_processor_id());
> > > 
> > > Do 'if (!cat_cpu_init(..)).`'
> > > 
> > > as the CPU might not support this.
> > > 
> > > At which point you should also free the cat_socket_info and
> > > not register the cpu notifier.
> > 
> > Even the booting CPU does not support this, other CPUs may still support
> > this. Generally the feature is a per-socket feature. So break here is
> > not the intention.
> 
> Oooh, and you did mention that in the git commit description and I dived
> right in the code - without looking there - sorry for that noise!
> 
> Thought I am curious - what if all the sockets don't support and the
> user does try enable it on the command line (user error)? Shouldn't we
> then figure out that all of the CPUs don't support and xfree
> cat_socket_info and not register the CPU notifier?

Ideally, we should. Tricky thing is to support cpu hot plug, which
happens not at but after the booting time.

Chao
> > 
> > Except this, all other comments will be addressed by the next version.
> 
> thank you!
> > Thanks for your time.
> > 
> > Chao
> > > 
> > > > +register_cpu_notifier(&cpu_nfb);
> > > > +}
> > > > +
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code

2015-03-18 Thread Dario Faggioli
On Wed, 2015-03-18 at 07:56 +, Jan Beulich wrote:
> >>> On 17.03.15 at 19:18,  wrote:

> > --- a/xen/common/sched_credit2.c
> > +++ b/xen/common/sched_credit2.c
> > @@ -1936,12 +1936,8 @@ static void init_pcpu(const struct scheduler *ops, 
> > int cpu)
> >  }
> >  
> >  /* Figure out which runqueue to put it in */
> > -rqi = 0;
> > -
> > -/* Figure out which runqueue to put it in */
> > -/* NB: cpu 0 doesn't get a STARTING callback, so we hard-code it to 
> > runqueue 0. */
> > -if ( cpu == 0 )
> > -rqi = 0;
> > +if ( system_state == SYS_STATE_boot )
> 
> ... I'd suggest using <= SYS_STATE_boot or < SYS_STATE_smp_boot.
> 
Ok.

> > @@ -1986,9 +1982,13 @@ static void init_pcpu(const struct scheduler *ops, 
> > int cpu)
> >  static void *
> >  csched2_alloc_pdata(const struct scheduler *ops, int cpu)
> >  {
> > -/* Check to see if the cpu is online yet */
> > -/* Note: cpu 0 doesn't get a STARTING callback */
> > -if ( cpu == 0 || cpu_to_socket(cpu) >= 0 )
> > +/*
> > + * Actual initialization is deferred to when the pCPU will be
> > + * online, via a STARTING callback. The only exception is
> > + * the boot cpu, which does not get such a notification, and
> > + * hence needs to be taken care of here.
> > + */
> > +if ( system_state == SYS_STATE_boot )
> >  init_pcpu(ops, cpu);
> 
> Same here, plus the new condition at the first glance isn't matching
> the old one, but perhaps that's intentional.
> 
It is intentional, and that is why we're changing it! :-) Let me try to
explain:

The idea, in both old and new code, is to call init_pcpu() either:
 a) on the boot cpu, during boot
 b) on non-boot cpu, *only* if they are online.

The issue is that, for assessing b), it checks whether
cpu_to_socket(cpu) returns >=0 and, if it does, it assumes the cpu is
online. That is not true, as cpu_data.phys_proc_id (which is what
cpu_to_socket() go reading) is 0 even before any onlining and topolog
identification has happened (it's a global).

Therefore, all the pcpus are initialized right away, and all are
assigned to runqueue 0, as returned by cpu_to_socket() at this stage,
which is clearly wrong.

In fact, this is the reason why, previous versions of this took the
approach of initializing things such as cpu_to_socket() returned -1
before topology identification.

In the new version, as you suggested, I'm using system_state to figure
out whether we are dealing with the boot cpu, and that's it. :-)

Hope this clarifies things... If yes, I'll make sure to put a similar
explanation in the changelog, when sending the patch officially.

Regards,
Dario


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] Add flag to start info regarding virtual mapped p2m list

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 06:50 +0100, Juergen Gross wrote:
> On 03/04/2015 12:41 PM, Ian Campbell wrote:
> > On Wed, 2015-03-04 at 12:22 +0100, Juergen Gross wrote:
> >> It would either be used like intended,
> >
> > Which is how? That is what is really missing here.
> >
> > So far this appears to be a bit which enables some as yet unspecified[0]
> > behaviour in one particular OS kernel with some as yet undiscussed
> > potential future impact on toolstack functionality.
> 
> Sorry for the late answer, but I managed to move this mail to my archive
> before reacting on it. :-(

No problem.

The description looks like the sort of thing which ought to go into the
commit log.

Is there not an ABI change somewhere else relating to the exposure of
the cr3+vaddr+size? If so why is it not in this patch?

Ideally whichever file which needs to change in xen/include/public to
expose that change should also come along with documentation for this
new ABI.

If that change has been deferred for some reason then I think it (and
why) should be mentioned in the commit message, you'll also want to
explain why adding the bit now but the ABI change later is safe, i.e.
what the transition plan is.

AIUI this change has broken memory hotplug and has also made it
difficult from an ABI PoV to reinstate that support. I think that needs
to be addressed (i.e. at the ABI design level, not necessary
implemented) before we add a bit exposing this feature.

> Guests implementing this new interface need to know, of course, whether
> the Xen tools are capable to use the new interface instead of the old
> p2m tree interface. Otherwise a guest using only the new interface with
> the virtual mapped linear p2m list on a machine with old Xen tools not
> supporting this interface could not be restored or migrated.

What is the mechanism for the converse, i.e. for the tools to detect if
a given guest is making use of this functionality?

Are tools going to be able to tell at migration time, or is it something
which needs to be detectable at domain build time (and therefore exposed
in the kernel ELF notes)?

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/3] tools/libxl/libxl_qmp.c: Make sure sun_path is NULL terminated in qmp_open

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 10:05 +, PRAMOD DEVENDRA wrote:

Please can you arrange in the future to chain your series into a single
thread. That means that each N/3 mail should have a References and/or
In-reply-to header pointing at either the 0/N mail or the (N-1)/N mail
in the series.

http://wiki.xen.org/wiki/Submitting_Xen_Patches#Sending_the_patches_to_the_list 
has some information on how to achieve this with git send-email.

Note that this also implies that you should include a 0/N mail which
ties the series together (i.e. explains the overall purpose of the
series). The wiki covers this too.

At the moment my mailbox contains 4 unthread mails from you, named 1/3,
2/3, 3/3 and 2/2 which is liable to get confusing if there are any
further revisions etc.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] tools/libxl/libxl_qmp.c: Close fds on error path in qmp_open

2015-03-18 Thread Wei Liu
On Wed, Mar 18, 2015 at 06:50:35AM +, PRAMOD DEVENDRA wrote:
> From: Pramod Devendra 
> 
> Signed-off-by: Pramod Devendra 
> CC: Ian Jackson 
> CC: Stefano Stabellini 
> CC: Ian Campbell 
> CC: Wei Liu 

Thanks.

The patch content looks good.

However the patch title and commit message are not descriptive enough.

Can you change the title to:

  libxl/libxl_qmp.c: fix qmp_open

And then in commit message:

  This patch does following things:
  
  1. Make sure sun_path does not overflow.
  2. Close qmp_fd on error.

For more general information on patch submission, please refer to

  http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches

Wei.

> ---
>  tools/libxl/libxl_qmp.c |   17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
> index c7324e6..992c1ee 100644
> --- a/tools/libxl/libxl_qmp.c
> +++ b/tools/libxl/libxl_qmp.c
> @@ -365,14 +365,24 @@ static int qmp_open(libxl__qmp_handler *qmp, const char 
> *qmp_socket_path,
>  return -1;
>  }
>  ret = libxl_fd_set_nonblock(qmp->ctx, qmp->qmp_fd, 1);
> -if (ret) return -1;
> +if (ret) {
> +close(qmp->qmp_fd);
> +return -1;
> +}
>  ret = libxl_fd_set_cloexec(qmp->ctx, qmp->qmp_fd, 1);
> -if (ret) return -1;
> +if (ret) {
> +close(qmp->qmp_fd);
> +return -1;
> +}
>  
> +if(sizeof (qmp->addr.sun_path) <= strlen(qmp_socket_path)) {
> +close(qmp->qmp_fd);
> +return -1;
> +}
>  memset(&qmp->addr, 0, sizeof (qmp->addr));
>  qmp->addr.sun_family = AF_UNIX;
>  strncpy(qmp->addr.sun_path, qmp_socket_path,
> -sizeof (qmp->addr.sun_path));
> +sizeof (qmp->addr.sun_path)-1);
>  
>  do {
>  ret = connect(qmp->qmp_fd, (struct sockaddr *) &qmp->addr,
> @@ -384,6 +394,7 @@ static int qmp_open(libxl__qmp_handler *qmp, const char 
> *qmp_socket_path,
>   * ECONNREFUSED : Leftover socket hasn't been removed yet */
>  continue;
>  }
> +close(qmp->qmp_fd);
>  return -1;
>  } while ((++i / 5 <= timeout) && (usleep(200 * 1000) <= 0));
>  
> -- 
> 1.7.10.4

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Any work on sharing of large multi-page segments?

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 17:45 -0600, Andrew Warkentin wrote:
> On 3/17/15, Jan Beulich  wrote:
> > And how would that be significantly different from the batching
> > that's already built into the grant table hypercall?
> >
> I guess it does do more or less what I want already. I was looking
> more at the inner mapping/unmapping functions, rather than the
> wrappers around them that implement the actual hypercalls.
> 
> What would be a useful addition would be support for granting 2M
> pages. That would eliminate any problem with running out of grant
> table slots.

It's related but not quite the same but on ARM we are probably
eventually going to want to look into support for 64K grant mappings at
some point.

On ARM there are several options for the basic leaf page (called
"granules") size, 4K, 16K and 64K and it seems that at least some
OS/distro vendors are going to ship their kernels with 64K pages by
default, so we need to be able to support such guests.

Short term we can deal with this in the front/backend by using multiple
grants per guest page (i.e. 16 grants per 64K page) but that is quite
wasteful of ring space. Longer term I think it would be worth
investigating adding grant table extensions to allow for >4K granule
sizes.

>From there it's not too much of a stretch to imagine supporting
superpages of whichever granule size (i.e. 2M in the 4K case might be
sane, 32M in the 64K case perhaps less useful). 

Even without that Linux will use compound pages e.g. for network frags
(up to order 3 == 32k, I think), so supporting higher order grants might
even be useful there even on x86.

Internally we'd still need to deal with all this in Xen as 4K mappings
in the second stage pages, but that's doable I think.

It's not clear when this would float to the top of priority list for Xen
on ARM though, so I wouldn't wait for us ;-)

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v2][PATCH 2/2] libxl: introduce gfx_passthru_kind

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-18 at 15:32 +0800, Chen, Tiejun wrote:
> > I think the _libxl_ message needs to be just "Unable to detect graphics
> > passthru kind". i.e. it can't/shouldn't reference anything to do with xl
> > config options etc (which would make no sense if libvirt was being
> > used).
> >
> > That's not very user friendly though, so you may want to consider adding
> > a new specific error code for this case and returning it here, such that
> > xl or libvirt can then give a more comprehensible error message.
> 
> But,
>  args = libxl__build_device_model_args(gc, "stubdom-dm", guest_domid,
>guest_config, d_state, NULL);
>   |
>   + return libxl__build_device_model_args_new(gc, dm,guest_domid, 
> guest_config, state, dm_state_fd);
>   |
>   +  Currently we always return "NULL" inside.
> 
>  if (!args) {
>  ret = ERROR_FAIL;
>  goto out;
>  }
> 
> So I'm not very sure how to pass this new specific error code to xl/libvirt.

Bah, yes. I don't think it is reasonable to ask you to rework the error
handling here completely for this. Lets leave this as a potential future
enhancement.

> Thanks but I guess I'd better to paste this whole patch to avoid I'm 
> still missing something :)
> 
> ---
>   tools/libxl/libxl.h  |  6 ++
>   tools/libxl/libxl_dm.c   | 25 ++---
>   tools/libxl/libxl_internal.h |  5 +
>   tools/libxl/libxl_types.idl  |  6 ++
>   tools/libxl/xl_cmdimpl.c | 14 --
>   5 files changed, 51 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 6bbc52d..62b3ae5 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -720,6 +720,12 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
> libxl_mac *src);
>   #define LIBXL_HAVE_PSR_MBM 1
>   #endif
> 
> +/*
> + * libxl_domain_build_info has the u.hvm.gfx_passthru_kind field and
> + * the libxl_gfx_passthru_kind enumeration is defined.
> +*/
> +#define LIBXL_HAVE_GFX_PASSTHRU_KIND
> +
>   typedef char **libxl_string_list;
>   void libxl_string_list_dispose(libxl_string_list *sl);
>   int libxl_string_list_length(const libxl_string_list *sl);
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 8599a6a..045d48a 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -710,9 +710,6 @@ static char ** 
> libxl__build_device_model_args_new(libxl__gc *gc,
>   flexarray_append(dm_args, "-net");
>   flexarray_append(dm_args, "none");
>   }
> -if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
> -flexarray_append(dm_args, "-gfx_passthru");
> -}
>   } else {
>   if (!sdl && !vnc) {
>   flexarray_append(dm_args, "-nographic");
> @@ -757,6 +754,28 @@ static char ** 
> libxl__build_device_model_args_new(libxl__gc *gc,
>   machinearg, max_ram_below_4g);
>   }
>   }
> +
> +if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
> +switch (b_info->u.hvm.gfx_passthru_kind) {
> +case _info->u.hvm.gfx_passthru_kind:
> +if (libxl__is_igd_vga_passthru(gc, guest_config)) {
> +machinearg = GCSPRINTF("%s,igd-passthru=on", 
> machinearg);
> +} else {
> +LOG(ERROR, "Unable to detect graphics passthru kind,"
> +" please set gfx_passthru_kind. See xl.cfg(5) 
> for more"
> +" information.\n");

Please change this to "Unable to detect graphics passthru kind" to avoid
talking xl-isms in libxl.

> +return NULL;
> +}
> +break;
> +case LIBXL_GFX_PASSTHRU_KIND_IGD:
> +machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
> +break;

This duplicates the code from above. I think this would be best done as:

static int libxl__detect_gfx_passthru_kind(libxl__gc *gc, guest_config)
{
if (b_info->u.hvm.gfx_passthru_kind != LIBXL_GFX_PASSTHRU_KIND_DEFAULT)
return 0;

if (libxl__is_igd_vga_passthru(gc, guest_config)) {
b_info->u.hvm.gfx_passthru_kind = LIBXL_GFX_PASSTHRU_KIND_IGD;
return 0;
}

LOG(ERROR, "Unable to detect graphics passthru kind");
return 1;
}

Then for the code in libxl__build_device_model_args_new:

 if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
 if (!libxl__detect_gfx_passthru_kind(gc, guest_config))
  return NULL
 switch (b_info->u.hvm.gfx_passthru_kind) {
 case LIBXL_GFX_PASSTHRU_KIND_IGD:
 machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
 break;
 default:
 LOG(ERROR, "unknown gfx_passthru_kind\n");
return NULL;
 }
}

Tha

Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests

2015-03-18 Thread David Vrabel
On 16/03/15 10:31, Juergen Gross wrote:
> On 03/16/2015 11:03 AM, Daniel Kiper wrote:
>> On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:
>>> On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:
 On 03/11/2015 10:42 AM, David Vrabel wrote:
> On 10/03/15 13:35, Boris Ostrovsky wrote:
>> On 03/10/2015 07:40 AM, David Vrabel wrote:
>>> On 09/03/15 14:10, David Vrabel wrote:
 Memory hotplug doesn't work with PV guests because:

 a) The p2m cannot be expanded to cover the new sections.
>>> Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
>>> linear virtual mapped sparse p2m list).
>>>
>>> This one would be non-trivial to fix.  We'd need a sparse set of
>>> vm_area's for the p2m or similar.
>>>
 b) add_memory() builds page tables for the new sections which
 means
the new pages must have valid p2m entries (or a BUG occurs).
>>> After some more testing this appears to be broken by:
>>>
>>> 25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions above
>>> the
>>> end of RAM as 1:1) included 3.16.
>>>
>>> This one can be trivially fixed by setting the new sections in
>>> the p2m
>>> to INVALID_P2M_ENTRY before calling add_memory().
>> Have you tried 3.17? As I said yesterday, it worked for me (with 4.4
>> Xen).
> No.  But there are three bugs that prevent it from working in 3.16+ so
> I'm really not sure how you had a working in a 3.17 PV guest.

 This is what I have:

 [build@build-mk2 linux-boris]$ ssh root@tst008 cat
 /mnt/lab/bootstrap-x86_64/test_small.xm
 extra="console=hvc0 debug earlyprintk=xen "
 kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
 ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
 memory=1024
 maxmem = 4096
 vcpus=1
 maxvcpus=3
 name="bootstrap-x86_64"
 on_crash="preserve"
 vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
 vnc=1
 vnclisten="0.0.0.0"
 disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
 [build@build-mk2 linux-boris]$ ssh root@tst008 xl create
 /mnt/lab/bootstrap-x86_64/test_small.xm
 Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
 [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
 bootstrap-x86_64
 bootstrap-x86_64 2  1024 1
 -b   5.4
 [build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
 3.17.0upstream
 [build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep
 paravirtualized
 [0.00] Booting paravirtualized kernel on Xen
 [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
 /proc/meminfo
 MemTotal: 968036 kB
 [build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set
 bootstrap-x86_64 2048
 [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
 bootstrap-x86_64
 bootstrap-x86_64 2  2048 1
 -b   5.7
 [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
 /proc/meminfo
 MemTotal:2016612 kB
 [build@build-mk2 linux-boris]$


>
> Regardless, it definitely doesn't work now because of the linear p2m
> change.  What do you want to do about this?

 Since backing out p2m changes is not an option I guess your patch is
 the
 only short-term alternative.

 But this still looks like a regression so perhaps Juergen can take a
 look to see how it can be fixed.
>>>
>>> Hmm, the p2m list is allocated for the maximum memory size of the domain
>>> which is obtained from the hypervisor. In case of Dom0 it is read via
>>> XENMEM_maximum_reservation, for a domU it is based on the E820 memory
>>> map read via XENMEM_memory_map.
>>>
>>> I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
>>> and 4GB of maxmem. The E820 map looked like this:
>>>
>>> [0.00] Xen: [mem 0x-0x0009] usable
>>> [0.00] Xen: [mem 0x000a-0x000f] reserved
>>> [0.00] Xen: [mem 0x0010-0x] usable
>>>
>>> So the complete 4GB were included, like they should. The resulting p2m
>>> list is allocated in the needed size:
>>>
>>> [0.00] p2m virtual area at c900, size is 80
>>>
>>> So what is your problem here? Can you post the E820 map and the p2m map
>>> info for your failing domain, please?
>>
>> If you use memory hotplug then maxmem is not a limit from guest kernel
>> point of view (host still must allow that operation but it is another
>> not related issue). The problem is that p2m must be dynamically
>> expendable
>> to support it. Earlier implementation supported that thing and memory
>> hotplug worked without any issue.
> 
> Okay, now I get it.
> 
> The problem with the earlier p2m implementation was that it was
> expendable to support o

Re: [Xen-devel] [PATCH V2] Add flag to start info regarding virtual mapped p2m list

2015-03-18 Thread Juergen Gross

On 03/18/2015 10:59 AM, Ian Campbell wrote:

On Tue, 2015-03-17 at 06:50 +0100, Juergen Gross wrote:

On 03/04/2015 12:41 PM, Ian Campbell wrote:

On Wed, 2015-03-04 at 12:22 +0100, Juergen Gross wrote:

It would either be used like intended,


Which is how? That is what is really missing here.

So far this appears to be a bit which enables some as yet unspecified[0]
behaviour in one particular OS kernel with some as yet undiscussed
potential future impact on toolstack functionality.


Sorry for the late answer, but I managed to move this mail to my archive
before reacting on it. :-(


No problem.

The description looks like the sort of thing which ought to go into the
commit log.


Okay, I can change it.


Is there not an ABI change somewhere else relating to the exposure of
the cr3+vaddr+size? If so why is it not in this patch?


Commit 50bd1f0825339dfacde471df7664729216fc46e3


Ideally whichever file which needs to change in xen/include/public to
expose that change should also come along with documentation for this
new ABI.


Included in above commit in form of comments in the modified file.


If that change has been deferred for some reason then I think it (and
why) should be mentioned in the commit message, you'll also want to
explain why adding the bit now but the ABI change later is safe, i.e.
what the transition plan is.

AIUI this change has broken memory hotplug and has also made it
difficult from an ABI PoV to reinstate that support. I think that needs
to be addressed (i.e. at the ABI design level, not necessary
implemented) before we add a bit exposing this feature.


The interface change didn't brake anything. It was the implementation in
the Linux kernel.


Guests implementing this new interface need to know, of course, whether
the Xen tools are capable to use the new interface instead of the old
p2m tree interface. Otherwise a guest using only the new interface with
the virtual mapped linear p2m list on a machine with old Xen tools not
supporting this interface could not be restored or migrated.


What is the mechanism for the converse, i.e. for the tools to detect if
a given guest is making use of this functionality?


See above commit. It's all in the comments.


Are tools going to be able to tell at migration time, or is it something
which needs to be detectable at domain build time (and therefore exposed
in the kernel ELF notes)?


Migration time is fine.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/5] xen: print online pCPUs and free pCPUs when dumping

2015-03-18 Thread Jan Beulich
>>> On 17.03.15 at 16:33,  wrote:
> @@ -671,12 +678,17 @@ void dump_runq(unsigned char key)
>  sched_smt_power_savings? "enabled":"disabled");
>  printk("NOW=0x%08X%08X\n",  (u32)(now>>32), (u32)now);
>  
> +print_cpumap("Online Cpus", &cpu_online_map);
> +if ( cpumask_weight(&cpupool_free_cpus) )

This is certainly more expensive than necessary - mind if I change
this to !cpumask_empty() on commit?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] Add flag to start info regarding virtual mapped p2m list

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-18 at 11:59 +0100, Juergen Gross wrote:

> > Is there not an ABI change somewhere else relating to the exposure of
> > the cr3+vaddr+size? If so why is it not in this patch?
> 
> Commit 50bd1f0825339dfacde471df7664729216fc46e3
> 
> > Ideally whichever file which needs to change in xen/include/public to
> > expose that change should also come along with documentation for this
> > new ABI.
> 
> Included in above commit in form of comments in the modified file.

Great, please mention in the commit log here that this is building on
that work (would have saved me having to ask).

> > If that change has been deferred for some reason then I think it (and
> > why) should be mentioned in the commit message, you'll also want to
> > explain why adding the bit now but the ABI change later is safe, i.e.
> > what the transition plan is.
> >
> > AIUI this change has broken memory hotplug and has also made it
> > difficult from an ABI PoV to reinstate that support. I think that needs
> > to be addressed (i.e. at the ABI design level, not necessary
> > implemented) before we add a bit exposing this feature.
> 
> The interface change didn't brake anything. It was the implementation in
> the Linux kernel.

AIUI the new interface has made it difficult to for OS kernels to
arrange to be able to grow their P2M. Whether that is an "OS kernel
issues" or an "interface issue" isn't really the point, the fact is that
for whatever reason it is now difficult to arrange.

Is there a plan for how this might be dealt with?

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 4/6] tools/libxl: Allow limiting amount copied by datacopier

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 13:29 +, Ross Lagerwall wrote:
> Currently, a datacopier will unconditionally read until EOF on its read fd.
> 
> For migration v2, libxl needs to read records of a specific length out of the
> migration stream, without reading any further data.
> 
> Introduce a parameter, bytes_to_read, which may be used to stop the datacopier
> ahead of reaching EOF. If bytes_to_read is set to -1, then the datacopier will
> read until EOF.
> 
> Signed-off-by: Ross Lagerwall 
> [Rewrite commit message]
> Signed-off-by: Andrew Cooper 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 5/6] tools/libxl: Extend datacopier to support reading into a buffer

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 13:29 +, Ross Lagerwall wrote:
> Currently a datacopier may source its data from an fd or local buffer, but its
> destination must be an fd.  For migration v2, libxl needs to read from the
> migration stream into a local buffer.
> 
> Implement a "read into local buffer" mode, invoked when readbuf is set and
> writefd is -1.  On success, the callback passes the number of bytes read.
> 
> Signed-off-by: Ross Lagerwall 
> [Rewrite commit message]
> Signed-off-by: Andrew Cooper 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Upstream QEMU based stubdom and rump kernel

2015-03-18 Thread Martin Lucina
(Adding some of the Mirage folks to Cc:)

wei.l...@citrix.com said:
> Hi all
> 
> I'm now working on upstream QEMU stubdom, and rump kernel seems to be a
> good fit for this purpose.
> 
> A bit background information. A stubdom is a service domain.  With QEMU
> stubdom we are able to run QEMU device emulation code in a separate
> domain so that bugs in QEMU don't affect Dom0 (the controlling domain).
> Xen currently has a QEMU stubdom, but it's based on our fork of ancient
> QEMU (plus some other libraries and mini-os). Eventually we would like
> to use upstream QEMU in stubdom.
> 
> I've now successfully built QEMU upstream with rump kernel. However to
> make it fully functional as a stubdom, there are some missing pieces to
> be added in.
> 
> 1. The ability to access QMP socket (a unix socket) from Dom0. That
>will be used to issue command to QEMU.
> 2. The ability to access files in Dom0. That will be used to write to /
>read from QEMU state file.

As I understand from Stefano's and Anthony's replies in this thread, both
of the above can be implemented using an AF_UNIX or AF_INET socket on the
QEMU end. Such an implementation would not require anything special done in
QEMU, just telling it which socket to use using existing mechanisms.

So, let's step back a bit: What we need is a trusted communication channel
from a Rump Kernel domU to dom0, using existing socket or socket-like[*]
APIs at both the domU and dom0 ends.

This fits in with a couple of things I hope to make time to work on in the
next couple of months:

 1. Introspection of Rump Kernel domUs for ops purposes, i.e. get some
basic "ps", "top", "vmstat"-like information about what the domU is
doing from the dom0.

 2. Connecting up multiple Rump Kernel domUs and/or Mirage domUs. The
general idea here is that you can have e.g. a Mirage domU running a
HTTP+TLS frontend, communicating with a Rump Kernel domU running PHP +
FastCGI.

The Mirage folks are already doing something similar in their 
Jitsu work, using a protocol called Conduit which runs over vchan.

Now, both of the above require exactly the same underlying mechanism.

Point 2. will further require implementing support in the Rump Kernel,
either for a shim which would proxy AF_UNIX / AF_INET transparently using
vchan, or possibly later implementing a separate socket family (AF_VCHAN /
AF_HYPER?). Once that is done you should be able to just drop it in to
QEMU on Rump.

[*] Aside: What I mean by socket-like is that the implementation does not
need to be in the dom0 kernel, it can just be a user-space library. For
example, see the nanomsg or ZeroMQ APIs, which I have worked on extensively
in the past.

> 3. The building process requires mini-os headers. That will be used
>to build libxc (the controlling library).

As Antti already suggested, if you can use POSIX interfaces rather than
mini-os ones in QEMU, then that would be a better approach.

> One of my lessons learned from the existing stubdom stuffs is that I
> should work with upstream and produce maintainable code. So before I do
> anything for real I'd better consult the community. My gut feeling is
> that the first two requirements are not really Xen specific. Let me know
> what you guys plan and think.

Thanks for getting in touch. I think this is an important discussion!

Martin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Upstream QEMU based stubdom and rump kernel

2015-03-18 Thread Martin Lucina
po...@rumpkernel.org said:
> etfs isn't a file system, e.g. it doesn't allow listing files or
> removing them, but it does give you complete control of what happens
> when data is read or written for /some/path.  But based on the other
> posts, sounds like it might be enough for what you need.
> 
> See:
> http://man.netbsd.org/cgi-bin/man-cgi?rump_etfs++NetBSD-current

They'd still need to implement the rumphyper/Mini-OS backend to get etfs to
talk over vchan to the dom0, right?

> That's not really a problem, though I do want to limit the amount of
> interface we claim to support with rump kernels.  For example, ISTR
> you mentioned on irc you'd like to use minios wait.h.  It would be
> better to use pthread synchronization instead of minios
> synchronization.  That way, if we do have a need to change the
> underlying threading in the future, you won't run into trouble.
> 
> So, we should just determine what is actually needed and expose
> those bits by default.

+1

Martin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Upstream QEMU based stubdom and rump kernel

2015-03-18 Thread Martin Lucina
ian.campb...@citrix.com said:
> On Tue, 2015-03-17 at 15:27 +, Wei Liu wrote:
> > This looks most interesting as it implies we can easily pipe a console
> > to it.
> 
> BTW, rather than rawe consoles we should probably consider using the
> channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt

What would be the advantage/rationale for using channels rather than vchan?
(See my other reply to this thread)

Martin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/5] xen: print online pCPUs and free pCPUs when dumping

2015-03-18 Thread Dario Faggioli
On Wed, 2015-03-18 at 11:03 +, Jan Beulich wrote:
> >>> On 17.03.15 at 16:33,  wrote:
> > @@ -671,12 +678,17 @@ void dump_runq(unsigned char key)
> >  sched_smt_power_savings? "enabled":"disabled");
> >  printk("NOW=0x%08X%08X\n",  (u32)(now>>32), (u32)now);
> >  
> > +print_cpumap("Online Cpus", &cpu_online_map);
> > +if ( cpumask_weight(&cpupool_free_cpus) )
> 
> This is certainly more expensive than necessary - mind if I change
> this to !cpumask_empty() on commit?
> 
Go ahead... That is indeed what I meant, sorry! :-)

Dario


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Upstream QEMU based stubdom and rump kernel

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-18 at 12:24 +0100, Martin Lucina wrote:
> ian.campb...@citrix.com said:
> > On Tue, 2015-03-17 at 15:27 +, Wei Liu wrote:
> > > This looks most interesting as it implies we can easily pipe a console
> > > to it.
> > 
> > BTW, rather than rawe consoles we should probably consider using the
> > channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt
> 
> What would be the advantage/rationale for using channels rather than vchan?
> (See my other reply to this thread)

Not much really.

About the only relevant difference between vchan and channels(/consoles)
is that there is an existing backend running on most xen systems
(xenconsoled) which can be leveraged in some cases for channels, whereas
vchan would need a specific backend writing for each case.

Apart from that implementation convenience vchan is probably going to be
better in terms of proper integration for the other end.

But iff the decision goes the way of consoles then using channels in
preference to raw consoles makes sense.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 6/6] tools/libxl: Adjust datacopiers POLLHUP handling when the fd is also readable

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 13:29 +, Ross Lagerwall wrote:
> From: Andrew Cooper 
> 
> POLLHUP|POLLIN is a valid revent to receive when there is readable data in a
> pipe, but the writable fd has been closed.  This occurs in migration v2 when
> the legacy conversion process (which transforms the data inline) completes and
> exits successfully.
> 
> In the case that there is data to read, suppress the POLLHUP.  POSIX states
> that the hangup state is latched[1], which means it will reoccur on subsequent
> poll() calls.  The datacopier is thus provided the opportunity to read until
> EOF, if possible.
> 
> A POLLHUP on its own is treated exactly as before, indicating a different
> error with the fd.
> 
> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 21/21] xl: vNUMA support

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 09:52 +, Wei Liu wrote:
> This patch includes configuration options parser and documentation.
> 
> Please find the hunk to xl.cfg.pod.5 for more information.
> 
> Signed-off-by: Wei Liu 

Acked-by: Ian Campbell 

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 18/21] libxlu: record location when parsing values

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 09:52 +, Wei Liu wrote:
> Originally only setting has line number recorded. Since we're moving to
> more sophisticated API, record the location for individual value. It is
> useful for error reporting.
> 
> Signed-off-by: Wei Liu 
Acked-by: Ian Campbell 

> Cc: Ian Jackson 
> ---
> Changes in v8:
> 1. Define YYLTYPE in libxl_internal.h and get rid of my hack.

AIUI this was Ian's only concern last time and AFAICT you have addressed
it in the way he intended, so I am going to apply.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] Build: Fix stubdom vtpm build failure

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 10:20 +0100, Olaf Hering wrote:
> On Mon, Mar 16, Quan Xu wrote:
> 
> > Typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported compilers
> > do not cope with current staging branch.
> 
> This version finally compiles. Thanks!

Thanks. I translated this into a Tested-by and acked it myself,
committed.

I made the subject "stubdom: fix vtpm build failure due to duplicated
typedefs.", I hope that's ok.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 0/6] tools/libxl: Improvements to libxl__datacopier

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 13:27 +, Ross Lagerwall wrote:
> To support migration v2, we need far more flexibility out of the datacopier.

Applied, thanks.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] Build: Fix stubdom vtpm build failure

2015-03-18 Thread Xu, Quan
Thanks, ok.

Quan

> -Original Message-
> From: Ian Campbell [mailto:ian.campb...@citrix.com]
> Sent: Wednesday, March 18, 2015 8:02 PM
> To: Olaf Hering
> Cc: Xu, Quan; xen-devel@lists.xen.org; dgde...@tycho.nsa.gov;
> andrew.coop...@citrix.com
> Subject: Re: [PATCH v2] Build: Fix stubdom vtpm build failure
> 
> On Tue, 2015-03-17 at 10:20 +0100, Olaf Hering wrote:
> > On Mon, Mar 16, Quan Xu wrote:
> >
> > > Typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported
> > > compilers do not cope with current staging branch.
> >
> > This version finally compiles. Thanks!
> 
> Thanks. I translated this into a Tested-by and acked it myself, committed.
> 
> I made the subject "stubdom: fix vtpm build failure due to duplicated 
> typedefs.",
> I hope that's ok.
> 
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] tools/libxl/libxl_qmp.c: Close fds on error path in qmp_open

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-18 at 10:15 +, Wei Liu wrote:
> On Wed, Mar 18, 2015 at 06:50:35AM +, PRAMOD DEVENDRA wrote:
> > From: Pramod Devendra 
> > 
> > Signed-off-by: Pramod Devendra 
> > CC: Ian Jackson 
> > CC: Stefano Stabellini 
> > CC: Ian Campbell 
> > CC: Wei Liu 
> 
> Thanks.
> 
> The patch content looks good.

Actually I think it should use the goto out error handling style, rather
than closing the fd on every single path.

i.e. the tail should be:

 out:
  if (ret == -1 && qmp->qmp_fd > -1) close(qmp->qmp_fd);

  return ret;

and ret should be initialised to -1 at the top, and each return should
be a goto.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/3] libxl: In domain death search, start search at first domid we want

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
> From: Ian Jackson 
> 
> From: Ian Jackson 
> 
> When domain_death_xswatch_callback needed a further call to
> xc_domain_getinfolist it would restart it with the last domain it
> found rather than the first one it wants.
> 
> If it only wants one it will also only ask for one domain.  The result
> would then be that it gets the previous domain again (ie, the previous
> one to the one it wants), which still doesn't reveal the answer to the
> question, and it would therefore loop again.
> 
> It's completely unclear to me why I thought it was a good idea to
> start the xc_domain_getinfolist with the last domain previously found
> rather than the first one left un-confirmed.  The code has been that
> way since it was introduced.

Is it because the xc_domain_getinfolist will fetch at most:
int nentries = LIBXL_TAILQ_NEXT(evg, entry) ? 200 : 1;
entries?

After your change then if the domid we are looking for is the 201st
domain then won't we just keep going round looking at the first 200
(undying) domains?

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/3] libxl: Domain destroy: unlock userdata earlier

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 17:34 +, Wei Liu wrote:
> On Tue, Mar 17, 2015 at 09:30:58AM -0600, Jim Fehlig wrote:
> > From: Ian Jackson 
> > 
> > Unlock the userdata before we actually call xc_domain_destroy.  This
> > leaves open the possibility that other libxl callers will see the
> > half-destroyed domain (with no devices, paused), but this is fine.
> > 
> > Signed-off-by: Ian Jackson 
> > CC: Wei Liu 
> > Reviewed-by: Jim Fehlig 
> > Tested-by: Jim Fehlig 
> 
> Acked-by: Wei Liu 

Acked-by: Ian Campbell 

I'm not sure if this is safe/sensible to apply without the preceding
patch which I had a comment on.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] libxl: Domain destroy: fork

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 09:30 -0600, Jim Fehlig wrote:
> From: Ian Jackson 
> 
> Call xc_domain_destroy in a subprocess.  That allows us to do so
> asynchronously, rather than blocking the whole process calling libxl.
> 
> The changes in detail:
> 
>  * Provide an libxl__ev_child in libxl__domain_destroy_state, and
>initialise it in libxl__domain_destroy.  There is no possibility
>to `clean up' a libxl__ev_child, but there need to clean it up, as
   ^is no ?

>the control flow ensures that we only continue after the child has
>exited.
> 
>  * Call libxl__ev_child_fork at the right point and put the call to
>xc_domain_destroy and associated logging in the child.  (The child
>opens a new xenctrl handle because we mustn't use the parent's.)

Do I read it correctly that this is arranging to use the same logger as
the parent was, IOW the log messages will still go somewhere the
application considers useful?

> 
>  * Consequently, the success return path from domain_destroy_domid_cb
>no longer calls dis->callback.  Instead it simply returns.
> 
>  * We plumb the errorno value through the child's exit status, if it
>fits.  This means we normally do the logging only in the parent.

(This makes my question above moot I think).

>  * Incidentally, we fix the bug that we were treating the return value
>from xc_domain_destroy as an errno value when in fact it is a
>return value from do_domctl (in this case, 0 or -1 setting errno).
> 
> Signed-off-by: Ian Jackson 
> Reviewed-by: Jim Fehlig 
> Tested-by: Jim Fehlig 
> ---
>  tools/libxl/libxl.c  | 57 
> 
>  tools/libxl/libxl_internal.h |  1 +
>  2 files changed, 53 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index b6541d4..b43db1a 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -1481,6 +1481,10 @@ static void domain_destroy_callback(libxl__egc *egc,
>  static void destroy_finish_check(libxl__egc *egc,
>   libxl__domain_destroy_state *dds);
>  
> +static void domain_destroy_domid_cb(libxl__egc *egc,
> +libxl__ev_child *destroyer,
> +pid_t pid, int status);

This seems misplaced, doesn't it want to go before libxl__destroy_domid
just after the prototype of devices_destroy_cb, which is where it lives
in the sequence?

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] libxl: Domain destroy: fork

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 18:04 +, Wei Liu wrote:
> > +} else {
> > +LOGE(ERROR,
> > + "xc_domain_destroy failed for %d (with difficult errno value %d)",
> 
> Indentation is wrong.

It is preferable to pull back the indentation of a string literal rather
than wrap it in order to preserve grep-ability.

How far back to pull it is a matter of preference I guess.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] flask/policy: fix static device labeling examples

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-11 at 10:59 -0400, Daniel De Graaf wrote:
> The definitions of static device labels must be placed at the end of the
> policy.conf before passing it to checkpolicy; the existing examples
> (which are commented out) are in the wrong location.  Create a new file
> for device contexts which will place them in the proper location.
> 
> This also removes some directions about using the xen policy type in
> checkpolicy which is no longer needed.
> 
> Reported-by: Julien Grall 
> Signed-off-by: Daniel De Graaf 

Acked + applied. Thanks.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 00/21] Virtual NUMA for PV and HVM

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 09:52 +, Wei Liu wrote:
> Hi all
> 
> This is version 8 of this series rebased on top of master.

Applied, thanks. (except patch #1 which was already in).

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/3] tools/libxl/libxl_cpuid.c: Fix leak of resstr on error path

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 15:44 +, Wei Liu wrote:
> On Mon, Mar 16, 2015 at 10:06:17AM +, PRAMOD DEVENDRA wrote:
> > From: Pramod Devendra 
> > 
> > Signed-off-by: Pramod Devendra 
> > CC: Ian Jackson 
> > CC: Stefano Stabellini 
> > CC: Ian Campbell 
> > CC: Wei Liu 
> 
> Acked-by: Wei Liu 

Applied.

> >  resstr = entry->policy[flag->reg - 1];
> > -if (resstr == NULL) {
> > -resstr = strdup("");
> > -}
> 
> Minor nit. I would prefer "resstr = " be grouped with the code you
> moved. No need to resend though.

Do you mean the restr = entry->... bit?



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] tools/libxc/xc_linux_osdep.c: Don't leak mmap() mapping on map_foreign_bulk() error path

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 15:45 +, Wei Liu wrote:
> On Mon, Mar 16, 2015 at 10:06:50AM +, PRAMOD DEVENDRA wrote:
> > From: Pramod Devendra 
> > 
> > Signed-off-by: Pramod Devendra 
> > CC: Ian Jackson 
> > CC: Stefano Stabellini 
> > CC: Ian Campbell 
> > CC: Wei Liu 
> 
> Acked-by: Wei Liu 

Applied, thanks.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxl: avoid comparing an unsigned int to -1

2015-03-18 Thread Ian Campbell
On Tue, 2015-03-17 at 15:55 +, Wei Liu wrote:
> On Mon, Mar 16, 2015 at 10:12:34AM +, Koushik Chakravarty wrote:
> > Signed-off-by: Koushik Chakravarty 
> > CC: Ian Jackson 
> > CC: Stefano Stabellini 
> > CC: Ian Campbell 
> > CC: Wei Liu 
> 
> Acked-by: Wei Liu 

Applied, thanks.

> Ian J, this one should be backported to 4.5.




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] tools/libxl: close the logfile_w and null file descriptors in libxl__spawn_qdisk_backend() error path

2015-03-18 Thread Koushik Chakravarty
Signed-off-by: Koushik Chakravarty 
CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index cb006df..2678be2 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1508,7 +1508,7 @@ void libxl__spawn_qdisk_backend(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 flexarray_t *dm_args;
 char **args;
 const char *dm;
-int logfile_w, null, rc;
+int logfile_w, null = -1, rc;
 uint32_t domid = dmss->guest_domid;
 
 /* Always use qemu-xen as device model */
@@ -1534,6 +1534,10 @@ void libxl__spawn_qdisk_backend(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 goto error;
 }
 null = open("/dev/null", O_RDONLY);
+if (null < 0) {
+   rc = ERROR_FAIL;
+   goto error;
+}
 
 dmss->guest_config = NULL;
 /*
@@ -1568,6 +1572,8 @@ void libxl__spawn_qdisk_backend(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 
 error:
 assert(rc);
+if (logfile_w >= 0) close(logfile_w);
+if (null >= 0) close(null);
 dmss->callback(egc, dmss, rc);
 return;
 }
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 0/3] Xen/FLASK policy updates for device contexts

2015-03-18 Thread Stephen Smalley
On 03/17/2015 04:43 PM, Daniel De Graaf wrote:
> In order to support assigning security lables to ARM device tree nodes
> in Xen's XSM policy, a new ocontext type is needed in the security
> policy.
> 
> In addition to adding the new ocontext, the existing I/O memory range
> ocontext is expanded to 64 bits in order to support hardware with more
> than 44 bits of physical address space (32-bit count of 4K pages).
> 
> Changes from v2:
>  - Clean up printf format strings for 32-bit builds
> 
> Changes from v1:
>  - Use policy version 30 instead of forking the version numbers for Xen;
>this removes the need for v1's patch 3.
>  - Report an error when attempting to use an I/O memory range that
>requires a 64-bit representation with an old policy output version
>that cannot support this
>  - Fix a few incorrect references to PCIDEVICECON
>  - Reorder patches to clarify the allowed characterset of device tree
>paths
> 
> [PATCH 1/3] checkpolicy: Expand allowed character set in paths
> [PATCH 2/3] libsepol, checkpolicy: widen Xen IOMEM ocontext entries
> [PATCH 3/3] libsepol, checkpolicy: add device tree ocontext nodes to

Thanks, applied all three.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] (v2) VT-d Posted-interrupt (PI) design for XEN

2015-03-18 Thread Wu, Feng
VT-d Posted-interrupt (PI) design for XEN

Background
==
With the development of virtualization, there are more and more device
assignment requirements. However, today when a VM is running with
assigned devices (such as, NIC), external interrupt handling for the assigned
devices always needs VMM intervention.

VT-d Posted-interrupt is a more enhanced method to handle interrupts
in the virtualization environment. Interrupt posting is the process by
which an interrupt request is recorded in a memory-resident
posted-interrupt-descriptor structure by the root-complex, followed by
an optional notification event issued to the CPU complex.

With VT-d Posted-interrupt we can get the following advantages:
- Direct delivery of external interrupts to running vCPUs without VMM
intervention
- Decrease the interrupt migration complexity. On vCPU migration, software
can atomically co-migrate all interrupts targeting the migrating vCPU. For
virtual machines with assigned devices, migrating a vCPU across pCPUs
either incur the overhead of forwarding interrupts in software (e.g. via VMM
generated IPIS), or complexity to independently migrate each interrupt targeting
the vCPU to the new pCPU. However, after enabling VT-d PI, the destination vCPU
of an external interrupt from assigned devices is stored in the IRTE (i.e.
Posted-interrupt Descriptor Address), when vCPU is migrated to another pCPU,
we will set this new pCPU in the 'NDST' filed of Posted-interrupt descriptor, 
this
make the interrupt migration automatic.


Posted-interrupt Introduction

There are two components to the Posted-interrupt architecture:
Processor Support and Root-Complex Support

- Processor Support
Posted-interrupt processing is a feature by which a processor processes
the virtual interrupts by recording them as pending on the virtual-APIC
page.

Posted-interrupt processing is enabled by setting the "process posted
interrupts" VM-execution control. The processing is performed in response
to the arrival of an interrupt with the posted-interrupt notification vector.
In response to such an interrupt, the processor processes virtual interrupts
recorded in a data structure called a posted-interrupt descriptor.

More information about APICv and CPU-side Posted-interrupt, please refer
to Chapter 29, and Section 29.6 in the Intel SDM:
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

- Root-Complex Support
Interrupt posting is the process by which an interrupt request (from IOAPIC
or MSI/MSIx capable sources) is recorded in a memory-resident
posted-interrupt-descriptor structure by the root-complex, followed by
an optional notification event issued to the CPU complex. The interrupt
request arriving at the root-complex carry the identity of the interrupt
request source and a 'remapping-index'. The remapping-index is used to
look-up an entry from the memory-resident interrupt-remap-table. Unlike
with interrupt-remapping, the interrupt-remap-table-entry for a posted-
interrupt, specifies a virtual-vector and a pointer to the posted-interrupt
descriptor. The virtual-vector specifies the vector of the interrupt to be
recorded in the posted-interrupt descriptor. The posted-interrupt descriptor
hosts storage for the virtual-vectors and contains the attributes of the
notification event (interrupt) to be issued to the CPU complex to inform
CPU/software about pending interrupts recorded in the posted-interrupt
descriptor.

More information about VT-d PI, please refer to
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

Important Definitions
==
There are some changes to IRTE and posted-interrupt descriptor after
VT-d PI is introduced:
IRTE:
Posted-interrupt Descriptor Address: the address of the posted-interrupt 
descriptor
Virtual Vector: the guest vector of the interrupt
URG: indicates if the interrupt is urgent

Posted-interrupt descriptor:
The Posted Interrupt Descriptor hosts the following fields:
Posted Interrupt Request (PIR): Provide storage for posting (recording) 
interrupts (one bit
per vector, for up to 256 vectors).

Outstanding Notification (ON): Indicate if there is a notification event 
outstanding (not
processed by processor or software) for this Posted Interrupt Descriptor. When 
this field is 0,
hardware modifies it from 0 to 1 when generating a notification event, and the 
entity receiving
the notification event (processor or software) resets it as part of posted 
interrupt processing.

Suppress Notification (SN): Indicate if a notification event is to be 
suppressed (not
generated) for non-urgent interrupt requests (interrupts processed through an 
IRTE with
URG=0).

Notification Vector (NV): Specify the vector for notification event (interrupt).

Notification Destination (NDST): Specify the physical APIC-ID of the 
destination logical
processor for the notification event.

Re: [Xen-devel] Upstream QEMU based stubdom and rump kernel

2015-03-18 Thread Stefano Stabellini
On Wed, 18 Mar 2015, Ian Campbell wrote:
> On Wed, 2015-03-18 at 12:24 +0100, Martin Lucina wrote:
> > ian.campb...@citrix.com said:
> > > On Tue, 2015-03-17 at 15:27 +, Wei Liu wrote:
> > > > This looks most interesting as it implies we can easily pipe a console
> > > > to it.
> > > 
> > > BTW, rather than rawe consoles we should probably consider using the
> > > channel extension: http://xenbits.xen.org/docs/unstable/misc/channel.txt
> > 
> > What would be the advantage/rationale for using channels rather than vchan?
> > (See my other reply to this thread)
> 
> Not much really.
> 
> About the only relevant difference between vchan and channels(/consoles)
> is that there is an existing backend running on most xen systems
> (xenconsoled) which can be leveraged in some cases for channels, whereas
> vchan would need a specific backend writing for each case.
> 
> Apart from that implementation convenience vchan is probably going to be
> better in terms of proper integration for the other end.
> 
> But iff the decision goes the way of consoles then using channels in
> preference to raw consoles makes sense.

I think that for simplicity's sake and to limit dependencies on the
system, using consoles for low bandwidth channels, such as QMP, is
preferable.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/5] libxl: Add ERROR_NOTFOUND for libxl_domain_info when it cannot find the domain

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> And use that for all of its callers in the tree.

I wonder -- is it better to have a generic ERROR_NOTFOUND for anything
which is not found, or to have a specific ERROR_DOMAIN_NOTFOUND
(likewise for other things).

Any opinions? I'm inclined towards the latter, e.g. imagine
disk_remove(domid, disk) -- ERROR_NOTFOUND doesn't tell you if the disk
or the domain is missing. (this example may or may not be hypothetical,
I didn't chekc).

> @@ -454,6 +454,8 @@ int libxl__domain_rename(libxl__gc *gc, uint32_t domid,
>  
>  /* update /vm//name */
>  rc = libxl_domain_info(ctx, &info, domid);
> +if (rc == ERROR_NOTFOUND)
> +goto x_rc;
>  if (rc)
>  goto x_fail;

Might as well us x_rc for all failures, there seem little point in
squashing whatever libxl_domain_info returned into ERROR_FAIL for the
other cases either.

> @@ -5415,11 +5417,12 @@ static int libxl__set_vcpuonline_xenstore(libxl__gc 
> *gc, uint32_t domid,
>  libxl_dominfo info;
>  char *dompath;
>  xs_transaction_t t;
> -int i, rc = ERROR_FAIL;
> +int i, rc;
>  
>  libxl_dominfo_init(&info);
>  
> -if (libxl_domain_info(CTX, &info, domid) < 0) {
> +rc = libxl_domain_info(CTX, &info, domid);
> +if (rc < 0) {
>  LOGE(ERROR, "getting domain info list");
>  goto out;
>  }

I think rc needs setting to ERROR_FAIL after this to retain the same
behaviour for the subsequent failures of e.g. libxl__xs_get_dompath
which doesn't otherwise set rc and previous relied on the ERROR_FAIL
from above.

> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 5eec092..5164371 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -1088,7 +1088,9 @@ int libxl_console_get_tty(libxl_ctx *ctx, uint32_t 
> domid, int cons_num,
>   */
>  int libxl_primary_console_get_tty(libxl_ctx *ctx, uint32_t domid_vm, char 
> **path);
>  
> -/* May be called with info_r == NULL to check for domain's existance */
> +/* May be called with info_r == NULL to check for domain's existance.

Could you fix the spelling of "existence" while you touch this line
please ;-)

> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 47af340..69a91cc 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -63,6 +63,7 @@ libxl_error = Enumeration("error", [
>  (-17, "DEVICE_EXISTS"),
>  (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
>  (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
> +(-20, "NOTFOUND"),
>  ], value_namespace = "")

Do we need a #define LIBXL_HAVE_ERROR_NOTFOUND to indicate this is
available, I think we probably do.

> @@ -7495,7 +7495,8 @@ int main_cpupoolnumasplit(int argc, char **argv)
>  goto out;
>  }
>  for (c = 0; c < 10; c++) {
> -if (libxl_domain_info(ctx, &info, 0)) {
> +ret = -libxl_domain_info(ctx, &info, 0);
> +if (ret) {

I think we don't need to do this bit. (The current inconsistent exit
codes form xl notwithstanding)



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/5] libxl: vcpuset: Return error values.

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> The function does not return any values at all. Convert the
> internal libxl ones (ERROR_FAIL, ..., etc) to positive values
> and for the other cases just return standard libxl values.

It's not clear why you want to do this, in particular returning
-ERROR_INVAL and inverting libxl error codes seems like a very strange
thing to be doing.

I think you should either use ERROR_INVAL (not inverted) and propagate
libxl rc's directly or convert them into something which suits xl, i.e.
0 and 1.

> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxl/xl_cmdimpl.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 2d7145f..454a895 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -5013,17 +5013,18 @@ int main_vcpupin(int argc, char **argv)
>  return rc;
>  }
>  
> -static void vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
> +static int vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
>  {
>  char *endptr;
>  unsigned int max_vcpus, i;
>  libxl_bitmap cpumap;
> +int rc;
>  
>  libxl_bitmap_init(&cpumap);
>  max_vcpus = strtoul(nr_vcpus, &endptr, 10);
>  if (nr_vcpus == endptr) {
>  fprintf(stderr, "Error: Invalid argument.\n");
> -return;
> +return -ERROR_INVAL;
>  }
>  
>  /*
> @@ -5036,22 +5037,25 @@ static void vcpuset(uint32_t domid, const char* 
> nr_vcpus, int check_host)
>  fprintf(stderr, "You are overcommmitting! You have %d physical " 
> \
>  " CPUs and want %d vCPUs! Aborting, use --ignore-host to 
> " \
>  " continue\n", host_cpu, max_vcpus);
> -return;
> +return -ERROR_INVAL;
>  }
>  /* NB: This also limits how many are set in the bitmap */
>  max_vcpus = (max_vcpus > host_cpu ? host_cpu : max_vcpus);
>  }
> -if (libxl_cpu_bitmap_alloc(ctx, &cpumap, max_vcpus)) {
> -fprintf(stderr, "libxl_cpu_bitmap_alloc failed\n");
> -return;
> +rc = libxl_cpu_bitmap_alloc(ctx, &cpumap, max_vcpus);
> +if (rc) {
> +fprintf(stderr, "libxl_cpu_bitmap_alloc failed, rc: %d\n", rc);
> +return -rc;
>  }
>  for (i = 0; i < max_vcpus; i++)
>  libxl_bitmap_set(&cpumap, i);
>  
> -if (libxl_set_vcpuonline(ctx, domid, &cpumap) < 0)
> -fprintf(stderr, "libxl_set_vcpuonline failed domid=%d 
> max_vcpus=%d\n", domid, max_vcpus);
> +rc = libxl_set_vcpuonline(ctx, domid, &cpumap);
> +if (rc)
> +fprintf(stderr, "libxl_set_vcpuonline failed domid=%d max_vcpus=%d, 
> rc: %d\n", domid, max_vcpus, rc);
>  
>  libxl_bitmap_dispose(&cpumap);
> +return rc ? -rc : 0;
>  }
>  
>  int main_vcpuset(int argc, char **argv)
> @@ -5070,8 +5074,7 @@ int main_vcpuset(int argc, char **argv)
>  break;
>  }
>  
> -vcpuset(find_domain(argv[optind]), argv[optind + 1], check_host);
> -return 0;
> +return vcpuset(find_domain(argv[optind]), argv[optind + 1], check_host);
>  }
>  
>  static void output_xeninfo(void)



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/5] libxl: vcpuset: Return error values.

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-18 at 13:06 +, Ian Campbell wrote:
> On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> > The function does not return any values at all. Convert the
> > internal libxl ones (ERROR_FAIL, ..., etc) to positive values
> > and for the other cases just return standard libxl values.
> 
> It's not clear why you want to do this, in particular returning
> -ERROR_INVAL and inverting libxl error codes seems like a very strange
> thing to be doing.

BTW I know the xl error handling is horribly confused, and there are
even a small number of instances of -ERROR_* already, but I think those
are wrong and we shouldn't introduce more.

> 
> I think you should either use ERROR_INVAL (not inverted) and propagate
> libxl rc's directly or convert them into something which suits xl, i.e.
> 0 and 1.
> 
> > Signed-off-by: Konrad Rzeszutek Wilk 
> > ---
> >  tools/libxl/xl_cmdimpl.c | 23 +--
> >  1 file changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> > index 2d7145f..454a895 100644
> > --- a/tools/libxl/xl_cmdimpl.c
> > +++ b/tools/libxl/xl_cmdimpl.c
> > @@ -5013,17 +5013,18 @@ int main_vcpupin(int argc, char **argv)
> >  return rc;
> >  }
> >  
> > -static void vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
> > +static int vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
> >  {
> >  char *endptr;
> >  unsigned int max_vcpus, i;
> >  libxl_bitmap cpumap;
> > +int rc;
> >  
> >  libxl_bitmap_init(&cpumap);
> >  max_vcpus = strtoul(nr_vcpus, &endptr, 10);
> >  if (nr_vcpus == endptr) {
> >  fprintf(stderr, "Error: Invalid argument.\n");
> > -return;
> > +return -ERROR_INVAL;
> >  }
> >  
> >  /*
> > @@ -5036,22 +5037,25 @@ static void vcpuset(uint32_t domid, const char* 
> > nr_vcpus, int check_host)
> >  fprintf(stderr, "You are overcommmitting! You have %d physical 
> > " \
> >  " CPUs and want %d vCPUs! Aborting, use --ignore-host 
> > to " \
> >  " continue\n", host_cpu, max_vcpus);
> > -return;
> > +return -ERROR_INVAL;
> >  }
> >  /* NB: This also limits how many are set in the bitmap */
> >  max_vcpus = (max_vcpus > host_cpu ? host_cpu : max_vcpus);
> >  }
> > -if (libxl_cpu_bitmap_alloc(ctx, &cpumap, max_vcpus)) {
> > -fprintf(stderr, "libxl_cpu_bitmap_alloc failed\n");
> > -return;
> > +rc = libxl_cpu_bitmap_alloc(ctx, &cpumap, max_vcpus);
> > +if (rc) {
> > +fprintf(stderr, "libxl_cpu_bitmap_alloc failed, rc: %d\n", rc);
> > +return -rc;
> >  }
> >  for (i = 0; i < max_vcpus; i++)
> >  libxl_bitmap_set(&cpumap, i);
> >  
> > -if (libxl_set_vcpuonline(ctx, domid, &cpumap) < 0)
> > -fprintf(stderr, "libxl_set_vcpuonline failed domid=%d 
> > max_vcpus=%d\n", domid, max_vcpus);
> > +rc = libxl_set_vcpuonline(ctx, domid, &cpumap);
> > +if (rc)
> > +fprintf(stderr, "libxl_set_vcpuonline failed domid=%d 
> > max_vcpus=%d, rc: %d\n", domid, max_vcpus, rc);
> >  
> >  libxl_bitmap_dispose(&cpumap);
> > +return rc ? -rc : 0;
> >  }
> >  
> >  int main_vcpuset(int argc, char **argv)
> > @@ -5070,8 +5074,7 @@ int main_vcpuset(int argc, char **argv)
> >  break;
> >  }
> >  
> > -vcpuset(find_domain(argv[optind]), argv[optind + 1], check_host);
> > -return 0;
> > +return vcpuset(find_domain(argv[optind]), argv[optind + 1], 
> > check_host);
> >  }
> >  
> >  static void output_xeninfo(void)
> 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] OpenStack - Libvirt+Xen CI overview : anyone interested in a walk/through or presentation on how it works

2015-03-18 Thread Lars Kurth
Hi,

I added all the people who raised their hands so far to the TO list. As far as 
I can tell we have people from the following timezones: GMT, GMT+1, MTZ, ETZ - 
if I got this wrong, please let me know your timezone. Depending on when we 
have the presentation, we will need to consider daylight savings offsets.

I will give it another few days for more people to raise their hands. We can 
then try and figure out a slot which fits everyone.

Best Regards
Lars

Begin forwarded message:

From: Bob Ball 
To: "xen-devel@lists.xen.org" 
Date: 10 March 2015 12:03:13 GMT
Cc: Anthony Perard 
Subject: [Xen-devel] OpenStack - Libvirt+Xen CI overview

For the last few weeks Anthony and I have been working on creating a CI 
environment to run against all OpenStack jobs.  We're now in a position where 
we can share the current status, overview of how it works and next steps.  We 
actively want to support involvement in this effort from others with an 
interest in libvirt+Xen's openstack integration.

The CI we have set up is follow the recommendations made by the OpenStack 
official infrastructure maintainers, and reproduces a notable portion of the 
official OpenStack CI environment to run these tests.  Namely this setup is 
using:
- Puppet to deploy the master node
- Zuul to watch for code changes uploaded to review.openstack.org
- Jenkins job builder to create Jenkins job definitions from a YAML file
- Nodepool to automatically create single-use virtual machines in the Rackspace 
public cloud 
- Devstack-gate to run Tempest tests in serial

More information on Zuul, JJB, Nodepool and devstack-gate is available through 
http://ci.openstack.org

The current status is that we have a zuul instance monitoring for jobs and 
adding them to the queue of jobs to be run at 
http://zuul.openstack.xenproject.org/

In the background Nodepool provisions virtual machines into a pool of nodes 
ready to be used.  All ready nodes are automatically added to Jenkins 
(https://jenkins.openstack.xenproject.org/), and then Zuul+Jenkins will trigger 
a particular job on a node when one is available.

Logs are then uploaded to Rackspace's Cloud Files with sample logs for a 
passing job at 
http://logs.openstack.xenproject.org/52/162352/3/silent/dsvm-tempest-xen/da3ff30/index.html

I'd like to organise a meeting to walk through the various components of the CI 
with those who are interested, so this is an initial call to find out who is 
interested in finding out more!

Thanks,

Bob

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Upstream QEMU based stubdom and rump kernel

2015-03-18 Thread Antti Kantee

On 18/03/15 11:22, Martin Lucina wrote:

po...@rumpkernel.org said:

etfs isn't a file system, e.g. it doesn't allow listing files or
removing them, but it does give you complete control of what happens
when data is read or written for /some/path.  But based on the other
posts, sounds like it might be enough for what you need.

See:
http://man.netbsd.org/cgi-bin/man-cgi?rump_etfs++NetBSD-current


They'd still need to implement the rumphyper/Mini-OS backend to get etfs to
talk over vchan to the dom0, right?


Strictly speaking, they'd have to implement the iov{read,write} 
hypercalls to do that.  But, no, etfs doesn't do magic.  IOW, they'd 
have to define what "host path" means.


It occurred to me that I wrote that manpage, umm, 5 years ago when rump 
kernels ran only in userspace and "host path" was a better defined term. 
 Pile that manpage on the neverending heap of documentation which broke 
while the code kept working.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 5/5] libxl: vcpu-set - allow to decrease vcpu count on overcommitted guests (v3)

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> We have a check to warn the user if they are overcommitting.
> But the check only checks the hosts CPU amount and does
> not take into account the case when the user is trying to fix
> the overcommit. That is - they want to limit the amount of
> online VCPUs.
> 
> This fix allows the user to offline vCPUs without any
> warnings when they are running an overcommitted guest.
> 
> Also fix the extra space in the message.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/5] libxl: vcpuset: Check max_vcpus argument against the maximum number of vCPUs the guest has set.

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> The maximum number of VCPUs the guest can have is determined during
> domain creation and is set by 'maxvcpus' parameter (in the guest
> config). Trying to set the amount of vCPUs above said value
> in vcpuset will result in an error - and we can catch it here
> (instead of later in the function) and print a nice warning to the user.

I suppose the "later in the function" is in the return value of
libxl_set_vcpuonline?

I think this check would be better off done inside the library, i.e. in
libxl_set_vcpuonline, with a (probably new) suitable error code. The
libxl can log and xl can spot the situation and log something specific
(if indeed it needs to once libxl does).

Ian.

> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> [v2: Check ERROR_NOTFOUND return value]
> ---
>  tools/libxl/xl_cmdimpl.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 454a895..ba0fd71 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -5018,6 +5018,7 @@ static int vcpuset(uint32_t domid, const char* 
> nr_vcpus, int check_host)
>  char *endptr;
>  unsigned int max_vcpus, i;
>  libxl_bitmap cpumap;
> +libxl_dominfo dominfo;
>  int rc;
>  
>  libxl_bitmap_init(&cpumap);
> @@ -5026,7 +5027,20 @@ static int vcpuset(uint32_t domid, const char* 
> nr_vcpus, int check_host)
>  fprintf(stderr, "Error: Invalid argument.\n");
>  return -ERROR_INVAL;
>  }
> -
> +rc = libxl_domain_info(ctx, &dominfo, domid);
> +if (rc == ERROR_NOTFOUND) {
> +fprintf(stderr, "Error: Domain %u does not exist.\n", domid);
> +return -rc;
> +}
> +if (rc) {
> +fprintf(stderr, "libxl_domain_info failed (code %d).\n", rc);
> +return -rc;
> +}
> +if (max_vcpus > dominfo.vcpu_max_id + 1) {
> +fprintf(stderr, "You have a max of %d vCPUs and you want %d 
> vCPUs!\n",
> +dominfo.vcpu_max_id + 1, max_vcpus);
> +return -ERROR_INVAL;
> +}
>  /*
>   * Maximum amount of vCPUS the guest is allowed to set is limited
>   * by the host's amount of pCPUs.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xenstore: use relative path for device-model node

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 10:38 +, Wei Liu wrote:
> Signed-off-by: Wei Liu 

I suppose that somewhere in the libxl series that
QEMU_TRADITIONAL_REVISION will need updating to incorporate this. It
might be useful to note where in that series this can safely be done..

> ---
>  xenstore.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/xenstore.c b/xenstore.c
> index b0d6f77..09319c7 100644
> --- a/xenstore.c
> +++ b/xenstore.c
> @@ -711,7 +711,7 @@ void xenstore_parse_domain_config(int hvm_domid)
>  
> 
>  /* Set a watch for log-dirty commands from the migration tools */
> -if (pasprintf(&buf, "/local/domain/0/device-model/%u/logdirty/cmd",
> +if (pasprintf(&buf, "device-model/%u/logdirty/cmd",
>domid) != -1) {
>  xs_watch(xsh, buf, "logdirty");
>  fprintf(logfile, "Watching %s\n", buf);
> @@ -719,7 +719,7 @@ void xenstore_parse_domain_config(int hvm_domid)
>  
>  /* Set a watch for suspend requests from the migration tools */
>  if (pasprintf(&buf, 
> -  "/local/domain/0/device-model/%u/command", domid) != -1) {
> +  "device-model/%u/command", domid) != -1) {
>  xs_watch(xsh, buf, "dm-command");
>  fprintf(logfile, "Watching %s\n", buf);
>  }
> @@ -777,7 +777,7 @@ int xenstore_parse_disable_pf_config ()
>  int disable_pf = 0;
>  unsigned int len;
>  
> -if (pasprintf(&buf, "/local/domain/0/device-model/%u/disable_pf",domid) 
> == -1)
> +if (pasprintf(&buf, "device-model/%u/disable_pf",domid) == -1)
>  goto out;
>  
>  params = xs_read(xsh, XBT_NULL, buf, &len);
> @@ -808,13 +808,13 @@ static void xenstore_process_logdirty_event(void)
>  
>  /* Remember the paths for the command and response entries */
>  if (pasprintf(&ret_path,
> -"/local/domain/0/device-model/%u/logdirty/ret",
> +"device-model/%u/logdirty/ret",
>  domid) == -1) {
>  fprintf(logfile, "Log-dirty: out of memory\n");
>  exit(1);
>  }
>  if (pasprintf(&cmd_path,
> -"/local/domain/0/device-model/%u/logdirty/cmd",
> +"device-model/%u/logdirty/cmd",
>  domid) == -1) {
>  fprintf(logfile, "Log-dirty: out of memory\n");
>  exit(1);
> @@ -855,7 +855,7 @@ static void xenstore_process_dm_command_event(void)
>  unsigned int len;
>  
>  if (pasprintf(&path, 
> -  "/local/domain/0/device-model/%u/command", domid) == -1) {
> +  "device-model/%u/command", domid) == -1) {
>  fprintf(logfile, "out of memory reading dm command\n");
>  goto out;
>  }
> @@ -875,7 +875,7 @@ static void xenstore_process_dm_command_event(void)
>  } else if (!strncmp(command, "usb-add", len)) {
>  fprintf(logfile, "dm-command: usb-add a usb device\n");
>  if (pasprintf(&path,
> -"/local/domain/0/device-model/%u/parameter", domid) == -1) {
> +"device-model/%u/parameter", domid) == -1) {
>  fprintf(logfile, "out of memory reading dm command parameter\n");
>  goto out;
>  }
> @@ -889,7 +889,7 @@ static void xenstore_process_dm_command_event(void)
>  } else if (!strncmp(command, "usb-del", len)) {
>  fprintf(logfile, "dm-command: usb-del a usb device\n");
>  if (pasprintf(&path,
> -"/local/domain/0/device-model/%u/parameter", domid) == -1) {
> +"device-model/%u/parameter", domid) == -1) {
>  fprintf(logfile, "out of memory reading dm command parameter\n");
>  goto out;
>  }
> @@ -905,7 +905,7 @@ static void xenstore_process_dm_command_event(void)
>  fprintf(logfile, "dm-command: hot remove pass-through pci dev \n");
>  
>  if (pasprintf(&path, 
> -  "/local/domain/0/device-model/%u/parameter", domid) == 
> -1) {
> +  "device-model/%u/parameter", domid) == -1) {
>  fprintf(logfile, "out of memory reading dm command parameter\n");
>  goto out;
>  }
> @@ -919,7 +919,7 @@ static void xenstore_process_dm_command_event(void)
>  fprintf(logfile, "dm-command: hot insert pass-through pci dev \n");
>  
>  if (pasprintf(&path, 
> -  "/local/domain/0/device-model/%u/parameter", domid) == 
> -1) {
> +  "device-model/%u/parameter", domid) == -1) {
>  fprintf(logfile, "out of memory reading dm command parameter\n");
>  goto out;
>  }
> @@ -944,7 +944,7 @@ void xenstore_record_dm(const char *subpath, const char 
> *state)
>  char *path = NULL;
>  
>  if (pasprintf(&path, 
> -  "/local/domain/0/device-model/%u/%s", domid, subpath) == 
> -1) {
> +  "device-model/%u/%s", domid, subpath) == -1) {
>  fprintf(logfile, "out of memory rec

Re: [Xen-devel] [PATCH 1/5] libxl: remove DM path in libxl__device_model_destroy

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 10:34 +, Wei Liu wrote:
> ... because it is the right place to clean up device model stuffs.

... and not devices_destroy_cb because it is the right ...

(also "stuff").

> And the path should use LIBXL_TOOLSTACK_DOMID instead of hardcoded 0.

Between this and what is in the next patch I think you probably should
refactor into a helper function to get the correct path for a given
domid.

FWIW I think you might also be able to do:
GCSPRINTF("/local/domain/" #LIBXL_TOOLSTACK_DOMID "/device-model/%d",
  domid);

If you want, although perhaps the intention is for it to eventually not
be a hard-define of 0 and become e.g. a function call or a global
variable reference, in which case best not?

> Signed-off-by: Wei Liu 
> ---
>  tools/libxl/libxl.c| 2 --
>  tools/libxl/libxl_dm.c | 6 ++
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 088786e..46a43d5 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -1659,8 +1659,6 @@ static void devices_destroy_cb(libxl__egc *egc,
>  
>  xs_rm(ctx->xsh, XBT_NULL, libxl__xs_libxl_path(gc, domid));
>  xs_rm(ctx->xsh, XBT_NULL, libxl__sprintf(gc,
> -"/local/domain/0/device-model/%d", domid));
> -xs_rm(ctx->xsh, XBT_NULL, libxl__sprintf(gc,
>  "/local/domain/%d/hvmloader", domid));
>  
>  /* This is async operation, we already hold CTX lock */
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 8599a6a..0fd5ffa 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -1629,6 +1629,12 @@ out:
>  
>  int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
>  {
> +char *path = GCSPRINTF("/local/domain/%d/device-model/%d",
> +   LIBXL_TOOLSTACK_DOMID, domid);
> +
> +if (!xs_rm(CTX->xsh, XBT_NULL, path))
> +LOG(ERROR,"xs_rm failed for %s", path);
> +/* We should try to destroy the device model anyway. */
>  return kill_device_model(gc,
>  GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
>  }



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 4/5] libxl: vcpuset: Remove useless limit on max_vcpus.

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> The check is superflous. If the 'max_vcpus' (argument
> value) is greater than  pCPU and --ignore-host has not
> been supplied we would print an warning and return
> and not call this code.
> 
> If the --ignore-host parameter had been used we would
> never end up in this condition and enforce 'max_vcpus'.
> 
> The only time it would be invoked is if max_vcpus < host_cpu
> in which case it would set max_vcpus to max_vcpus.
> 
> In short - it is dead code.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/5] libxl: remove DM path in libxl__device_model_destroy

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-18 at 13:21 +, Ian Campbell wrote:
> On Fri, 2015-03-13 at 10:34 +, Wei Liu wrote:
> > ... because it is the right place to clean up device model stuffs.
> 
> ... and not devices_destroy_cb because it is the right ...
> 
> (also "stuff").
> 
> > And the path should use LIBXL_TOOLSTACK_DOMID instead of hardcoded 0.
> 
> Between this and what is in the next patch I think you probably should
> refactor into a helper function to get the correct path for a given
> domid.
> 
> FWIW I think you might also be able to do:
> GCSPRINTF("/local/domain/" #LIBXL_TOOLSTACK_DOMID "/device-model/%d",
>   domid);
> 
> If you want, although perhaps the intention is for it to eventually not
> be a hard-define of 0 and become e.g. a function call or a global
> variable reference, in which case best not?

Having reread patch #0 more carefully I suppose the helper would need to
take the dm_domid too, which would then rule out that cpp trick.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/5] libxl: use LIBXL_TOOLSTACK_DOMID

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 10:34 +, Wei Liu wrote:
> The function in question is libxl__spawn_local_dm. We should use
> LIBXL_TOOLSTACK_DOMID when constructing xenstore path.
> 
> Currently LIBXL_TOOLSTACK_DOMID is 0, so this patch introduces no
> functional change.

As I mentioned on the previous patch, path should come from a helper
rather than repeated everywhere.

> 
> Signed-off-by: Wei Liu 
> ---
>  tools/libxl/libxl_dm.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 0fd5ffa..3dedad4 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -1375,7 +1375,8 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
> libxl__dm_spawn_state *dmss)
>  free(path);
>  }
>  
> -path = libxl__sprintf(gc, "/local/domain/0/device-model/%d", domid);
> +path = libxl__sprintf(gc, "/local/domain/%d/device-model/%d",
> +  LIBXL_TOOLSTACK_DOMID, domid);
>  xs_mkdir(ctx->xsh, XBT_NULL, path);
>  
>  if (b_info->type == LIBXL_DOMAIN_TYPE_HVM &&
> @@ -1424,7 +1425,8 @@ retry_transaction:
>  LIBXL__LOG(CTX, XTL_DEBUG, "  %s", *arg);
>  
>  spawn->what = GCSPRINTF("domain %d device model", domid);
> -spawn->xspath = GCSPRINTF("/local/domain/0/device-model/%d/state", 
> domid);
> +spawn->xspath = GCSPRINTF("/local/domain/%d/device-model/%d/state",
> +  LIBXL_TOOLSTACK_DOMID, domid);
>  spawn->timeout_ms = LIBXL_DEVICE_MODEL_START_TIMEOUT * 1000;
>  spawn->pidpath = GCSPRINTF("%s/image/device-model-pid", dom_path);
>  spawn->midproc_cb = libxl__spawn_record_pid;



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/5] libxl: use new QEMU xenstore protocol

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 10:34 +, Wei Liu wrote:
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 3dedad4..4a38455 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -998,7 +998,7 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
> libxl__stub_dm_spawn_state *sdss)
>  libxl_device_vfb *vfb;
>  libxl_device_vkb *vkb;
>  char **args;
> -struct xs_permissions perm[2];
> +struct xs_permissions perm[1];
>  xs_transaction_t t;
>  
>  /* convenience aliases */
> @@ -1106,16 +1106,16 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
> libxl__stub_dm_spawn_state *sdss)
>  }
>  xs_set_target(ctx->xsh, dm_domid, guest_domid);
>  
> -perm[0].id = dm_domid;
> -perm[0].perms = XS_PERM_NONE;
> -perm[1].id = guest_domid;
> -perm[1].perms = XS_PERM_READ;
> +perm[0].id = guest_domid;
> +perm[0].perms = XS_PERM_READ;

This will make the node owned by the guest (and hence r/w to the guest)
and set the perms for everyone else to r/o. I don't think this is what
you meant?

(The way perms are specified is a bit confusing, check out
http://wiki.xen.org/wiki/XenBus#Permissions )


> -return GCSPRINTF("/local/domain/0/device-model/%d/physmap/%s/%s",
> -domid, phys_offset, node);
> +return GCSPRINTF("/local/domain/%d/device-model/%d/physmap/%s/%s",
> + dm_domid, domid, phys_offset, node);

This sort of thing might imply that the helper takes the tail of the
path?
> SPRINTF(
> -"/local/domain/0/device-model/%d/physmap", domid), &num);
> +"/local/domain/%d/device-model/%d/physmap",

I only just noticed, but I think you want %u not %d (since dm_domid is
unsigned), although given the existing domid one is wrong too maybe you
don't want to bother...

> diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
> index f3ae132..4aeb2bb 100644
> --- a/tools/libxl/libxl_pci.c
> +++ b/tools/libxl/libxl_pci.c
> @@ -850,11 +850,14 @@ static int qemu_pci_add_xenstore(libxl__gc *gc, 
> uint32_t domid,
>  int rc = 0;
>  char *path;
>  char *state, *vdevfn;
> +uint32_t dm_domid;
>  
> -path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/state", 
> domid);
> +dm_domid = libxl_get_stubdom_id(CTX, domid);
> +path = libxl__sprintf(gc, "/local/domain/%d/device-model/%d/state",
> +  dm_domid, domid);

Maybe (up to you) switch thing to GCSPRINTF as you touch them?

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/5] libxl: wait for stubdom to be ready

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 10:34 +, Wei Liu wrote:
> Watch /local/domain/$dm_domid/device-model/$domid/state, wait until
> state turns "running" then unpause guest.
> 
> LIBXL_STUBDOM_START_TIMEOUT is the timeout used wait for stubdom to be
> ready. My test on a very old machine (Core 2 6400) showed that it might
> need more than 20s before the stubdom is ready to serve DomU.
> 
> Signed-off-by: Wei Liu 
> ---
>  tools/libxl/libxl_dm.c   | 39 ++-
>  tools/libxl/libxl_internal.h |  2 ++
>  2 files changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 4a38455..ad2ef41 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -984,6 +984,8 @@ static void stubdom_pvqemu_cb(libxl__egc *egc,
>  static void spawn_stubdom_pvqemu_destroy_cb(libxl__egc *egc,
>  libxl__destroy_domid_state *dis,
>  int rc);
> +static void stubdom_xswait_cb(libxl__egc *egc, libxl__xswait_state *xswait,
> +  int rc, const char *p);
>  
>  char *libxl__stub_dm_name(libxl__gc *gc, const char *guest_name)
>  {
> @@ -1273,16 +1275,51 @@ static void stubdom_pvqemu_cb(libxl__egc *egc,
>  rc = libxl_domain_unpause(CTX, dm_domid);
>  if (rc) goto out;
>  
> +libxl__xswait_init(&sdss->xswait);
> +sdss->xswait.ao = ao;
> +sdss->xswait.what = GCSPRINTF("Stubdom %d startup", dm_domid);

Maybe include the domid too, e.g. "Stubdom %d for dom %d startup"?

(And probably %u throughout?)

> +static void stubdom_xswait_cb(libxl__egc *egc, libxl__xswait_state *xswait,
> +  int rc, const char *p)
> +{
> +EGC_GC;
> +libxl__stub_dm_spawn_state *sdss = CONTAINER_OF(xswait, *sdss, xswait);
> +uint32_t dm_domid = sdss->pvqemu.guest_domid;
> +
> +if (rc) {
> +if (rc == ERROR_TIMEDOUT)
> +LOG(ERROR, "%s: startup timed out", xswait->what);
> +goto out;
> +}
> +
> +if (!p) return;
> +
> +if (strcmp(p, "running"))
> +return;
>   out:
>  if (rc) {
>  if (dm_domid) {
> -sdss->dis.ao = ao;
> +sdss->dis.ao = sdss->dm.spawn.ao;
>  sdss->dis.domid = dm_domid;
>  sdss->dis.callback = spawn_stubdom_pvqemu_destroy_cb;
>  libxl__destroy_domid(egc, &sdss->dis);
> +libxl__xswait_stop(gc, xswait);

Can you do this once befire the if (rc)?

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/5] libxl: vcpuset: Return error values.

2015-03-18 Thread Konrad Rzeszutek Wilk
On Wed, Mar 18, 2015 at 01:06:18PM +, Ian Campbell wrote:
> On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> > The function does not return any values at all. Convert the
> > internal libxl ones (ERROR_FAIL, ..., etc) to positive values
> > and for the other cases just return standard libxl values.
> 
> It's not clear why you want to do this, in particular returning
> -ERROR_INVAL and inverting libxl error codes seems like a very strange
> thing to be doing.

The ERROR_INVAL are negative values. Inverting them makes them
positive - which is what the rest of xl does for error vales?
> 
> I think you should either use ERROR_INVAL (not inverted) and propagate
> libxl rc's directly or convert them into something which suits xl, i.e.
> 0 and 1.

Oh, there is a lot of other code (xl <>) which return -ERROR_XYZ so that
the error values are positive.

How should 'xl' return errors? Just as 1 for failure or actually
use an -ERROR_XYZ so that folks can map them to -ERROR_XYZ?

> 
> > Signed-off-by: Konrad Rzeszutek Wilk 
> > ---
> >  tools/libxl/xl_cmdimpl.c | 23 +--
> >  1 file changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> > index 2d7145f..454a895 100644
> > --- a/tools/libxl/xl_cmdimpl.c
> > +++ b/tools/libxl/xl_cmdimpl.c
> > @@ -5013,17 +5013,18 @@ int main_vcpupin(int argc, char **argv)
> >  return rc;
> >  }
> >  
> > -static void vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
> > +static int vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
> >  {
> >  char *endptr;
> >  unsigned int max_vcpus, i;
> >  libxl_bitmap cpumap;
> > +int rc;
> >  
> >  libxl_bitmap_init(&cpumap);
> >  max_vcpus = strtoul(nr_vcpus, &endptr, 10);
> >  if (nr_vcpus == endptr) {
> >  fprintf(stderr, "Error: Invalid argument.\n");
> > -return;
> > +return -ERROR_INVAL;
> >  }
> >  
> >  /*
> > @@ -5036,22 +5037,25 @@ static void vcpuset(uint32_t domid, const char* 
> > nr_vcpus, int check_host)
> >  fprintf(stderr, "You are overcommmitting! You have %d physical 
> > " \
> >  " CPUs and want %d vCPUs! Aborting, use --ignore-host 
> > to " \
> >  " continue\n", host_cpu, max_vcpus);
> > -return;
> > +return -ERROR_INVAL;
> >  }
> >  /* NB: This also limits how many are set in the bitmap */
> >  max_vcpus = (max_vcpus > host_cpu ? host_cpu : max_vcpus);
> >  }
> > -if (libxl_cpu_bitmap_alloc(ctx, &cpumap, max_vcpus)) {
> > -fprintf(stderr, "libxl_cpu_bitmap_alloc failed\n");
> > -return;
> > +rc = libxl_cpu_bitmap_alloc(ctx, &cpumap, max_vcpus);
> > +if (rc) {
> > +fprintf(stderr, "libxl_cpu_bitmap_alloc failed, rc: %d\n", rc);
> > +return -rc;
> >  }
> >  for (i = 0; i < max_vcpus; i++)
> >  libxl_bitmap_set(&cpumap, i);
> >  
> > -if (libxl_set_vcpuonline(ctx, domid, &cpumap) < 0)
> > -fprintf(stderr, "libxl_set_vcpuonline failed domid=%d 
> > max_vcpus=%d\n", domid, max_vcpus);
> > +rc = libxl_set_vcpuonline(ctx, domid, &cpumap);
> > +if (rc)
> > +fprintf(stderr, "libxl_set_vcpuonline failed domid=%d 
> > max_vcpus=%d, rc: %d\n", domid, max_vcpus, rc);
> >  
> >  libxl_bitmap_dispose(&cpumap);
> > +return rc ? -rc : 0;
> >  }
> >  
> >  int main_vcpuset(int argc, char **argv)
> > @@ -5070,8 +5074,7 @@ int main_vcpuset(int argc, char **argv)
> >  break;
> >  }
> >  
> > -vcpuset(find_domain(argv[optind]), argv[optind + 1], check_host);
> > -return 0;
> > +return vcpuset(find_domain(argv[optind]), argv[optind + 1], 
> > check_host);
> >  }
> >  
> >  static void output_xeninfo(void)
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests

2015-03-18 Thread Juergen Gross

On 03/18/2015 11:36 AM, David Vrabel wrote:

On 16/03/15 10:31, Juergen Gross wrote:

On 03/16/2015 11:03 AM, Daniel Kiper wrote:

On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:

On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:

On 03/11/2015 10:42 AM, David Vrabel wrote:

On 10/03/15 13:35, Boris Ostrovsky wrote:

On 03/10/2015 07:40 AM, David Vrabel wrote:

On 09/03/15 14:10, David Vrabel wrote:

Memory hotplug doesn't work with PV guests because:

 a) The p2m cannot be expanded to cover the new sections.

Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
linear virtual mapped sparse p2m list).

This one would be non-trivial to fix.  We'd need a sparse set of
vm_area's for the p2m or similar.


 b) add_memory() builds page tables for the new sections which
means
the new pages must have valid p2m entries (or a BUG occurs).

After some more testing this appears to be broken by:

25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions above
the
end of RAM as 1:1) included 3.16.

This one can be trivially fixed by setting the new sections in
the p2m
to INVALID_P2M_ENTRY before calling add_memory().

Have you tried 3.17? As I said yesterday, it worked for me (with 4.4
Xen).

No.  But there are three bugs that prevent it from working in 3.16+ so
I'm really not sure how you had a working in a 3.17 PV guest.


This is what I have:

[build@build-mk2 linux-boris]$ ssh root@tst008 cat
/mnt/lab/bootstrap-x86_64/test_small.xm
extra="console=hvc0 debug earlyprintk=xen "
kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
memory=1024
maxmem = 4096
vcpus=1
maxvcpus=3
name="bootstrap-x86_64"
on_crash="preserve"
vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
vnc=1
vnclisten="0.0.0.0"
disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
[build@build-mk2 linux-boris]$ ssh root@tst008 xl create
/mnt/lab/bootstrap-x86_64/test_small.xm
Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
[build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
bootstrap-x86_64
bootstrap-x86_64 2  1024 1
-b   5.4
[build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
3.17.0upstream
[build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep
paravirtualized
[0.00] Booting paravirtualized kernel on Xen
[build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
/proc/meminfo
MemTotal: 968036 kB
[build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set
bootstrap-x86_64 2048
[build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
bootstrap-x86_64
bootstrap-x86_64 2  2048 1
-b   5.7
[build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
/proc/meminfo
MemTotal:2016612 kB
[build@build-mk2 linux-boris]$




Regardless, it definitely doesn't work now because of the linear p2m
change.  What do you want to do about this?


Since backing out p2m changes is not an option I guess your patch is
the
only short-term alternative.

But this still looks like a regression so perhaps Juergen can take a
look to see how it can be fixed.


Hmm, the p2m list is allocated for the maximum memory size of the domain
which is obtained from the hypervisor. In case of Dom0 it is read via
XENMEM_maximum_reservation, for a domU it is based on the E820 memory
map read via XENMEM_memory_map.

I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
and 4GB of maxmem. The E820 map looked like this:

[0.00] Xen: [mem 0x-0x0009] usable
[0.00] Xen: [mem 0x000a-0x000f] reserved
[0.00] Xen: [mem 0x0010-0x] usable

So the complete 4GB were included, like they should. The resulting p2m
list is allocated in the needed size:

[0.00] p2m virtual area at c900, size is 80

So what is your problem here? Can you post the E820 map and the p2m map
info for your failing domain, please?


If you use memory hotplug then maxmem is not a limit from guest kernel
point of view (host still must allow that operation but it is another
not related issue). The problem is that p2m must be dynamically
expendable
to support it. Earlier implementation supported that thing and memory
hotplug worked without any issue.


Okay, now I get it.

The problem with the earlier p2m implementation was that it was
expendable to support only up to 512GB of RAM. So we need some way to
tell the kernel how much virtual memory it should reserve for the p2m
list if memory hotplug is enabled. We could:

a) use a configurable maximum (e.g. for 512GB RAM as today)


I would set the p2m virtual area to cover up to 512 GB (needs 1 GB of
virt space) for a 64-bit guest and up to 64 GB (needs 64 MB of virt
space) for a 32-bit guest.


Are 64 GB for 32 bit guests a sensible default? This will need more than
10% of the available virtual kernel space (taking fixmap etc. into
accou

Re: [Xen-devel] [PATCH] dpci: Put the dpci back on the list if scheduled from another CPU.

2015-03-18 Thread Konrad Rzeszutek Wilk
On Wed, Mar 18, 2015 at 07:38:12AM +, Jan Beulich wrote:
> >>> On 17.03.15 at 18:15,  wrote:
> > On Tue, Mar 17, 2015 at 04:06:14PM +, Jan Beulich wrote:
> >> >>> On 17.03.15 at 16:38,  wrote:
> >> > --- a/xen/drivers/passthrough/io.c
> >> > +++ b/xen/drivers/passthrough/io.c
> >> > @@ -804,7 +804,17 @@ static void dpci_softirq(void)
> >> >  d = pirq_dpci->dom;
> >> >  smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
> >> >  if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> >> > -BUG();
> >> > +{
> >> > +unsigned long flags;
> >> > +
> >> > +/* Put back on the list and retry. */
> >> > +local_irq_save(flags);
> >> > +list_add_tail(&pirq_dpci->softirq_list, 
> >> > &this_cpu(dpci_list));
> >> > +local_irq_restore(flags);
> >> > +
> >> > +raise_softirq(HVM_DPCI_SOFTIRQ);
> >> > +continue;
> >> > +}
> >> 
> >> As just said in another mail - unless there are convincing new
> >> arguments in favor of this (more of a hack than a real fix), I'm
> >> not going to accept it and instead consider reverting the
> >> offending commit. Iirc the latest we had come to looked quite a
> >> bit better than this one.
> > 
> > The latest one (please see attached) would cause an dead-lock iff
> > on the CPU we are running the softirq and an do_IRQ comes for the
> > exact dpci we are in process of executing.
> 
> I'm afraid I'm not seeing it - please explain.

Lets assume that the device is an PCIe with MSI. We have only one
VCPU in the guest.

We receive the first interrupt, end up going:
vmx_vmexit_handler
 - case EXIT_REASON_EXTERNAL_INTERRUPT
   \- vmx_do_extint
\- do_IRQ
  \- __do_IRQ_guest
  \- hvm_do_IRQ_dpci
 \- raise_softirq_for
[DPCI_SOFTIRQ bit set]
vmx_process_softirq
 sti
 do_softirq
   -\ __do_sofitq_
\- dpci_softirq
-\ hvm_dirq_assist
   [state is 'running']

[Same vector comes in]
do_IRQ
 \- __do_IRQ_guest
\- hvm_do_IRQ_dpci
   \- raise_softirq_for
   [here we either
a) spin waiting 'running' to be done or -- dead-lock
b) we just exit out and drop this interrupt, 
c) increment 'masked' to tell 'dpci_softirq' to reschedule
   -- live lock if this keeps on going]

Now c) is protected - the do_IRQ has anti-storm code so eventually
it will stop.
> 
> As to the code - I think switch() is rather hiding the intentions
> here, i.e. the code would be better readable if using two if()s:
> 
> +for ( ;; )
> +{
> +old = cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED);
> +/* If the 'state' is 0 (not in use) we can schedule it. */
> +if ( !old )
> +break;
> +if ( !(old & (1 << STATE_RUN)) )
> +{
> +/* Whenever STATE_SCHED is set we MUST not schedule it. */
> +assert(old & (1 << STATE_SCHED));
> +return;
> +}
> +}
> 
> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests

2015-03-18 Thread David Vrabel
On 18/03/15 13:57, Juergen Gross wrote:
> On 03/18/2015 11:36 AM, David Vrabel wrote:
>> On 16/03/15 10:31, Juergen Gross wrote:
>>> On 03/16/2015 11:03 AM, Daniel Kiper wrote:
 On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:
> On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:
>> On 03/11/2015 10:42 AM, David Vrabel wrote:
>>> On 10/03/15 13:35, Boris Ostrovsky wrote:
 On 03/10/2015 07:40 AM, David Vrabel wrote:
> On 09/03/15 14:10, David Vrabel wrote:
>> Memory hotplug doesn't work with PV guests because:
>>
>>  a) The p2m cannot be expanded to cover the new sections.
> Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
> linear virtual mapped sparse p2m list).
>
> This one would be non-trivial to fix.  We'd need a sparse set of
> vm_area's for the p2m or similar.
>
>>  b) add_memory() builds page tables for the new sections
>> which
>> means
>> the new pages must have valid p2m entries (or a BUG
>> occurs).
> After some more testing this appears to be broken by:
>
> 25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions
> above
> the
> end of RAM as 1:1) included 3.16.
>
> This one can be trivially fixed by setting the new sections in
> the p2m
> to INVALID_P2M_ENTRY before calling add_memory().
 Have you tried 3.17? As I said yesterday, it worked for me (with
 4.4
 Xen).
>>> No.  But there are three bugs that prevent it from working in
>>> 3.16+ so
>>> I'm really not sure how you had a working in a 3.17 PV guest.
>>
>> This is what I have:
>>
>> [build@build-mk2 linux-boris]$ ssh root@tst008 cat
>> /mnt/lab/bootstrap-x86_64/test_small.xm
>> extra="console=hvc0 debug earlyprintk=xen "
>> kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
>> ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
>> memory=1024
>> maxmem = 4096
>> vcpus=1
>> maxvcpus=3
>> name="bootstrap-x86_64"
>> on_crash="preserve"
>> vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
>> vnc=1
>> vnclisten="0.0.0.0"
>> disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl create
>> /mnt/lab/bootstrap-x86_64/test_small.xm
>> Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
>> bootstrap-x86_64
>> bootstrap-x86_64 2  1024 1
>> -b   5.4
>> [build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
>> 3.17.0upstream
>> [build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep
>> paravirtualized
>> [0.00] Booting paravirtualized kernel on Xen
>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
>> /proc/meminfo
>> MemTotal: 968036 kB
>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set
>> bootstrap-x86_64 2048
>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
>> bootstrap-x86_64
>> bootstrap-x86_64 2  2048 1
>> -b   5.7
>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
>> /proc/meminfo
>> MemTotal:2016612 kB
>> [build@build-mk2 linux-boris]$
>>
>>
>>>
>>> Regardless, it definitely doesn't work now because of the linear p2m
>>> change.  What do you want to do about this?
>>
>> Since backing out p2m changes is not an option I guess your patch is
>> the
>> only short-term alternative.
>>
>> But this still looks like a regression so perhaps Juergen can take a
>> look to see how it can be fixed.
>
> Hmm, the p2m list is allocated for the maximum memory size of the
> domain
> which is obtained from the hypervisor. In case of Dom0 it is read via
> XENMEM_maximum_reservation, for a domU it is based on the E820 memory
> map read via XENMEM_memory_map.
>
> I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
> and 4GB of maxmem. The E820 map looked like this:
>
> [0.00] Xen: [mem 0x-0x0009] usable
> [0.00] Xen: [mem 0x000a-0x000f]
> reserved
> [0.00] Xen: [mem 0x0010-0x] usable
>
> So the complete 4GB were included, like they should. The resulting p2m
> list is allocated in the needed size:
>
> [0.00] p2m virtual area at c900, size is 80
>
> So what is your problem here? Can you post the E820 map and the p2m
> map
> info for your failing domain, please?

 If you use memory hotplug then maxmem is not a limit from guest kernel
 p

Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.

2015-03-18 Thread Konrad Rzeszutek Wilk
On Wed, Mar 18, 2015 at 07:41:55AM +, Jan Beulich wrote:
> >>> On 17.03.15 at 18:44,  wrote:
> > As you can see to preserve the existing functionality such as
> > being able to schedule N amount of interrupt injections
> > for the N interrupts we might get - I modified '->masked'
> > to be an atomic counter.
> 
> Why would that be? When an earlier interrupt wasn't fully handled,
> real hardware wouldn't latch more than one further instance either.

We acknowledge the interrupt in the hypervisor - as in we call
->ack on the handler (which for MSI is an nop anyhow).

If the device is misconfigured and keeps on sending burst of
interrupts every 10 msec for 1msec we can dead-lock.

Either way we should tell the guest about those interrupts.
> 
> > The end result is that we can still live-lock. Unless we:
> >  - Drop on the floor the injection of N interrupts and
> >just deliever at max one per VMX_EXIT (and not bother
> >with interrupts arriving when we are in the VMX handler).
> 
> I'm afraid I again don't see the point here.

I am basing all of this on the assumption that we have
many interrupts for the same device coming it - and we have
not been able to tell the guest about it (the guest could
be descheduled, too slow, etc) so that it can do what it
needs to silence the device.

It might be very well an busted device - and if that is
the case be that - but we must not dead-lock due to it.
> 
> >  - Alter the softirq code slightly - to have an variant
> >which will only iterate once over the pending softirq
> >bits per call. (so save an copy of the bitmap on the
> >stack when entering the softirq handler - and use that.
> >We could also xor it against the current to catch any
> >non-duplicate bits being set that we should deal with).
> 
> That's clearly not an option: The solution should be isolated
> to DPCI code, i.e. without altering existing behavior in other
> (more generic) components.
> 
> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/5] libxl: vcpuset: Return error values.

2015-03-18 Thread Dario Faggioli
On Wed, 2015-03-18 at 13:08 +, Ian Campbell wrote:
> On Wed, 2015-03-18 at 13:06 +, Ian Campbell wrote:
> > On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> > > The function does not return any values at all. Convert the
> > > internal libxl ones (ERROR_FAIL, ..., etc) to positive values
> > > and for the other cases just return standard libxl values.
> > 
> > It's not clear why you want to do this, in particular returning
> > -ERROR_INVAL and inverting libxl error codes seems like a very strange
> > thing to be doing.
> 
> BTW I know the xl error handling is horribly confused, and there are
> even a small number of instances of -ERROR_* already, but I think those
> are wrong and we shouldn't introduce more.
> 
Indeed. I did some xl error code refactoring for a series of mine a few
days back, and as far as I could see, the most common pattern in xl is
returning 0 or 1.

FWIW, I think we should not diverge any further from that and, at some
point, convert 0/1 to EXIT_SUCCESS/EXIT_FAILURE.

> > I think you should either use ERROR_INVAL (not inverted) and propagate
> > libxl rc's directly or convert them into something which suits xl, i.e.
> > 0 and 1.
> > 
Again, +1 for 0 or 1.

Regards,
Dario


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 22/24] tools/libxl: arm: Use an higher value for the GIC phandle

2015-03-18 Thread Julien Grall

Hi Ian,

On 24/02/2015 10:08, Ian Campbell wrote:

On Mon, 2015-02-23 at 22:02 +, Julien Grall wrote:


On 23/02/2015 14:36, Ian Campbell wrote:

On Thu, 2015-01-29 at 13:48 +, Julien Grall wrote:

On 29/01/15 12:28, Stefano Stabellini wrote:

On Thu, 29 Jan 2015, Julien Grall wrote:

On 29/01/15 11:07, Stefano Stabellini wrote:

On Tue, 13 Jan 2015, Julien Grall wrote:

The partial device tree may contains phandle. The Device Tree Compiler
tends to allocate the phandle from 1.

Reserve the ID 65000 for the GIC phandle. I think we can safely assume
that the partial device tree will never contain a such ID.

Signed-off-by: Julien Grall 
Cc: Ian Jackson 
Cc: Wei Liu 



Shouldn't we at least check that the partial device tree doesn't contain
a conflicting phandle?


I don't think so. This will unlikely happen, and if it happens the guest
will crash with an obvious error.


It is good that the error is obvious.

But how expensive is to check for it?


I would have to check the validity of the properties (name + value
size). At least the properties "linux,phandle" and "phandle" should be
checked.

Though I could do in copy_properties but I find it hackish.


Can't you just track the largest phandle ever seen during
copy_properties and then use N+1 for the GIC?


Now the we decided to trust the input device tree, it would be easier to
write the code.

I will give a look.




Think about the poor user that ends up in this situation: the fact that
is unlikely only makes it harder for a user to figure out what to do to
fix it.


The poor use will have to write his device tree by hand to hit this
error ;).


Or use a new version of dtc which does things differently for some
reason.


And you would not be able to get a phandle for the GIC if largest
phandle is too high.

So the guest won't work correctly.


Indeed, filling in a bitmap as you go might be another option, although
you'd either need an 8k bitmap (not so bad in userspace) or to realloc
as it grows.


I though about different implementation for tracking the phandle. All 
require to parse the partial device tree twice: one for tracking the 
phandle and the second for copy the nodes.


This is because we need to have the GIC phandle at the time we create 
the root node (properties has to be created before the child nodes).


So I plan to keep this patch as it is and documenting the value of the 
GIC. Would it be fine for you?


Though I could put an higher value too.

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/5] libxl: vcpuset: Return error values.

2015-03-18 Thread Konrad Rzeszutek Wilk
On Wed, Mar 18, 2015 at 02:09:09PM +, Dario Faggioli wrote:
> On Wed, 2015-03-18 at 13:08 +, Ian Campbell wrote:
> > On Wed, 2015-03-18 at 13:06 +, Ian Campbell wrote:
> > > On Fri, 2015-03-13 at 16:26 -0400, Konrad Rzeszutek Wilk wrote:
> > > > The function does not return any values at all. Convert the
> > > > internal libxl ones (ERROR_FAIL, ..., etc) to positive values
> > > > and for the other cases just return standard libxl values.
> > > 
> > > It's not clear why you want to do this, in particular returning
> > > -ERROR_INVAL and inverting libxl error codes seems like a very strange
> > > thing to be doing.
> > 
> > BTW I know the xl error handling is horribly confused, and there are
> > even a small number of instances of -ERROR_* already, but I think those
> > are wrong and we shouldn't introduce more.
> > 
> Indeed. I did some xl error code refactoring for a series of mine a few
> days back, and as far as I could see, the most common pattern in xl is
> returning 0 or 1.

Gah, I seem to have looked at the wrong examples and thought that was
the proper way!
> 
> FWIW, I think we should not diverge any further from that and, at some
> point, convert 0/1 to EXIT_SUCCESS/EXIT_FAILURE.
> 
> > > I think you should either use ERROR_INVAL (not inverted) and propagate
> > > libxl rc's directly or convert them into something which suits xl, i.e.
> > > 0 and 1.
> > > 
> Again, +1 for 0 or 1.
> 
> Regards,
> Dario



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] OpenStack - Libvirt+Xen CI overview : anyone interested in a walk/through or presentation on how it works

2015-03-18 Thread Wei Liu
On Wed, Mar 18, 2015 at 01:12:50PM +, Lars Kurth wrote:
> Hi,
> 
> I added all the people who raised their hands so far to the TO list. As far 
> as I can tell we have people from the following timezones: GMT, GMT+1, MTZ, 
> ETZ - if I got this wrong, please let me know your timezone. Depending on 
> when we have the presentation, we will need to consider daylight savings 
> offsets.
> 
> I will give it another few days for more people to raise their hands. We can 
> then try and figure out a slot which fits everyone.
> 

I would be interested in such meeting. My time zone is the same as
George's and Anthony's.

Wei.

> Best Regards
> Lars
> 
> Begin forwarded message:
> 
> From: Bob Ball 
> To: "xen-devel@lists.xen.org" 
> Date: 10 March 2015 12:03:13 GMT
> Cc: Anthony Perard 
> Subject: [Xen-devel] OpenStack - Libvirt+Xen CI overview
> 
> For the last few weeks Anthony and I have been working on creating a CI 
> environment to run against all OpenStack jobs.  We're now in a position where 
> we can share the current status, overview of how it works and next steps.  We 
> actively want to support involvement in this effort from others with an 
> interest in libvirt+Xen's openstack integration.
> 
> The CI we have set up is follow the recommendations made by the OpenStack 
> official infrastructure maintainers, and reproduces a notable portion of the 
> official OpenStack CI environment to run these tests.  Namely this setup is 
> using:
> - Puppet to deploy the master node
> - Zuul to watch for code changes uploaded to review.openstack.org
> - Jenkins job builder to create Jenkins job definitions from a YAML file
> - Nodepool to automatically create single-use virtual machines in the 
> Rackspace public cloud 
> - Devstack-gate to run Tempest tests in serial
> 
> More information on Zuul, JJB, Nodepool and devstack-gate is available 
> through http://ci.openstack.org
> 
> The current status is that we have a zuul instance monitoring for jobs and 
> adding them to the queue of jobs to be run at 
> http://zuul.openstack.xenproject.org/
> 
> In the background Nodepool provisions virtual machines into a pool of nodes 
> ready to be used.  All ready nodes are automatically added to Jenkins 
> (https://jenkins.openstack.xenproject.org/), and then Zuul+Jenkins will 
> trigger a particular job on a node when one is available.
> 
> Logs are then uploaded to Rackspace's Cloud Files with sample logs for a 
> passing job at 
> http://logs.openstack.xenproject.org/52/162352/3/silent/dsvm-tempest-xen/da3ff30/index.html
> 
> I'd like to organise a meeting to walk through the various components of the 
> CI with those who are interested, so this is an initial call to find out who 
> is interested in finding out more!
> 
> Thanks,
> 
> Bob
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests

2015-03-18 Thread Daniel Kiper
On Wed, Mar 18, 2015 at 01:59:58PM +, David Vrabel wrote:
> On 18/03/15 13:57, Juergen Gross wrote:
> > On 03/18/2015 11:36 AM, David Vrabel wrote:
> >> On 16/03/15 10:31, Juergen Gross wrote:
> >>> On 03/16/2015 11:03 AM, Daniel Kiper wrote:
>  On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:
> > On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:
> >> On 03/11/2015 10:42 AM, David Vrabel wrote:
> >>> On 10/03/15 13:35, Boris Ostrovsky wrote:
>  On 03/10/2015 07:40 AM, David Vrabel wrote:
> > On 09/03/15 14:10, David Vrabel wrote:
> >> Memory hotplug doesn't work with PV guests because:
> >>
> >>  a) The p2m cannot be expanded to cover the new sections.
> > Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
> > linear virtual mapped sparse p2m list).
> >
> > This one would be non-trivial to fix.  We'd need a sparse set of
> > vm_area's for the p2m or similar.
> >
> >>  b) add_memory() builds page tables for the new sections
> >> which
> >> means
> >> the new pages must have valid p2m entries (or a BUG
> >> occurs).
> > After some more testing this appears to be broken by:
> >
> > 25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions
> > above
> > the
> > end of RAM as 1:1) included 3.16.
> >
> > This one can be trivially fixed by setting the new sections in
> > the p2m
> > to INVALID_P2M_ENTRY before calling add_memory().
>  Have you tried 3.17? As I said yesterday, it worked for me (with
>  4.4
>  Xen).
> >>> No.  But there are three bugs that prevent it from working in
> >>> 3.16+ so
> >>> I'm really not sure how you had a working in a 3.17 PV guest.
> >>
> >> This is what I have:
> >>
> >> [build@build-mk2 linux-boris]$ ssh root@tst008 cat
> >> /mnt/lab/bootstrap-x86_64/test_small.xm
> >> extra="console=hvc0 debug earlyprintk=xen "
> >> kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
> >> ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
> >> memory=1024
> >> maxmem = 4096
> >> vcpus=1
> >> maxvcpus=3
> >> name="bootstrap-x86_64"
> >> on_crash="preserve"
> >> vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
> >> vnc=1
> >> vnclisten="0.0.0.0"
> >> disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
> >> [build@build-mk2 linux-boris]$ ssh root@tst008 xl create
> >> /mnt/lab/bootstrap-x86_64/test_small.xm
> >> Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
> >> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
> >> bootstrap-x86_64
> >> bootstrap-x86_64 2  1024 1
> >> -b   5.4
> >> [build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
> >> 3.17.0upstream
> >> [build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep
> >> paravirtualized
> >> [0.00] Booting paravirtualized kernel on Xen
> >> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
> >> /proc/meminfo
> >> MemTotal: 968036 kB
> >> [build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set
> >> bootstrap-x86_64 2048
> >> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep
> >> bootstrap-x86_64
> >> bootstrap-x86_64 2  2048 1
> >> -b   5.7
> >> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal
> >> /proc/meminfo
> >> MemTotal:2016612 kB
> >> [build@build-mk2 linux-boris]$
> >>
> >>
> >>>
> >>> Regardless, it definitely doesn't work now because of the linear p2m
> >>> change.  What do you want to do about this?
> >>
> >> Since backing out p2m changes is not an option I guess your patch is
> >> the
> >> only short-term alternative.
> >>
> >> But this still looks like a regression so perhaps Juergen can take a
> >> look to see how it can be fixed.
> >
> > Hmm, the p2m list is allocated for the maximum memory size of the
> > domain
> > which is obtained from the hypervisor. In case of Dom0 it is read via
> > XENMEM_maximum_reservation, for a domU it is based on the E820 memory
> > map read via XENMEM_memory_map.
> >
> > I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
> > and 4GB of maxmem. The E820 map looked like this:
> >
> > [0.00] Xen: [mem 0x-0x0009] usable
> > [0.00] Xen: [mem 0x000a-0x000f]
> > reserved
> > [0.00] Xen: [mem 0x0010-0x] usable
> >
> > So the complete 4GB were included, like they should. The resulting p2m
> > list is allocated in the needed size:
> >
> >

Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code

2015-03-18 Thread George Dunlap
On 03/18/2015 08:53 AM, Dario Faggioli wrote:
>>> @@ -1986,9 +1982,13 @@ static void init_pcpu(const struct scheduler *ops, 
>>> int cpu)
>>>  static void *
>>>  csched2_alloc_pdata(const struct scheduler *ops, int cpu)
>>>  {
>>> -/* Check to see if the cpu is online yet */
>>> -/* Note: cpu 0 doesn't get a STARTING callback */
>>> -if ( cpu == 0 || cpu_to_socket(cpu) >= 0 )
>>> +/*
>>> + * Actual initialization is deferred to when the pCPU will be
>>> + * online, via a STARTING callback. The only exception is
>>> + * the boot cpu, which does not get such a notification, and
>>> + * hence needs to be taken care of here.
>>> + */
>>> +if ( system_state == SYS_STATE_boot )
>>>  init_pcpu(ops, cpu);

I think this will break adding a cpu to a cpupool after boot, won't it?

Don't we want effectively, "if ( is_boot_cpu() ||
is_cpu_topology_set_up() )"?

is_cpu_topology_set_up() could be "system_state >= SYS_STATE_smp_boot" I
suppose?

Is "cpu == 0" the best test for "is_boot_cpu()", or is there another test?

>>
>> Same here, plus the new condition at the first glance isn't matching
>> the old one, but perhaps that's intentional.
>>
> It is intentional, and that is why we're changing it! :-) Let me try to
> explain:
> 
> The idea, in both old and new code, is to call init_pcpu() either:
>  a) on the boot cpu, during boot
>  b) on non-boot cpu, *only* if they are online.
> 
> The issue is that, for assessing b), it checks whether
> cpu_to_socket(cpu) returns >=0 and, if it does, it assumes the cpu is
> online. That is not true, as cpu_data.phys_proc_id (which is what
> cpu_to_socket() go reading) is 0 even before any onlining and topolog
> identification has happened (it's a global).
> 
> Therefore, all the pcpus are initialized right away, and all are
> assigned to runqueue 0, as returned by cpu_to_socket() at this stage,
> which is clearly wrong.
> 
> In fact, this is the reason why, previous versions of this took the
> approach of initializing things such as cpu_to_socket() returned -1
> before topology identification.
> 
> In the new version, as you suggested, I'm using system_state to figure
> out whether we are dealing with the boot cpu, and that's it. :-)
> 
> Hope this clarifies things... If yes, I'll make sure to put a similar
> explanation in the changelog, when sending the patch officially.

So let's step back for a minute, and let me see if I can describe the
situation.

Unlike the other schedulers, credit2 needs to know the topology of a
cpus when setting it up, so that we can place it in the proper runqueue.

Setting up the per-cpu structures would normally happen in alloc_pdata.
 This is called in three different ways:

* During boot, on the boot processer, alloc_pdata is called from
scheduler_init().

* During boot, on the secondary processors, alloc_pdata is called from
a CPU_UP_PREPARE callback, which happens just before the cpu is actually
brought up.

* When switching a cpu to a new cpupool after boot, alloc_pdata is also
called from cpu_schedule_switch().

The "normal" place the cpu topology information can be found is in
global the cpu_data[] array, typically accessed by the cpu_to_socket()
or cpu_to_core() macros.

This topology information is written to cpu_data[] by
smp_store_cpu_info().  For the boot cpu, it happens in
smp_prepare_cpus();  For secondary cpus, it's happens in
start_secondary(), which is the code run on the cpu itself as it's being
brought up.

Unfortunately, at the moment, both of these places are after the
respective places where the alloc pdata is called for those cpus.
Flattening the entire x86 setup at the moment you'd see:
  alloc_pdata for boot cpu
  smp_store_cpu_info for boot cpu
  for each secondary cpu
alloc_pdata for secondary cpu
smp_store_cpu_info for secondary cpu

scheduler_init() is called before smp_prepare_cpus() in part because of
a dependency chain elsewhere: we cannot set up the idle domain until
scheduler_init() is called; and other places further on in the
initialization but before setting up cpu topology information assume
that the idle domain has been set up.

I haven't actually tracked down this dependency yet; nor have I explored
the possibility of populating cpu_data[] for the boot cpu earier than
it's done now.

I'm not sure exactly why we call alloc_pdata on CPU_UP_PREPARE rather
than CPU_STARTING -- whether that's just what seemed most sensible when
cpupools were created, or whether there's a dependency somewhere between
those two points.

I *think* that there probably cannot be a dependency: no cpu boot code
can (or should) depend on the cpu having been initialized, because it
doesn't happen when you're booting credit2; and no scheduler code in
alloc_pdata can (or should) depend on being in a pre-initialized state,
because it also happens when assigning an already-initialized cpu to a
cpupool.

At the moment we deal with the fact that the topology information for a
CPU is not availabl

[Xen-devel] [PATCH V4] Add flag to start info regarding virtual mapped p2m list

2015-03-18 Thread Juergen Gross
Xen pv domains are using a domain private p2m list to convert guest pfns
to mfns. This p2m list has to be updated by the Xen tools during domain
restore and migration, as the mfns will most likely change. In order to
locate the p2m list the Xen tools need an interface provided by the
guest. Up to now this interface has been the shared info page where the
guest would store the mfn of the top level page of a 3-level p2m tree.

This p2m tree is fixed in it's layout and due to the limitation of
entries it can hold at each level it is limiting the maximum size of the
p2m list which can be reported to the Xen tools. The maximum memory the
p2m tree can support for 64 bit domains is 512 GB (32 bit domains don't
have a problem, as the p2m tree limit is much higher than the supported
domain size of 64 GB).

In order to be able to support pv domains with more than 512 GB an
additional way to specify the p2m list for the Xen tools has been added:
instead of a tree structure linked via mfns, the virtual address of a
linear p2m list, the cr3 value of the related address space and the size
of the p2m list can be specified by the guest (added by commit
50bd1f0825339dfacde471df7664729216fc46e3).

Guests implementing this new interface need to know, of course, whether
the Xen tools are capable to use the new interface instead of the old
p2m tree interface. Otherwise a guest using only the new interface with
the virtual mapped linear p2m list on a machine with old Xen tools not
supporting this interface could not be restored or migrated.

The added flag in the start info indicates the Xen tool's capability to
use the new interface enabling the guest to omit the p2m tree and thus
to support more than 512 GB of RAM.

Signed-off-by: Juergen Gross 
---
 xen/include/public/xen.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 3703c39..66c09a3 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -773,10 +773,12 @@ typedef struct start_info start_info_t;
 #endif /* XEN_HAVE_PV_GUEST_ENTRY */
 
 /* These flags are passed in the 'flags' field of start_info_t. */
-#define SIF_PRIVILEGED(1<<0)  /* Is the domain privileged? */
-#define SIF_INITDOMAIN(1<<1)  /* Is this the initial control domain? */
-#define SIF_MULTIBOOT_MOD (1<<2)  /* Is mod_start a multiboot module? */
-#define SIF_MOD_START_PFN (1<<3)  /* Is mod_start a PFN? */
+#define SIF_PRIVILEGED  (1<<0)  /* Is the domain privileged? */
+#define SIF_INITDOMAIN  (1<<1)  /* Is this the initial control domain? */
+#define SIF_MULTIBOOT_MOD   (1<<2)  /* Is mod_start a multiboot module? */
+#define SIF_MOD_START_PFN   (1<<3)  /* Is mod_start a PFN? */
+#define SIF_VIRT_P2M_4TOOLS (1<<4)  /* Do Xen tools understand a virt. mapped 
*/
+/* P->M making the 3 level tree obsolete? 
*/
 #define SIF_PM_MASK   (0xFF<<8) /* reserve 1 byte for xen-pm options */
 
 /*
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/7] Some (not only) cpupool related fixes and improvements

2015-03-18 Thread Ian Campbell
On Fri, 2015-03-13 at 12:08 +0100, Dario Faggioli wrote:
> Dario Faggioli (7):
>   xl: turn some int local variable into bool
>   xl: add -c/--cpupool option to `xl list'
>   libxl: introduce libxl_cpupool_cpu{add,remove}_cpumap()
>   xl: enable using ranges of pCPUs when manipulating cpupools
>   xl: enable using ranges of pCPUs when creating cpupools
>   xl: make error reporting of cpupool subcommands consistent
>   xl: use libxl_cpupoolinfo_list_free() in main_cpupoolnumasplit

Applied with Wei's acks.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code

2015-03-18 Thread Jan Beulich
>>> On 18.03.15 at 16:26,  wrote:
> In both cases there's a slight risk in using system_state to determine
> whether to rely on cpu_data[] or not, because there's actually a window
> for each processor after system_state == SYS_STATE_smp_boot where
> cpu_data[] is *not* initialized, but it's not obvious from looking at
> the data itself.  If the callback mechanisms ever change order with the
> cpu_data[] initializations in the future, we risk a situation where
> credit2 silently regresses to using a single massive runqueue.

But such an ordering change is extremely unlikely, since the
CPU_STARTING notification specifically tells you that the given
CPU is now ready for normal use, which includes it having its
topology related data set up.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] dpci: Put the dpci back on the list if scheduled from another CPU.

2015-03-18 Thread Jan Beulich
>>> On 18.03.15 at 14:58,  wrote:
> On Wed, Mar 18, 2015 at 07:38:12AM +, Jan Beulich wrote:
>> >>> On 17.03.15 at 18:15,  wrote:
>> > The latest one (please see attached) would cause an dead-lock iff
>> > on the CPU we are running the softirq and an do_IRQ comes for the
>> > exact dpci we are in process of executing.
>> 
>> I'm afraid I'm not seeing it - please explain.
> 
> Lets assume that the device is an PCIe with MSI. We have only one
> VCPU in the guest.
> 
> We receive the first interrupt, end up going:
> vmx_vmexit_handler
>  - case EXIT_REASON_EXTERNAL_INTERRUPT
>\- vmx_do_extint
>   \- do_IRQ
>   \- __do_IRQ_guest
>   \- hvm_do_IRQ_dpci
>  \- raise_softirq_for
>   [DPCI_SOFTIRQ bit set]
> vmx_process_softirq
>  sti
>  do_softirq
>-\ __do_sofitq_
> \- dpci_softirq
> -\ hvm_dirq_assist
>[state is 'running']
> 
> [Same vector comes in]

Is that indeed possible? Afaict nothing in the code sequence above
ack-ed the interrupt, and hence another one can't come in.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code

2015-03-18 Thread George Dunlap
On 03/18/2015 03:59 PM, Jan Beulich wrote:
 On 18.03.15 at 16:26,  wrote:
>> In both cases there's a slight risk in using system_state to determine
>> whether to rely on cpu_data[] or not, because there's actually a window
>> for each processor after system_state == SYS_STATE_smp_boot where
>> cpu_data[] is *not* initialized, but it's not obvious from looking at
>> the data itself.  If the callback mechanisms ever change order with the
>> cpu_data[] initializations in the future, we risk a situation where
>> credit2 silently regresses to using a single massive runqueue.
> 
> But such an ordering change is extremely unlikely, since the
> CPU_STARTING notification specifically tells you that the given
> CPU is now ready for normal use, which includes it having its
> topology related data set up.

I didn't mean so much that CPU_STARTING might change meaning, but that
someone looking only at schedule.c might think that it would make sense
to call alloc_pdata() somewhere else, before cpu_data[] had been
populated.  Even if they look at init_pcpu() in credit2, they're
unlikely to know that cpu_to_socket() isn't valid until after
boot_secondary() has happened; and if they try it and boot it,
everything will seem to work perfectly for credit2 -- except that there
will be a single global runqueue.

If you, Dario and I don't happen to catch it -- or don't happen to
remember the exact details of the constraints -- nobody may notice until
a year later.

This is the point of having ASSERTs -- so that we don't have to worry
about remembering and catching all these crazy constraints.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] (v2) VT-d Posted-interrupt (PI) design for XEN

2015-03-18 Thread Konrad Rzeszutek Wilk
On Wed, Mar 18, 2015 at 12:44:21PM +, Wu, Feng wrote:
> VT-d Posted-interrupt (PI) design for XEN
> 
> Background
> ==
> With the development of virtualization, there are more and more device
> assignment requirements. However, today when a VM is running with
> assigned devices (such as, NIC), external interrupt handling for the assigned
> devices always needs VMM intervention.
> 
> VT-d Posted-interrupt is a more enhanced method to handle interrupts
> in the virtualization environment. Interrupt posting is the process by
> which an interrupt request is recorded in a memory-resident
> posted-interrupt-descriptor structure by the root-complex, followed by
> an optional notification event issued to the CPU complex.
> 
> With VT-d Posted-interrupt we can get the following advantages:
> - Direct delivery of external interrupts to running vCPUs without VMM
> intervention


I hadn't digged deep in what Xen has currently - but I would assume that
this is exactly what we have now in Xen?

Hm, actually we seem to be still invoking the hypervisor on the
interrupts  -except that if we need to dispatch it to another CPU
using an normal vector to do so - which would still cause the
hypervisor to be invoked? Or does it actually go straight in the
guest?

So what kind of support do we currently have in Xen from posted
interrupt? Could you add a bit about this in the background please?

> - Decrease the interrupt migration complexity. On vCPU migration, software
> can atomically co-migrate all interrupts targeting the migrating vCPU. For
> virtual machines with assigned devices, migrating a vCPU across pCPUs
> either incur the overhead of forwarding interrupts in software (e.g. via VMM
> generated IPIS), or complexity to independently migrate each interrupt 
> targeting
> the vCPU to the new pCPU. However, after enabling VT-d PI, the destination 
> vCPU
> of an external interrupt from assigned devices is stored in the IRTE (i.e.
> Posted-interrupt Descriptor Address), when vCPU is migrated to another pCPU,
> we will set this new pCPU in the 'NDST' filed of Posted-interrupt descriptor, 
> this
> make the interrupt migration automatic.
> 
> 
> Posted-interrupt Introduction
> 
> There are two components to the Posted-interrupt architecture:
> Processor Support and Root-Complex Support
> 
> - Processor Support
> Posted-interrupt processing is a feature by which a processor processes
> the virtual interrupts by recording them as pending on the virtual-APIC
> page.
> 
> Posted-interrupt processing is enabled by setting the "process posted
> interrupts" VM-execution control. The processing is performed in response
> to the arrival of an interrupt with the posted-interrupt notification vector.
> In response to such an interrupt, the processor processes virtual interrupts
> recorded in a data structure called a posted-interrupt descriptor.
> 
> More information about APICv and CPU-side Posted-interrupt, please refer
> to Chapter 29, and Section 29.6 in the Intel SDM:
> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
> 
> - Root-Complex Support
> Interrupt posting is the process by which an interrupt request (from IOAPIC
> or MSI/MSIx capable sources) is recorded in a memory-resident
> posted-interrupt-descriptor structure by the root-complex, followed by
> an optional notification event issued to the CPU complex. The interrupt
> request arriving at the root-complex carry the identity of the interrupt
> request source and a 'remapping-index'. The remapping-index is used to
> look-up an entry from the memory-resident interrupt-remap-table. Unlike
> with interrupt-remapping, the interrupt-remap-table-entry for a posted-
> interrupt, specifies a virtual-vector and a pointer to the posted-interrupt
> descriptor. The virtual-vector specifies the vector of the interrupt to be
> recorded in the posted-interrupt descriptor. The posted-interrupt descriptor
> hosts storage for the virtual-vectors and contains the attributes of the
> notification event (interrupt) to be issued to the CPU complex to inform
> CPU/software about pending interrupts recorded in the posted-interrupt
> descriptor.
> 
> More information about VT-d PI, please refer to
> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
> 
> Important Definitions
> ==
> There are some changes to IRTE and posted-interrupt descriptor after
> VT-d PI is introduced:

s/is/was/

> IRTE:
> Posted-interrupt Descriptor Address: the address of the posted-interrupt 
> descriptor
> Virtual Vector: the guest vector of the interrupt
> URG: indicates if the interrupt is urgent
> 
> Posted-interrupt descriptor:
> The Posted Interrupt Descriptor hosts the following fields:
> Posted Interrupt Request (PIR): Provide storage for posting (recording) 
> interrupts (one bit
> per vector, for up to 256 vectors).
> 
> Outstanding

Re: [Xen-devel] [PATCH] dpci: Put the dpci back on the list if scheduled from another CPU.

2015-03-18 Thread Konrad Rzeszutek Wilk
On Wed, Mar 18, 2015 at 04:06:37PM +, Jan Beulich wrote:
> >>> On 18.03.15 at 14:58,  wrote:
> > On Wed, Mar 18, 2015 at 07:38:12AM +, Jan Beulich wrote:
> >> >>> On 17.03.15 at 18:15,  wrote:
> >> > The latest one (please see attached) would cause an dead-lock iff
> >> > on the CPU we are running the softirq and an do_IRQ comes for the
> >> > exact dpci we are in process of executing.
> >> 
> >> I'm afraid I'm not seeing it - please explain.
> > 
> > Lets assume that the device is an PCIe with MSI. We have only one
> > VCPU in the guest.
> > 
> > We receive the first interrupt, end up going:
> > vmx_vmexit_handler
> >  - case EXIT_REASON_EXTERNAL_INTERRUPT
> >\- vmx_do_extint
> > \- do_IRQ
> >   \- __do_IRQ_guest
> >   \- hvm_do_IRQ_dpci
> >  \- raise_softirq_for
> > [DPCI_SOFTIRQ bit set]
> > vmx_process_softirq
> >  sti
> >  do_softirq
> >-\ __do_sofitq_
> > \- dpci_softirq
> > -\ hvm_dirq_assist
> >[state is 'running']
> > 
> > [Same vector comes in]
> 
> Is that indeed possible? Afaict nothing in the code sequence above
> ack-ed the interrupt, and hence another one can't come in.

The do_IRQ acks it:

 854 spin_lock(&desc->lock);
 855 desc->handler->ack(desc);

which is:
 424 static void ack_maskable_msi_irq(struct irq_desc *desc)
 425 {
 426 ack_nonmaskable_msi_irq(desc);
 427 ack_APIC_irq(); /* ACKTYPE_NONE */
 428 }


.. snip..
 857 if ( likely(desc->status & IRQ_GUEST) )

..
 886 __do_IRQ_guest(irq);


> 
> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 01/14] libxc: Replaces tabs with spaces in xc_cpupool_freeinfo

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> The goto looks very wrong when the rest of the code
> has spaces.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-netback: making the bandwidth limiter runtime settable

2015-03-18 Thread Imre Palik
On 03/17/15 12:17, Wei Liu wrote:
> On Fri, Mar 13, 2015 at 01:51:05PM +0100, Imre Palik wrote:
>> From: "Palik, Imre" 
>>
>> With the current netback, the bandwidth limiter's parameters are only
>> settable during vif setup time.  This patch register a watch on them, and
>> thus makes them runtime changeable.
>>
>> When the watch fires, the timer is reset.  The timer's mutex is used for
>> fencing the change.
>>
> 
> I think this is a valid idea.  Just that this commit message is not
> complete. It doesn't describe everything this patch does.
> 
>> Cc: Anthony Liguori 
>> Signed-off-by: Imre Palik 
>> ---
> [...]
>>  queue->rx_queue_max = XENVIF_RX_QUEUE_BYTES;
>> diff --git a/drivers/net/xen-netback/netback.c 
>> b/drivers/net/xen-netback/netback.c
>> index cab9f52..bcc1880 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -641,7 +641,7 @@ static void tx_add_credit(struct xenvif_queue *queue)
>>  queue->remaining_credit = min(max_credit, max_burst);
>>  }
>>  
>> -static void tx_credit_callback(unsigned long data)
>> +void xenvif_tx_credit_callback(unsigned long data)
> 
> Please keep this function static.

The trouble with that, is that now I am initialising credit_timeout.function in
drivers/net/xen-netback/interface.c .

The reason for the change is that mod_timer_pending() hits a BUG() if
the timeout function is not initialised.

> 
> And say in the commit message you change tx_credit_callback to a better
> name.
> 
>>  {
>>  struct xenvif_queue *queue = (struct xenvif_queue *)data;
>>  tx_add_credit(queue);
>> @@ -1170,7 +1170,7 @@ static bool tx_credit_exceeded(struct xenvif_queue 
>> *queue, unsigned size)
>>  queue->credit_timeout.data =
>>  (unsigned long)queue;
>>  queue->credit_timeout.function =
>> -tx_credit_callback;
>> +xenvif_tx_credit_callback;
>>  mod_timer(&queue->credit_timeout,
>>next_credit);
> [...]
>> @@ -594,13 +597,9 @@ static void xen_net_read_rate(struct xenbus_device *dev,
>>  unsigned long b, u;
>>  char *ratestr;
>>  
>> -/* Default to unlimited bandwidth. */
>> -*bytes = ~0UL;
>> -*usec = 0;
>> -
>>  ratestr = xenbus_read(XBT_NIL, dev->nodename, "rate", NULL);
>>  if (IS_ERR(ratestr))
>> -return;
>> +goto reset;
>>  
>>  s = ratestr;
>>  b = simple_strtoul(s, &e, 10);
>> @@ -612,15 +611,21 @@ static void xen_net_read_rate(struct xenbus_device 
>> *dev,
>>  if ((s == e) || (*e != '\0'))
>>  goto fail;
>>  
>> +kfree(ratestr);
>> +
>>  *bytes = b;
>>  *usec = u;
>>  
>> -kfree(ratestr);
>>  return;
>>  
>> - fail:
>> +fail:
>>  pr_warn("Failed to parse network rate limit. Traffic unlimited.\n");
>>  kfree(ratestr);
>> +
>> +reset:
>> +/* Default to unlimited bandwidth. */
>> +*bytes = ~0UL;
>> +*usec = 0;
>>  }
>>  
> 
> Any reason you modify this function? It is still doing the exact same
> thing, right?

These changes made sense before the channel support, but not anymore.
I will roll them back.

> 
>>  static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[])
>> @@ -645,6 +650,59 @@ static int xen_net_read_mac(struct xenbus_device *dev, 
>> u8 mac[])
>>  return 0;
>>  }
>>  
>> +static void xen_net_rate_changed(struct xenbus_watch *watch,
>> +const char **vec, unsigned int len)
>> +{
>> +struct xenvif *vif = container_of(watch, struct xenvif, credit_watch);
>> +struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
>> +unsigned long   credit_bytes;
>> +unsigned long   credit_usec;
>> +unsigned int queue_index;
>> +
>> +xen_net_read_rate(dev, &credit_bytes, &credit_usec);
>> +for (queue_index = 0; queue_index < vif->num_queues; queue_index++) {
>> +struct xenvif_queue *queue = &vif->queues[queue_index];
>> +
>> +queue->credit_bytes = credit_bytes;
>> +queue->credit_usec = credit_usec;
>> +if (!mod_timer_pending(&queue->credit_timeout, jiffies) &&
>> +queue->remaining_credit > queue->credit_bytes) {
>> +queue->remaining_credit = queue->credit_bytes;
>> +}
>> +}
>> +}
>> +
>> +static int xen_register_watchers(struct xenbus_device *dev, struct xenvif 
>> *vif)
>> +{
>> +int err = 0;
>> +char *node;
>> +unsigned maxlen = strlen(dev->nodename) + sizeof("/rate");
>> +
>> +node = kmalloc(maxlen, GFP_KERNEL);
>> +if (!node)
>> +return -ENOMEM;
>> +sprintf(node, "%s/rate", dev->nodename);
> 
> Please use snprintf. (Though I can see using sprintf is fine here but I
> want the code to be a bit future proof.)
> 
>> +vif->credit_watch.node = node;
>> +vif->credit_watch.callback = xen_net_rate_changed;
>> +err = register_xenbus_watch(&vif->credit_watch);
>> +if (err

Re: [Xen-devel] [PATCH 05/14] libxl: xc_physdev_map return -1 and populate errno.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> The users of these (qemu) check for a negative value
> so we are safe in regards to that. However they
> also use the return value to inform the user of the
> error.

IIRC I saw a qemu patch go past to fix the callers?

> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 

> ---
>  tools/libxc/xc_physdev.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxc/xc_physdev.c b/tools/libxc/xc_physdev.c
> index cf02d85..9b064b8 100644
> --- a/tools/libxc/xc_physdev.c
> +++ b/tools/libxc/xc_physdev.c
> @@ -43,8 +43,10 @@ int xc_physdev_map_pirq(xc_interface *xch,
>  struct physdev_map_pirq map;
>  
>  if ( !pirq )
> -return -EINVAL;
> -
> +{
> +errno = EINVAL;
> +return -1;
> +}
>  memset(&map, 0, sizeof(struct physdev_map_pirq));
>  map.domid = domid;
>  map.type = MAP_PIRQ_TYPE_GSI;
> @@ -72,8 +74,10 @@ int xc_physdev_map_pirq_msi(xc_interface *xch,
>  struct physdev_map_pirq map;
>  
>  if ( !pirq )
> -return -EINVAL;
> -
> +{
> +errno = EINVAL;
> +return -1;
> +}
>  memset(&map, 0, sizeof(struct physdev_map_pirq));
>  map.domid = domid;
>  map.type = MAP_PIRQ_TYPE_MSI;



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 06/14] libxl: Return negative value and propagate errno for xc_offline_page API

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> Instead of returning -Exx we now return -1 for error.
> We could stash the -Exx values in errno values but why - the
> underlaying functions we call all stash the proper errno
> value. Hence we just propagate it up wherver it is needed.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 07/14] libxl: Fix xc_pm API calls to return negative error and stash error in errno.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> Oddly enough the user of this API did the right thing -
> check for return being negative and used 'errno' for the
> real error.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/14] libxl: Fix xc_tmem_control to return proper error.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> The API returns now negative values on error and stashes
> the error in errno. Fix the user of this API.
> 
> The 'xc_hypercall_bounce_pre' can fail - and if so it will
> stash its errno values - no need to over-write it.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxc/xc_tmem.c  | 14 ++
>  tools/xenstat/libxenstat/src/xenstat.c |  5 +++--
>  2 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c
> index 3261e10..02797bf 100644
> --- a/tools/libxc/xc_tmem.c
> +++ b/tools/libxc/xc_tmem.c
> @@ -73,11 +73,14 @@ int xc_tmem_control(xc_interface *xch,
>  if ( subop == TMEMC_LIST && arg1 != 0 )
>  {
>  if ( buf == NULL )
> -return -EINVAL;
> +{
> +errno = EINVAL;
> +return -1;
> +}
>  if ( xc_hypercall_bounce_pre(xch, buf) )
>  {
>  PERROR("Could not bounce buffer for tmem control hypercall");
> -return -ENOMEM;
> +return -1;
>  }
>  }
>  
> @@ -118,11 +121,14 @@ int xc_tmem_control_oid(xc_interface *xch,
>  if ( subop == TMEMC_LIST && arg1 != 0 )
>  {
>  if ( buf == NULL )
> -return -EINVAL;
> +{
> +errno = EINVAL;
> +return -1;
> +}
>  if ( xc_hypercall_bounce_pre(xch, buf) )
>  {
>  PERROR("Could not bounce buffer for tmem control (OID) 
> hypercall");
> -return -ENOMEM;
> +return -1;
>  }
>  }
>  
> diff --git a/tools/xenstat/libxenstat/src/xenstat.c 
> b/tools/xenstat/libxenstat/src/xenstat.c
> index 8072a90..bf257ef 100644
> --- a/tools/xenstat/libxenstat/src/xenstat.c
> +++ b/tools/xenstat/libxenstat/src/xenstat.c
> @@ -166,6 +166,7 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, 
> unsigned int flags)
>   xc_domaininfo_t domaininfo[DOMAIN_CHUNK_SIZE];
>   int new_domains;
>   unsigned int i;
> + long rc;
>  
>   /* Create the node */
>   node = (xenstat_node *) calloc(1, sizeof(xenstat_node));
> @@ -189,9 +190,9 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, 
> unsigned int flags)
>   node->free_mem = ((unsigned long long)physinfo.free_pages)
>   * handle->page_size;
>  
> - node->freeable_mb = (long)xc_tmem_control(handle->xc_handle, -1,
> + rc = (long)xc_tmem_control(handle->xc_handle, -1,
>   TMEMC_QUERY_FREEABLE_MB, -1, 0, 0, 0, NULL);

Why the cast, why not make rc an int since that is what xc_tmem_control
takes and you don't seem to use the full width anyway?

Or alternatively fix the return type of xc_tmem_control.

> -
> + node->freeable_mb = (rc < 0) ? 0 : rc;

Should rc not get propagated into an error for the caller?

>   /* malloc(0) is not portable, so allocate a single domain.  This will
>* be resized below. */
>   node->domains = malloc(sizeof(xenstat_domain));



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] xentop: add support for qdisks

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-11 at 11:51 -0600, Charles Arnold wrote:
> Now that Xen uses qdisks by default and qemu does not write out
> statistics to sysfs this patch queries the QMP for disk statistics.
> 
> This patch depends on libyajl for parsing statistics returned from
> QMP. The runtime requires libyajl 2.0.3 or newer for required bug
> fixes in yajl_tree_parse().

Elsewhere we currently support libyajl1 even, which means that this is
all configure tests for.

You say bug fixes here, but the code comment says:
 /* Use libyajl version 2.1.x or newer for the tree parser feature with bug 
fixes */

which suggests it perhaps didn't even exist in earlier versions. Also I
note the quoted versions differ, FWIW.

Whether the interface exists (even in buggy form) or not in older
versions is important because if it doesn't exist then that would be a
build failure, which we would want to avoid.

Whereas a functional failure would perhaps be tolerable. However, given
the existing HAVE_YAJL_YAJL_VERSION_H define I think the code could
easily check if the YAJL library is good enough at compile time and stub
itself out -- i.e. not report qdisk stats if the yajl doesn't do the
job.

My second concern here is with the use of /var/run/xen/qmp-libxl-%i from
outside of libxl. I can't remember if qemu is safe against multiple
users of the socket. ISTR asking Anthony this before, but I don't recall
the answer, sorry :-(

Even if it is strictly speaking ok it seems a bit warty to do it, but
perhaps for an in-tree user like libxenstat it is tolerable.
Alternatively we could (relatively) easily arrange for their to be a
second qemp-libxenstat-%i socket, assuming the qemu overhead of a second
one is sane.

Would it be possible to include somewhere, either in a code comment or
in the changelog, an example of the JSON response to the QMP commands.

(I'm also consistently surprised by the lack of a qmp client library,
but that's not your fault!)

> diff --git a/tools/xenstat/libxenstat/src/xenstat.c 
> b/tools/xenstat/libxenstat/src/xenstat.c
> index 8072a90..f3847be 100644
> --- a/tools/xenstat/libxenstat/src/xenstat.c
> +++ b/tools/xenstat/libxenstat/src/xenstat.c
> @@ -657,6 +657,24 @@ static void xenstat_uninit_xen_version(xenstat_handle * 
> handle)
>   * VBD functions
>   */
>  
> +/* Save VBD information */
> +xenstat_vbd *xenstat_save_vbd(xenstat_domain *domain, xenstat_vbd *vbd)
> +{
> +if (domain->vbds == NULL) {
> +domain->num_vbds = 1;
> +domain->vbds = malloc(sizeof(xenstat_vbd));
> +} else {
> +domain->num_vbds++;
> +domain->vbds = realloc(domain->vbds,
> +   domain->num_vbds *
> +   sizeof(xenstat_vbd));
> +}

FYI realloc handles the old pointer being NULL just fine, so you don't
need to special case that so long as num_vbds starts out initialised to
0.

Also, if realloc returns NULL then you need to have remembered the old
value to free it, else it gets leaked.

> @@ -477,18 +480,10 @@ int xenstat_collect_vbds(xenstat_node * node)
>   continue;
>   }
>  
> - if (domain->vbds == NULL) {
> - domain->num_vbds = 1;
> - domain->vbds = malloc(sizeof(xenstat_vbd));
> - } else {
> - domain->num_vbds++;
> - domain->vbds = realloc(domain->vbds,
> -domain->num_vbds *
> -sizeof(xenstat_vbd));
> - }

Oh, I see my comments above were actually on the old code you were
moving.


> + /* Use libyajl version 2.1.x or newer for the tree parser feature with 
> bug fixes */
> + if ((info = yajl_tree_parse((char *)qmp_stats, errbuf, sizeof(errbuf))) 
> == NULL) {

You don't want to log something using errbuf? If not then it may as well
be as small as possible.

> + /* Use libyajl version 2.0.3 or newer for the tree parser feature */
> + if ((info = yajl_tree_parse((char *)stats_buf, errbuf, sizeof(errbuf))) 
> == NULL)

Likewise.

> +/* Get all the active domains */

actually only up to 1024 of them ;-)

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 05/14] libxl: xc_physdev_map return -1 and populate errno.

2015-03-18 Thread Konrad Rzeszutek Wilk
On Wed, Mar 18, 2015 at 04:21:52PM +, Ian Campbell wrote:
> On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> > The users of these (qemu) check for a negative value
> > so we are safe in regards to that. However they
> > also use the return value to inform the user of the
> > error.
> 
> IIRC I saw a qemu patch go past to fix the callers?
> 

Yes and Stefano has already Acked them.
> > Signed-off-by: Konrad Rzeszutek Wilk 
> 
> Acked-by: Ian Campbell 

Yeey!
> 
> > ---
> >  tools/libxc/xc_physdev.c | 12 
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/tools/libxc/xc_physdev.c b/tools/libxc/xc_physdev.c
> > index cf02d85..9b064b8 100644
> > --- a/tools/libxc/xc_physdev.c
> > +++ b/tools/libxc/xc_physdev.c
> > @@ -43,8 +43,10 @@ int xc_physdev_map_pirq(xc_interface *xch,
> >  struct physdev_map_pirq map;
> >  
> >  if ( !pirq )
> > -return -EINVAL;
> > -
> > +{
> > +errno = EINVAL;
> > +return -1;
> > +}
> >  memset(&map, 0, sizeof(struct physdev_map_pirq));
> >  map.domid = domid;
> >  map.type = MAP_PIRQ_TYPE_GSI;
> > @@ -72,8 +74,10 @@ int xc_physdev_map_pirq_msi(xc_interface *xch,
> >  struct physdev_map_pirq map;
> >  
> >  if ( !pirq )
> > -return -EINVAL;
> > -
> > +{
> > +errno = EINVAL;
> > +return -1;
> > +}
> >  memset(&map, 0, sizeof(struct physdev_map_pirq));
> >  map.domid = domid;
> >  map.type = MAP_PIRQ_TYPE_MSI;
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code

2015-03-18 Thread Dario Faggioli
On Wed, 2015-03-18 at 16:08 +, George Dunlap wrote:
> On 03/18/2015 03:59 PM, Jan Beulich wrote:
>  On 18.03.15 at 16:26,  wrote:
> >> In both cases there's a slight risk in using system_state to determine
> >> whether to rely on cpu_data[] or not, because there's actually a window
> >> for each processor after system_state == SYS_STATE_smp_boot where
> >> cpu_data[] is *not* initialized, but it's not obvious from looking at
> >> the data itself.  If the callback mechanisms ever change order with the
> >> cpu_data[] initializations in the future, we risk a situation where
> >> credit2 silently regresses to using a single massive runqueue.
> > 
> > But such an ordering change is extremely unlikely, since the
> > CPU_STARTING notification specifically tells you that the given
> > CPU is now ready for normal use, which includes it having its
> > topology related data set up.

> Even if they look at init_pcpu() in credit2, they're
> unlikely to know that cpu_to_socket() isn't valid until after
> boot_secondary() has happened; and if they try it and boot it,
> everything will seem to work perfectly for credit2 -- except that there
> will be a single global runqueue.
> 
Agreed... and I'm still replying to your (George's) email, but just
wanted to say that, for this, having system_state referenced from
Credit2's code would help making it clear the fact that code has
dependencies with the boot process, wouldn't it?

> If you, Dario and I don't happen to catch it -- or don't happen to
> remember the exact details of the constraints -- nobody may notice until
> a year later.
> 
:-)

> This is the point of having ASSERTs -- so that we don't have to worry
> about remembering and catching all these crazy constraints.
> 
I'm all for finding a way for ASSERT()ing something meaningful to this
effect.

Regards,
Dario


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 02/14] libxl: Propagate errno from hypercall instead of anything else.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> After we have done the hypercall - the errno has the failure
> code. However our usage of pthread and munmap can trigger them
> to manipulate the errno with their failure values. That would
> be bad as what we care about is just the hypercall error value.
> 
> Another solution to this would be to save the 'errno' from
> pthread/munmap/madvise as an extra parameter to be analyzed
> later. However the call-sites above us do not care about it.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 

> ---
>  tools/libxc/xc_freebsd_osdep.c | 3 +++
>  tools/libxc/xc_hcall_buf.c | 6 ++
>  tools/libxc/xc_linux_osdep.c   | 3 +++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/tools/libxc/xc_freebsd_osdep.c b/tools/libxc/xc_freebsd_osdep.c
> index 151d3bf..bb6d240 100644
> --- a/tools/libxc/xc_freebsd_osdep.c
> +++ b/tools/libxc/xc_freebsd_osdep.c
> @@ -125,10 +125,13 @@ static void 
> freebsd_privcmd_free_hypercall_buffer(xc_interface *xch,
>int npages)
>  {
>  
> +int saved_errno = errno;
>  /* Unlock pages */
>  munlock(ptr, npages * XC_PAGE_SIZE);
>  
>  munmap(ptr, npages * XC_PAGE_SIZE);
> +/* We MUST propagate the hypercall errno, not unmap calls. */

If you have cause to resend then:

 /* We MUST Propagate the hypercall's ernno, not the unmap call's. */.

> +/* Ignore the pthread errors. */

s/the //

(both throughout).

Not a big deal but if it needs resending anyway

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 03/14] libxc: xc_core_arch_memory_map_get populate errno

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> with proper value (ENOMEM) when reporting failures.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
> [v1: errno before using PERROR]
> ---
>  tools/libxc/xc_core_arm.c | 1 +
>  tools/libxc/xc_core_x86.c | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/tools/libxc/xc_core_arm.c b/tools/libxc/xc_core_arm.c
> index 16508e7..34185cf 100644
> --- a/tools/libxc/xc_core_arm.c
> +++ b/tools/libxc/xc_core_arm.c
> @@ -54,6 +54,7 @@ xc_core_arch_memory_map_get(xc_interface *xch, struct 
> xc_core_arch_context *unus
>  map = malloc(sizeof(*map));
>  if ( map == NULL )
>  {
> +errno = ENOMEM;

http://pubs.opengroup.org/onlinepubs/9699919799/functions/malloc.html#tag_16_311
 says that malloc will set errno itself:
"Otherwise, it shall return a null pointer [CX] [Option Start]
and set errno to indicate the error"



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 04/14] libxc: Fix xc_domain_get_tsc_info to return -1 instead of -Exx.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> We don't need to put fill errno because xc_hypercall_buffer_alloc
> fills the errno with the appropiate errno values and we just
> need to pass them up the stack.

"appropriate"

> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 13/14] libxl: Don't assign return value to errno for E820 get/set xc_ calls.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:

> We should be using the errno that the hypercall left
> instead of overwritting it with the return value.

"overwriting"

> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Any work on sharing of large multi-page segments?

2015-03-18 Thread George Dunlap
On Tue, Mar 17, 2015 at 11:45 PM, Andrew Warkentin  wrote:
> On 3/17/15, George Dunlap  wrote:
>> Any deduplication code would run in as a process probably in domain 0,
>> and may be somewhat slow; but the actual mechanism of sharing is a
>> generic mechanism in the hypervisor which any client can use.  Jan is
>> suggesting that you might be able to use that interface to
>> pro-actively tell Xen about the memory pages shared between your
>> various domains.
>>
>
> I wasn't quite sure if it's generic enough to use to implement shared
> segments, or if it were specific to deduplication at the hypervisor
> level.

I haven't used the shared memory interface myself, but we designed it
I'm pretty sure that was the intention.  An idea was discussed, for
instance, was that scanning random pages for duplicates has turned out
to actually be not much benefit for VMWare; but that a more promising
avenue might be hooking into the block layer, so that if VM A and VM B
have disks that share an image base, and block X is shared, that if VM
A reads block X and then VM B reads block X, instead of issuing a
second DMA into VM B's memory, the disk layer can just tell Xen,
"please share this page copy-on-write between VM A and VM B".

Remember also that Jan was recommending this method for sharing
read-only shared data and executable segments, not for areas for
communication (i.e.., where both are going to want to write to the
memory and have the other side see the updates).  For that you've got
to go with grant tables.

Anyway, I think it's worth looking into.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] Fix libxc return -E misusage.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> Hey,
> 
> Please see the following set of patches which fix the various
> usage of return -Exx instead of return -1 for errors (and
> stashing the error value in errno).

All of them, or just a subset of the callpaths?

If it's all of them then I'm amazed it was only 14 patches!

Either way, well done and thanks for draining this swamp even a bit!

> It also cleans up some of the invalid usage of errno (as the
> underlaying calls already stash errno values) - and we just
> need to bubble them up.
> 
> Lastly it also wraps an errno-invisibility shield against
> xc_hypercall_bounce_* calls so that any errors in those
> won't over-write the hypercall ones.
> 
> 
>  tools/libxc/xc_core_arm.c  | 15 --
>  tools/libxc/xc_core_x86.c  | 22 +++---
>  tools/libxc/xc_cpupool.c   |  4 +--
>  tools/libxc/xc_dom_x86.c   |  7 +++--
>  tools/libxc/xc_domain.c|  2 +-
>  tools/libxc/xc_domain_save.c   |  8 -
>  tools/libxc/xc_freebsd_osdep.c |  3 ++
>  tools/libxc/xc_hcall_buf.c |  6 
>  tools/libxc/xc_linux_osdep.c   |  3 ++
>  tools/libxc/xc_offline_page.c  | 36 +--
>  tools/libxc/xc_physdev.c   | 12 +---
>  tools/libxc/xc_pm.c| 54 
> ++
>  tools/libxc/xc_private.c   |  4 +--
>  tools/libxc/xc_tmem.c  | 14 ++---
>  tools/libxc/xg_save_restore.h  |  3 ++
>  tools/libxl/libxl.c|  4 +--
>  tools/libxl/libxl_x86.c|  9 ++
>  tools/misc/xen-hptool.c|  6 ++--
>  tools/misc/xen-mfndump.c   |  2 +-
>  tools/tests/mem-sharing/memshrtool.c   | 12 ++--
>  tools/xenstat/libxenstat/src/xenstat.c |  5 ++--
>  21 files changed, 158 insertions(+), 73 deletions(-)
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen-netback: making the bandwidth limiter runtime settable

2015-03-18 Thread Wei Liu
On Wed, Mar 18, 2015 at 05:21:08PM +0100, Imre Palik wrote:
> On 03/17/15 12:17, Wei Liu wrote:
> > On Fri, Mar 13, 2015 at 01:51:05PM +0100, Imre Palik wrote:
> >> From: "Palik, Imre" 
> >>
> >> With the current netback, the bandwidth limiter's parameters are only
> >> settable during vif setup time.  This patch register a watch on them, and
> >> thus makes them runtime changeable.
> >>
> >> When the watch fires, the timer is reset.  The timer's mutex is used for
> >> fencing the change.
> >>
> > 
> > I think this is a valid idea.  Just that this commit message is not
> > complete. It doesn't describe everything this patch does.
> > 
> >> Cc: Anthony Liguori 
> >> Signed-off-by: Imre Palik 
> >> ---
> > [...]
> >>queue->rx_queue_max = XENVIF_RX_QUEUE_BYTES;
> >> diff --git a/drivers/net/xen-netback/netback.c 
> >> b/drivers/net/xen-netback/netback.c
> >> index cab9f52..bcc1880 100644
> >> --- a/drivers/net/xen-netback/netback.c
> >> +++ b/drivers/net/xen-netback/netback.c
> >> @@ -641,7 +641,7 @@ static void tx_add_credit(struct xenvif_queue *queue)
> >>queue->remaining_credit = min(max_credit, max_burst);
> >>  }
> >>  
> >> -static void tx_credit_callback(unsigned long data)
> >> +void xenvif_tx_credit_callback(unsigned long data)
> > 
> > Please keep this function static.
> 
> The trouble with that, is that now I am initialising credit_timeout.function 
> in
> drivers/net/xen-netback/interface.c .
> 

Oh, yes. I misread the hunk of common.h. Sorry about the noise.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.

2015-03-18 Thread Jan Beulich
>>> On 18.03.15 at 15:06,  wrote:
> On Wed, Mar 18, 2015 at 07:41:55AM +, Jan Beulich wrote:
>> >>> On 17.03.15 at 18:44,  wrote:
>> > As you can see to preserve the existing functionality such as
>> > being able to schedule N amount of interrupt injections
>> > for the N interrupts we might get - I modified '->masked'
>> > to be an atomic counter.
>> 
>> Why would that be? When an earlier interrupt wasn't fully handled,
>> real hardware wouldn't latch more than one further instance either.
> 
> We acknowledge the interrupt in the hypervisor - as in we call
> ->ack on the handler (which for MSI is an nop anyhow).

The case where ->ack is a nop (for the purposes here) is specifically
not a problem, as that means we defer ack-ing the LAPIC (hence
further instances can't show up).

> If the device is misconfigured and keeps on sending burst of
> interrupts every 10 msec for 1msec we can dead-lock.

How is this different from the hypervisor itself not being fast
enough to handle one instance before the next one shows up?
I've been trying to reconstruct the rationale for our current
treatment of maskable MSI sources (in that we ack them at the
LAPIC right away), but so far wasn't really successful (sadly
commit 5f4c1bb65e lacks any word of description other than
its title).

(Ill behaved devices shouldn't be handed to guests anyway.)

> Either way we should tell the guest about those interrupts.
>> 
>> > The end result is that we can still live-lock. Unless we:
>> >  - Drop on the floor the injection of N interrupts and
>> >just deliever at max one per VMX_EXIT (and not bother
>> >with interrupts arriving when we are in the VMX handler).
>> 
>> I'm afraid I again don't see the point here.
> 
> I am basing all of this on the assumption that we have
> many interrupts for the same device coming it - and we have
> not been able to tell the guest about it (the guest could
> be descheduled, too slow, etc) so that it can do what it
> needs to silence the device.

But that's the same as with the native hardware case: When there
are new interrupt instances before the earlier one was acked, at
most one will be seen at the point the interrupt becomes unmasked
again.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 09/14] libxl: Check xc_domain_maximum_gpfn for negative return values

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> Instead of assuming everything is always OK.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxc/xc_core_arm.c| 14 +++---
>  tools/libxc/xc_core_x86.c| 21 +
>  tools/libxc/xc_domain_save.c |  8 +++-
>  3 files changed, 35 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/libxc/xc_core_arm.c b/tools/libxc/xc_core_arm.c
> index 34185cf..654692a 100644
> --- a/tools/libxc/xc_core_arm.c
> +++ b/tools/libxc/xc_core_arm.c
> @@ -31,9 +31,13 @@ xc_core_arch_gpfn_may_present(struct xc_core_arch_context 
> *arch_ctxt,
>  }
>  
> 
> -static int nr_gpfns(xc_interface *xch, domid_t domid)
> +static int nr_gpfns(xc_interface *xch, domid_t domid, int *rc)

It would be more natural to return the rc directly and update a pointer
argument for the max gpfn on success.

> diff --git a/tools/libxc/xc_core_x86.c b/tools/libxc/xc_core_x86.c
> index b5d442d..426b90d 100644
> --- a/tools/libxc/xc_core_x86.c
> +++ b/tools/libxc/xc_core_x86.c
> @@ -36,9 +36,13 @@ xc_core_arch_gpfn_may_present(struct xc_core_arch_context 
> *arch_ctxt,
>  }
>  
> 
> -static int nr_gpfns(xc_interface *xch, domid_t domid)
> +static int nr_gpfns(xc_interface *xch, domid_t domid, int *rc)

Likewise, although perhaps you could consolidate those two into one
common wrapper? And then use it everywhere (there were some open-coded
ones later on).

Or even fix the prototype of xc_domain_maximum_gpfn turning it into said
helper, it doesn't look like it has too many callers.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 10/14] libxl: Check xc_maximum_ram_page for negative return values.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> Instead of assuming everything is always OK.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxc/xg_save_restore.h | 3 +++
>  tools/misc/xen-mfndump.c  | 2 +-
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxc/xg_save_restore.h b/tools/libxc/xg_save_restore.h
> index bdd9009..6d26a0a 100644
> --- a/tools/libxc/xg_save_restore.h
> +++ b/tools/libxc/xg_save_restore.h
> @@ -313,6 +313,9 @@ static inline int get_platform_info(xc_interface *xch, 
> uint32_t dom,
>  
>  *max_mfn = xc_maximum_ram_page(xch);
>  
> +if (*max_mfn < 0)
> +return 0;

*max_mfn is unsigned, so the negative would have been lost already.
Perhaps do the same sort of thing with a helper as the last patch?

> +
>  *hvirt_start = xen_params.virt_start;
>  
>  if ( xc_domain_get_guest_width(xch, dom, guest_width) != 0)
> diff --git a/tools/misc/xen-mfndump.c b/tools/misc/xen-mfndump.c
> index 0761f6e..81ef448 100644
> --- a/tools/misc/xen-mfndump.c
> +++ b/tools/misc/xen-mfndump.c
> @@ -184,7 +184,7 @@ int dump_ptes_func(int argc, char *argv[])
>  
>  /* Map M2P and obtain gpfn */
>  max_mfn = xc_maximum_ram_page(xch);
> -if ( (mfn > max_mfn) ||
> +if ( (max_mfn < 0 ) || (mfn > max_mfn) ||

Same here.

>   !(m2p_table = xc_map_m2p(xch, max_mfn, PROT_READ, NULL)) )
>  {
>  xc_unmap_domain_meminfo(xch, &minfo);



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/14] libxl: If xc_domain_add_to_physmap fails, include errno value

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:

$subject says libxl but means libxc.

Actually, now I notice it so do almost all of the patches! Once fixed
any acks I gave (or will give shortly) can still be applied.

> Instead of just the return value.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxc/xc_dom_x86.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
> index bf06fe4..20e379c 100644
> --- a/tools/libxc/xc_dom_x86.c
> +++ b/tools/libxc/xc_dom_x86.c
> @@ -865,7 +865,8 @@ static int map_grant_table_frames(struct xc_dom_image 
> *dom)
>  }
>  xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
>   "%s: mapping grant tables failed " "(pfn=0x%" PRIpfn
> - ", rc=%d)", __FUNCTION__, dom->total_pages + i, rc);
> + ", rc=%d, errno=%d)", __FUNCTION__,
> + dom->total_pages + i, rc ,errno);

s/ ,/, /

(hey, I drew a pretty picture)

With that fixed: Acked-by Ian Campbell 

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code

2015-03-18 Thread Dario Faggioli
On Wed, 2015-03-18 at 15:26 +, George Dunlap wrote:
> On 03/18/2015 08:53 AM, Dario Faggioli wrote:
> >>> @@ -1986,9 +1982,13 @@ static void init_pcpu(const struct scheduler *ops, 
> >>> int cpu)
> >>>  static void *
> >>>  csched2_alloc_pdata(const struct scheduler *ops, int cpu)
> >>>  {
> >>> -/* Check to see if the cpu is online yet */
> >>> -/* Note: cpu 0 doesn't get a STARTING callback */
> >>> -if ( cpu == 0 || cpu_to_socket(cpu) >= 0 )
> >>> +/*
> >>> + * Actual initialization is deferred to when the pCPU will be
> >>> + * online, via a STARTING callback. The only exception is
> >>> + * the boot cpu, which does not get such a notification, and
> >>> + * hence needs to be taken care of here.
> >>> + */
> >>> +if ( system_state == SYS_STATE_boot )
> >>>  init_pcpu(ops, cpu);
> 
> I think this will break adding a cpu to a cpupool after boot, won't it?
> 
Oh, yeah, that. Yes, I knew it, and probably should have mentioned it...
I assumed it was clear enough that I was asking an opinion about the
shown use of system_state, as opposed to using a particular init value
of cpu_data.phys_proc_id to the same effect, during the boot process.

> Don't we want effectively, "if ( is_boot_cpu() ||
> is_cpu_topology_set_up() )"?
> 
> is_cpu_topology_set_up() could be "system_state >= SYS_STATE_smp_boot" I
> suppose?
> 
That is what I was thinking. Alternatively, I thought we can keep the
'<= SYS_STATE_boot' part for recognizing that the call comes from within
the boot process, and actually do tweak the initialization/assignment
logic of phys_proc_id, as it was also mentioned previously in the thread
(and as you also mention below, IIUIC), to make it better represent
whether topology info are valid or not.

> Is "cpu == 0" the best test for "is_boot_cpu()", or is there another test?
> 
It is a valid test for that, right now, AFAICT, but Jan would like for
it to stop being one. :-D

> So let's step back for a minute, and let me see if I can describe the
> situation.
> 
Yeah... thanks for this, it's been extremely helpful, IMO!

> Unlike the other schedulers, credit2 needs to know the topology of a
> cpus when setting it up, so that we can place it in the proper runqueue.
> 
> Setting up the per-cpu structures would normally happen in alloc_pdata.
>  This is called in three different ways:
> 
> * During boot, on the boot processer, alloc_pdata is called from
> scheduler_init().
> 
> * During boot, on the secondary processors, alloc_pdata is called from
> a CPU_UP_PREPARE callback, which happens just before the cpu is actually
> brought up.
> 
> * When switching a cpu to a new cpupool after boot, alloc_pdata is also
> called from cpu_schedule_switch().
> 
> The "normal" place the cpu topology information can be found is in
> global the cpu_data[] array, typically accessed by the cpu_to_socket()
> or cpu_to_core() macros.
> 
> This topology information is written to cpu_data[] by
> smp_store_cpu_info().  For the boot cpu, it happens in
> smp_prepare_cpus();  For secondary cpus, it's happens in
> start_secondary(), which is the code run on the cpu itself as it's being
> brought up.
> 
> Unfortunately, at the moment, both of these places are after the
> respective places where the alloc pdata is called for those cpus.
> Flattening the entire x86 setup at the moment you'd see:
>   alloc_pdata for boot cpu
>   smp_store_cpu_info for boot cpu
>   for each secondary cpu
> alloc_pdata for secondary cpu
> smp_store_cpu_info for secondary cpu
> 
> scheduler_init() is called before smp_prepare_cpus() in part because of
> a dependency chain elsewhere: we cannot set up the idle domain until
> scheduler_init() is called; and other places further on in the
> initialization but before setting up cpu topology information assume
> that the idle domain has been set up.
> 
However, as you say yourself below, there is boot_cpu_data, which
stashes the info we need for the boot cpu. It gets filled after
init_idle_domain()->scheduler_init() right now, but I don't see a reason
why it can't be pulled up a bit (and from my tests, it seems to work).

That happens during system_state==SYS_STATE_boot, which also means
system_state I haven't actually tracked down this dependency yet; nor have I explored
> the possibility of populating cpu_data[] for the boot cpu earier than
> it's done now.
> 
That looks like what boot_cpu_data is for, AFAICT.

> I'm not sure exactly why we call alloc_pdata on CPU_UP_PREPARE rather
> than CPU_STARTING -- whether that's just what seemed most sensible when
> cpupools were created, or whether there's a dependency somewhere between
> those two points.
> 
Me neither, this may be worth a try.

> At the moment we deal with the fact that the topology information for a
> CPU is not available on CPU_UP_PREPARE by setting a callback that will
> call init_pcpu() on the CPU_STARTING action.
> 
> The boot cpu will not get such a callback; but information about the
> topo

Re: [Xen-devel] [PATCH 12/14] libxl: Check xc_sharing_* for proper return values.

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> If there is a negative return value - check for that and
> also use errno for the proper error value.

Was xc_sharing_freed_pages fixed earlier in the series (which is
strictly speaking a bisection hazard, but nevermind) or was the existing
check for l == -E... just buggy?

> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxl/libxl.c  |  4 ++--
>  tools/tests/mem-sharing/memshrtool.c | 12 ++--
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 94b4d59..99a99e9 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -5024,7 +5024,7 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo 
> *physinfo)
>  physinfo->scrub_pages = xcphysinfo.scrub_pages;
>  physinfo->outstanding_pages = xcphysinfo.outstanding_pages;
>  l = xc_sharing_freed_pages(ctx->xch);
> -if (l == -ENOSYS) {
> +if (l < 0 && errno == ENOSYS) {
>  l = 0;
>  } else if (l < 0) {
>  LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, l,
> @@ -5033,7 +5033,7 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo 
> *physinfo)
>  }
>  physinfo->sharing_freed_pages = l;
>  l = xc_sharing_used_frames(ctx->xch);
> -if (l == -ENOSYS) {
> +if (l < 0 && errno == ENOSYS) {
>  l = 0;
>  } else if (l < 0) {
>  LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, l,
> diff --git a/tools/tests/mem-sharing/memshrtool.c 
> b/tools/tests/mem-sharing/memshrtool.c
> index db44294..d1dd8a2 100644
> --- a/tools/tests/mem-sharing/memshrtool.c
> +++ b/tools/tests/mem-sharing/memshrtool.c
> @@ -55,11 +55,19 @@ int main(int argc, const char** argv)
>  
>  if( !strcasecmp(cmd, "info") )
>  {
> +long rc;
>  if( argc != 2 )
>  return usage(argv[0]);
>  
> -printf("used = %ld\n", xc_sharing_used_frames(xch));
> -printf("freed = %ld\n", xc_sharing_freed_pages(xch));
> +rc = xc_sharing_freed_pages(xch);
> +if ( rc < 0 )
> +return errno;

Just return 1 please.

> +
> +printf("used = %ld\n", rc);
> +rc = xc_sharing_used_frames(xch);
> +if ( rc < 0 )
> +return errno;

Likewise.

> +printf("freed = %ld\n", rc);
>  }
>  else if( !strcasecmp(cmd, "enable") )
>  {



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 14/14] libxl: Fix do_memory_op to return negative value on errors

2015-03-18 Thread Ian Campbell
On Mon, 2015-03-16 at 11:39 -0400, Konrad Rzeszutek Wilk wrote:
> instead of the -Exx values (which should go in errno).
> 
> This patch has HUGE implications.

For such a small patch!

>  There is a lot of APIs
> that are using do_memory_op. Fortunatly most of them

"Fortunately".

> check for 'if (do_memory_op(..) < 0)' so will function
> properly. However there were some which printed the return
> value to the user. They have been fixed in:
> 
>  libxl: Don't assign return value to errno for E820 get/set xc_ calls.
>  libxl: Check xc_sharing_* for proper return values.
>  libxl: Print xc_domain_decrease_reservation proper errno value.
>  libxl: If xc_domain_add_to_physmap fails, include errno value
>  libxl: Check xc_maximum_ram_page for negative return values.
>  libxl: Check xc_domain_maximum_gpfn for negative return values

s/libxl/libxc/ in most of them.

> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Ian Campbell 

> ---
>  tools/libxc/xc_private.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/libxc/xc_private.c b/tools/libxc/xc_private.c
> index 0735e23..1222d05 100644
> --- a/tools/libxc/xc_private.c
> +++ b/tools/libxc/xc_private.c
> @@ -516,7 +516,7 @@ int do_memory_op(xc_interface *xch, int cmd, void *arg, 
> size_t len)
>  {
>  DECLARE_HYPERCALL;
>  DECLARE_HYPERCALL_BOUNCE(arg, len, XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
> -long ret = -EINVAL;
> +long ret = -1;
>  
>  if ( xc_hypercall_bounce_pre(xch, arg) )
>  {



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   >