Thanks for the reply, guys.

Just wanted to point out that this is on 4.4 for me (although the issue may
also be present on master).

I have a sufficient number of IP addresses for both system and user VMs, so
that should be OK (but good thought, Punith).

I plan to continue debugging this later this afternoon, but have been in
meetings all morning.

Thanks!


On Mon, Apr 28, 2014 at 10:41 AM, Dave Scott <dave.sc...@citrix.com> wrote:

> Hi,
>
> (sorry to reply to my own email!)
>
> On 28 Apr 2014, at 11:42, Dave Scott <dave.sc...@citrix.com> wrote:
>
> >
> > Hi Mike,
> >
> > On 28 Apr 2014, at 04:44, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:
> >
> >> Hi,
> >>
> >> I recently installed 6.2 with XS62ESP1 and XS62ESP1004 (so that
> >> Xenserver625StorageProcessor would be utilized).
> >>
> >> When I create a cloud from scratch, my SSVM starts up fine, but CPVM ends
> >> up in the Paused state. I have to force a shutdown of that VM and then
> >> CloudStack restarts it and it works. This consistently happens. The system
> >> VMs are being deployed to the local storage of the one XS host I have in my
> >> one and only cluster.
> >>
> >> Any thoughts on that?
> >
> > I'm seeing the same symptom on my test cloud with 6.2 and XS62ESP1004. I
> > think there's a problem with XenAPI session and task handling in the
> > cloudstack master branch, although I've not tracked it down yet. In my
> > management server log I see:
> >
> > WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-5:ctx-47dccee1) Unable to start VM(v-2-VM) on host(1c4a31e9-469e-45c3-a0ad-9792ac7b20f6) due to You gave an invalid session reference.  It may have been invalidated by a server restart, or timed out.  You should get a new session handle, using one of the session.login_ calls.  This error does not invalidate the current connection.  The handle parameter echoes the bad value given.
> > You gave an invalid session reference.  It may have been invalidated by a server restart, or timed out.  You should get a new session handle, using one of the session.login_ calls.  This error does not invalidate the current connection.  The handle parameter echoes the bad value given.
> >        at com.xensource.xenapi.Types.checkResponse(Types.java:218)
> >        at com.xensource.xenapi.Connection.dispatch(Connection.java:395)
> >        at com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:463)
> >        at com.xensource.xenapi.Event.from(Event.java:270)
> >        at org.apache.cloudstack.hypervisor.xenserver.XenServerResourceNewBase.waitForTask(XenServerResourceNewBase.java:113)
> >        at com.cloud.hypervisor.xen.resource.CitrixResourceBase.startVM(CitrixResourceBase.java:3455)
> >
> > Somehow the XenAPI session being used by the Event.from in
> > XenServerResourceNewBase.waitForTask (used for recent 6.2 XenServers only)
> > is being logged out somewhere. When this happens, the cloudstack cleanup
> > code calls Task.cancel and Task.destroy, and then the XenServer
> > Async.VM.start fails trying to update Task.progress before it internally
> > calls VM.unpause.
> >
> > I made a hack to disable caching of Connection/sessions:
> >
> >
> > https://github.com/djs55/cloudstack/commit/a388b71279086e42710e26340df0632d0d8135e4
>
> For reference / experimentation, I've made a slightly more plausible patch:
>
>
> https://github.com/djs55/cloudstack/commit/9d40f56c6384d04a5f0fb22e5b97530c0164e0b2
>
> It catches the SESSION_INVALID in the XenServerConnection and
> transparently logs back in. This would prevent the higher level bits of the
> XenServer plugin from having to deal with sessions being expired beneath
> them.
>
> Cheers,
> Dave
>
> >
> > I suspect this now leaks Connections/sessions, but the symptom goes away.
> >
> > So far my thoughts are:
> >
> > 1. We need to find who's calling session.logout and why. This will help
> > fix the problem in the short term.
> >
> > 2. The XenServer XenAPI bindings are harder to use than they should be
> > (IMHO). In particular I think the bindings should take care of handling
> > SESSION_INVALID exceptions and re-authenticating transparently, to avoid
> > polluting the cloudstack code with rarely-used exception handlers.
> >
> > 3. The semantics of XenAPI task.destroy could be improved: instead of
> > immediately removing the task (which then seems to cause cleanup code to
> > fail randomly), it should be more like Unix waitpid with WNOHANG, i.e.
> > set a bit which says, "I'm done with this. Destroy it when you are finished
> > with it."
> >
> >
> >>
> >> Also, if I try to kick off a user VM to local storage, I get the
> >> general-purpose InsufficientCapacityException and the virtual router does
> >> not even start up.
> >
> > No idea about this one :)
> >
> > Cheers,
> > Dave
> >
> >>
> >> Can anyone create a similar cloud to what I've described here with XS 6.2,
> >> XS62ESP1, and XS62ESP1004? I re-ran this test using an XS 6.1 host and it
> >> works just fine.
> >>
> >> At the moment, this is blocking a test case I'm trying to execute to verify
> >> code I had to write in Xenserver625StorageProcessor.
> >>
> >> Thanks!
> >>
> >> --
> >> *Mike Tutkowski*
> >> *Senior CloudStack Developer, SolidFire Inc.*
> >> e: mike.tutkow...@solidfire.com
> >> o: 303.746.7302
> >> Advancing the way the world uses the
> >> cloud<http://solidfire.com/solution/overview/?video=play>
> >> *(tm)*
> >
>
>
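
For anyone following along, the "catch SESSION_INVALID and transparently log
back in" idea quoted above might look roughly like the sketch below. This is
not Dave's patch and does not use the real XenAPI bindings: Backend,
FakeBackend, SessionInvalidException and ReloginConnection are all
hypothetical names made up for illustration, and the retry-once policy is an
assumption.

```java
// Hypothetical stand-in for the XenAPI SESSION_INVALID failure (not the real binding type).
class SessionInvalidException extends RuntimeException {
    SessionInvalidException(String message) { super(message); }
}

// Hypothetical minimal transport: login() returns a session handle, call() uses one.
interface Backend {
    String login();
    String call(String session, String method);
}

// Wraps a Backend and re-authenticates once if the cached session is rejected.
class ReloginConnection {
    private final Backend backend;
    private String session;
    int logins = 0; // counts logins so the behaviour is observable

    ReloginConnection(Backend backend) {
        this.backend = backend;
        this.session = doLogin();
    }

    private String doLogin() {
        logins++;
        return backend.login();
    }

    synchronized String dispatch(String method) {
        try {
            return backend.call(session, method);
        } catch (SessionInvalidException e) {
            // Session expired or was logged out elsewhere: log in again and retry once.
            session = doLogin();
            return backend.call(session, method);
        }
    }
}

// Fake server used only to exercise the wrapper.
class FakeBackend implements Backend {
    private int counter = 0;
    private String valid = null;

    public String login() {
        valid = "session-" + (++counter);
        return valid;
    }

    public String call(String session, String method) {
        if (!session.equals(valid)) {
            throw new SessionInvalidException("SESSION_INVALID: " + session);
        }
        return method + " ok";
    }

    // Simulates someone else logging the session out.
    void expire() { valid = null; }
}

class ReloginDemo {
    public static void main(String[] args) {
        FakeBackend backend = new FakeBackend();
        ReloginConnection conn = new ReloginConnection(backend);
        System.out.println(conn.dispatch("VM.start"));
        backend.expire();
        System.out.println(conn.dispatch("VM.start")); // succeeds after a silent re-login
        System.out.println("logins: " + conn.logins);
    }
}
```

With a wrapper like this the callers never see the expired session; the
trade-off is that a second failure after the re-login still propagates.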


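Similarly, the "destroy it when you are finished with it" semantics suggested
for task.destroy in point 3 of the quoted message could be sketched as a pair
of flags. This is only an illustration of the idea under my own assumptions;
DeferredDestroyTask and its methods are hypothetical and are not part of
XenAPI.

```java
// Hypothetical task whose destruction is deferred until the work has finished.
class DeferredDestroyTask {
    private boolean finished = false;
    private boolean destroyRequested = false;
    private boolean destroyed = false;

    // Client side: "I'm done with this." Never removes a task that is still running.
    synchronized void requestDestroy() {
        destroyRequested = true;
        if (finished) {
            destroyed = true;
        }
    }

    // Server side: the operation completed; honour any pending destroy request.
    synchronized void finish() {
        finished = true;
        if (destroyRequested) {
            destroyed = true;
        }
    }

    synchronized boolean isDestroyed() {
        return destroyed;
    }
}

class DeferredDestroyDemo {
    public static void main(String[] args) {
        DeferredDestroyTask task = new DeferredDestroyTask();
        task.requestDestroy();                  // cleanup code runs early
        System.out.println(task.isDestroyed()); // false: task still in progress
        task.finish();                          // operation completes
        System.out.println(task.isDestroyed()); // true: destroyed only now
    }
}
```
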
-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the
cloud<http://solidfire.com/solution/overview/?video=play>
*(tm)*
