Re: CS 4.8 KVM VMs will not live migrate

2018-01-30 Thread Wei ZHOU
That answer (UnsupportedAnswer) is most likely caused by an exception in the CloudStack agent.
I tried to migrate a VM in our test environment, and it works.

There are some differences between our environment and yours:
(1) VLAN vs. VXLAN
(2) no ISO vs. attached ISO
(3) both of us use Ceph and CentOS 7.

I suspect it is caused by the VXLAN code.
However, could you detach the ISO and try again?
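
To illustrate how an exception inside the agent could show up on the management server only as an "unsupported command" reply, here is a minimal sketch. Every name in it (AgentDispatchSketch, Command, Answer, CommandHandler) is a hypothetical stand-in, not the real CloudStack class, and it assumes the agent's dispatcher catches all exceptions from a handler and answers with a generic failure. If the agent behaves like this, the real cause is only visible where the exception is caught, so that is the place to log it or set a breakpoint.

```java
import java.util.HashMap;
import java.util.Map;

public class AgentDispatchSketch {
    /** Hypothetical stand-in for a command sent to the agent. */
    static class Command {
        final String name;
        Command(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the agent's reply. */
    static class Answer {
        final boolean result;
        final String details;
        Answer(boolean result, String details) {
            this.result = result;
            this.details = details;
        }
    }

    /** Hypothetical handler interface; a handler may throw anything. */
    interface CommandHandler {
        Answer handle(Command cmd) throws Exception;
    }

    private final Map<String, CommandHandler> handlers = new HashMap<>();
    // Last exception swallowed by dispatch(), kept so it can be logged or inspected.
    private Exception lastError;

    void register(String name, CommandHandler handler) {
        handlers.put(name, handler);
    }

    /** Assumed behaviour: any failure is reported as a generic "unsupported" answer. */
    Answer dispatch(Command cmd) {
        try {
            CommandHandler handler = handlers.get(cmd.name);
            if (handler == null) {
                throw new IllegalStateException("no handler for " + cmd.name);
            }
            return handler.handle(cmd);
        } catch (Exception e) {
            lastError = e; // without this (or a log line) the real cause is lost
            return new Answer(false, "Unsupported command issued: " + cmd.name);
        }
    }

    Exception getLastError() {
        return lastError;
    }

    public static void main(String[] args) {
        AgentDispatchSketch agent = new AgentDispatchSketch();
        // The handler exists but fails; the failure reason is made up for the example.
        agent.register("PrepareForMigrationCommand", cmd -> {
            throw new IllegalStateException("network setup failed (made-up cause)");
        });
        Answer answer = agent.dispatch(new Command("PrepareForMigrationCommand"));
        System.out.println(answer.result + " / " + answer.details);
        System.out.println("hidden cause: " + agent.getLastError().getMessage());
    }
}
```

In this sketch the caller sees the same reply whether the handler is missing or the handler threw, which would explain why the management server log alone does not say what went wrong.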

-Wei



2018-01-29 19:48 GMT+01:00 David Mabry :

> Good day Cloudstack Devs,
>
> I've run across a real head scratcher.  I have two VMs, (initially 3 VMs,
> but more on that later) on a single host, that I cannot live migrate to any
> other host in the same cluster.  We discovered this after attempting to
> roll out patches going from CentOS 7.2 to CentOS 7.4.  Initially, we
> thought it had something to do with the new version of libvirtd or qemu-kvm
> on the other hosts in the cluster preventing these VMs from migrating, but
> we are able to live migrate other VMs to and from this host without issue.
> We can even create new VMs on this specific host and live migrate them
> after creation with no issue.  We've put the migration source agent,
> migration destination agent and the management server in debug and don't
> seem to get anything useful other than "Unsupported command".  Luckily, we
> did have one VM that was shutdown and restarted, this is the 3rd VM
> mentioned above.  Since that VM has been restarted, it has no issues live
> migrating to any other host in the cluster.
>
> I'm at a loss as to what to try next and I'm hoping that someone out there
> might have had a similar issue and could shed some light on what to do.
> Obviously, I can contact the customer and have them shutdown their VMs, but
> that will potentially just delay this problem to be solved another day.
> Even if shutting down the VMs is ultimately the solution, I'd still like to
> understand what happened to cause this issue in the first place with the
> hopes of preventing it in the future.
>
> Here's some information about my setup:
> Cloudstack 4.8 Advanced Networking
> CentOS 7.2 and 7.4 Hosts
> Ceph RBD Primary Storage
> NFS Secondary Storage
> Instance in Question for Debug: i-532-1392-NSVLTN
>
> I have attached relevant debug logs to this email if anyone wishes to take
> a look.  I think the most interesting error message that I have received is
> the following:
>
> 468390:2018-01-27 08:59:35,172 DEBUG [c.c.a.t.Request]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Seq 22-942378222027276319: Received:  { Ans: , MgmtId:
> 14038012703634, via: 22(csh02c01z01.nsvltn.ena.net), Ver: v1, Flags: 110,
> { UnsupportedAnswer } }
> 468391:2018-01-27 08:59:35,172 WARN  [c.c.a.m.AgentManagerImpl]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Unsupported Command: Unsupported command issued:
> com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure you got the
> right type of server?
> 468392:2018-01-27 08:59:35,179 ERROR [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Invocation exception, caused by: 
> com.cloud.exception.AgentUnavailableException:
> Resource [Host:22] is unreachable: Host 22: Unable to prepare for migration
> due to Unsupported command issued: 
> com.cloud.agent.api.PrepareForMigrationCommand.
> Are you sure you got the right type of server?
> 468393:2018-01-27 08:59:35,179 INFO  [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Rethrow exception 
> com.cloud.exception.AgentUnavailableException:
> Resource [Host:22] is unreachable: Host 22: Unable to prepare for migration
> due to Unsupported command issued: 
> com.cloud.agent.api.PrepareForMigrationCommand.
> Are you sure you got the right type of server?
>
> I've tracked this "Unsupported command" down in the CS 4.8 code to
> cloudstack/api/src/com/cloud/agent/api/Answer.java which is the generic
> answer class.  I believe where the error is really being spawned from is
> cloudstack/engine/orchestration/src/com/cloud/vm/VirtualMachineManagerImpl.java.
> Specifically:
>
> Answer pfma = null;
> try {
>     pfma = _agentMgr.send(dstHostId, pfmc);
>     if (pfma == null || !pfma.getResult()) {
>         final String details = pfma != null ? pfma.getDetails() : "null answer returned";
>         final String msg = "Unable to prepare for migration due to " + details;
>         pfma = null;
>         throw new AgentUnavailableException(msg, dstHostId);
>     }
>
> The pfma returned must be in error, or is never returned and is therefore
> still null.  It appears that the answer should be coming from the
> destination agent, but for the life of me I can't figure out what the root
> cause of this error is beyond, "Unsupported command issued".  What command
> is unsupported?  My guess is that it could be something wro

Re: CS 4.8 KVM VMs will not live migrate

2018-01-30 Thread David Mabry
Hi Wei,

I detached the iso and received the same error.  Just out of curiosity, what 
leads you to believe it is something in the vxlan code?  I guess at this point, 
attaching a remote debugger to the agent in question might be the best way to 
get to the bottom of what is going on.
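
For reference, the check from VirtualMachineManagerImpl.java quoted in the original report can be reduced to the sketch below. The classes here (PrepareCheckSketch, Answer, AgentUnavailableException) are simplified, hypothetical stand-ins for the CloudStack ones, and the exception message does not reproduce the exact "Resource [Host:22] is unreachable" format from the log. What it shows is that the management server only forwards the details string carried by the answer, so the "Unsupported command issued" text is produced by whatever built that answer, not by this check.

```java
public class PrepareCheckSketch {
    /** Simplified, hypothetical stand-in for the agent's answer. */
    static class Answer {
        private final boolean result;
        private final String details;
        Answer(boolean result, String details) {
            this.result = result;
            this.details = details;
        }
        boolean getResult() { return result; }
        String getDetails() { return details; }
    }

    /** Simplified, hypothetical stand-in for the exception thrown on failure. */
    static class AgentUnavailableException extends Exception {
        final long hostId;
        AgentUnavailableException(String msg, long hostId) {
            super(msg);
            this.hostId = hostId;
        }
    }

    /** Mirrors the quoted check: a null or failed answer aborts the migration. */
    static void checkPrepareAnswer(Answer pfma, long dstHostId) throws AgentUnavailableException {
        if (pfma == null || !pfma.getResult()) {
            final String details = pfma != null ? pfma.getDetails() : "null answer returned";
            final String msg = "Unable to prepare for migration due to " + details;
            throw new AgentUnavailableException(msg, dstHostId);
        }
    }

    public static void main(String[] args) {
        // A failed answer whose details were filled in by the sender.
        Answer unsupported = new Answer(false,
                "Unsupported command issued: com.cloud.agent.api.PrepareForMigrationCommand");
        try {
            checkPrepareAnswer(unsupported, 22L);
        } catch (AgentUnavailableException e) {
            System.out.println("host " + e.hostId + ": " + e.getMessage());
        }
    }
}
```

Both failure paths (no answer at all, or an answer with a false result) end in the same exception type, so the details text is the only thing that tells them apart.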

Thanks in advance for the help.  I really, really appreciate it.

Thanks,
David Mabry


Re: CS 4.8 KVM VMs will not live migrate

2018-01-30 Thread Wei ZHOU
Hi David,

I encountered the UnsupportedAnswer once before, when I made some changes in
the KVM plugin.

Normally there should be some network configuration entries in the agent.log,
but I do not see them.

-Wei



[NOTICE] Remove branches 4.10.0.0-RC* from Apache CloudStack official repository

2018-01-30 Thread Rafael Weingärtner
Following the protocol defined in [1], this is the notice email regarding
the removal of 4.10.0.0-RC* branches from Apache CloudStack official
repository. The Jira ticket for the branches removal is
https://issues.apache.org/jira/browse/CLOUDSTACK-10258. The branches that
will be removed are the following:


   - 4.10.0.0-RC20170301T0634
   - 4.10.0.0-RC20170509T1030
   - 4.10.0.0-RC20170607T1407
   - 4.10.0.0-RC20170609T1354
   - 4.10.0.0-RC20170620T1023
   - 4.10.0.0-RC20170626T1011
   - 4.10.0.0-RC20170703T1006

If you have objections, please do share your concerns before the deletion. The
removal will happen on 09/Feb/18.

[1]
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Clean+up+old+and+obsolete+branches+protocol
-- 
Rafael Weingärtner


Re: CS 4.8 KVM VMs will not live migrate

2018-01-30 Thread David Mabry
Ah, understood.  I'll take a closer look at the logs and make sure that I 
didn't accidentally miss those lines when I pulled together the logs for this 
email chain.

Thanks,
David Mabry