Glad to hear you fixed the issue! :)
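
For anyone who finds this thread later, here is a rough shell sketch of the manual workaround David describes in the quoted message below. The two virsh commands are taken from his steps. The function names, their arguments, and the sed-based edit of the VNC settings are my own assumptions (in particular, that the dumped XML carries single-quoted port=, listen= and address= attributes on the VNC graphics element), so check them against your own dumped XML before relying on this.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the sed-based edit are assumptions,
# not something taken from the thread or from CloudStack itself.

# Rewrite the VNC listen address and port in a dumped domain XML file.
# Args: <xml file> <destination host IP> <free VNC port on destination>
rewrite_vnc() {
    local xml="$1" dest_ip="$2" vnc_port="$3"
    # The leading space in " port=" keeps autoport='...' untouched.
    sed -i -E \
        -e "/<graphics type='vnc'/ s/ port='[^']*'/ port='${vnc_port}'/" \
        -e "/<graphics type='vnc'/ s/listen='[^']*'/listen='${dest_ip}'/" \
        -e "/<listen type='address'/ s/address='[^']*'/address='${dest_ip}'/" \
        "$xml"
}

# Dump the domain XML, patch the VNC settings, then live migrate.
# Args: <domain id or name> <destination hostname> <destination IP> <VNC port>
migrate_vm() {
    local domain="$1" dest_host="$2" dest_ip="$3" vnc_port="$4"
    local xml="${domain}.xml"
    # Step 1 from the thread: dump a migratable copy of the domain XML.
    virsh dumpxml "$domain" --migratable > "$xml" || return 1
    # Step 2: point the VNC settings at the destination host.
    rewrite_vnc "$xml" "$dest_ip" "$vnc_port" || return 1
    # Step 3: live migrate using the edited XML.
    virsh migrate --verbose --live "$domain" --xml "$xml" \
        "qemu+tcp://${dest_host}/system"
}
```

Something like `migrate_vm 38 destination.host.net 10.0.0.2 5910` would then cover all three steps in one go; the IP and port here are placeholders.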
> On Jan 31, 2018, at 7:16 AM, David Mabry <dma...@ena.com.INVALID> wrote:
>
> Mike and Wei,
>
> Good news! I was able to manually live migrate these VMs following the
> steps outlined below:
>
> 1.) virsh dumpxml 38 --migratable > 38.xml
> 2.) Change the vnc information in 38.xml to match destination host IP and
>     available VNC port
> 3.) virsh migrate --verbose --live 38 --xml 38.xml qemu+tcp://destination.host.net/system
>
> To my surprise, Cloudstack was able to discover and properly handle the
> fact that this VM was live migrated to a new host without issue. Very cool.
>
> Wei, I suspect you are correct when you said this was an issue with the
> cloudstack agent code. After digging a little deeper, I found that the
> agent never attempts to talk to libvirt at all after prepping the dxml to
> send to the destination host. I'm going to attempt to reproduce this in my
> lab and attach a remote debugger and see if I can get to the bottom of it.
>
> Thanks again for the help, guys! I really appreciate it.
>
> Thanks,
> David Mabry
>
> On 1/30/18, 9:55 AM, "David Mabry" <dma...@ena.com.INVALID> wrote:
>
> Ah, understood. I'll take a closer look at the logs and make sure that I
> didn't accidentally miss those lines when I pulled together the logs for
> this email chain.
>
> Thanks,
> David Mabry
>
> On 1/30/18, 8:34 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
>
> Hi David,
>
> I encountered the UnsupportedAnswer once before, when I made some changes
> in the kvm plugin.
>
> Normally there should be some network configurations in the agent.log,
> but I do not see them.
>
> -Wei
>
> 2018-01-30 15:00 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
>
>> Hi Wei,
>>
>> I detached the iso and received the same error. Just out of curiosity,
>> what leads you to believe it is something in the vxlan code? I guess at
>> this point, attaching a remote debugger to the agent in question might be
>> the best way to get to the bottom of what is going on.
>>
>> Thanks in advance for the help. I really, really appreciate it.
>>
>> Thanks,
>> David Mabry
>>
>> On 1/30/18, 3:30 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
>>
>> The answer should be caused by an exception in the cloudstack agent.
>> I tried to migrate a vm in our testing env, and it works.
>>
>> There are some differences between our env and yours:
>> (1) vlan VS vxlan
>> (2) no ISO VS attached ISO
>> (3) both of us use ceph and centos7.
>>
>> I suspect it is caused by the vxlan code.
>> However, could you detach the ISO and try again?
>>
>> -Wei
>>
>> 2018-01-29 19:48 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
>>
>>> Good day Cloudstack Devs,
>>>
>>> I've run across a real head scratcher. I have two VMs (initially 3 VMs,
>>> but more on that later) on a single host that I cannot live migrate to
>>> any other host in the same cluster. We discovered this after attempting
>>> to roll out patches going from CentOS 7.2 to CentOS 7.4. Initially, we
>>> thought it had something to do with the new version of libvirtd or
>>> qemu-kvm on the other hosts in the cluster preventing these VMs from
>>> migrating, but we are able to live migrate other VMs to and from this
>>> host without issue. We can even create new VMs on this specific host and
>>> live migrate them after creation with no issue. We've put the migration
>>> source agent, migration destination agent and the management server in
>>> debug and don't seem to get anything useful other than "Unsupported
>>> command". Luckily, we did have one VM that was shut down and restarted;
>>> this is the 3rd VM mentioned above. Since that VM has been restarted, it
>>> has no issues live migrating to any other host in the cluster.
>>>
>>> I'm at a loss as to what to try next and I'm hoping that someone out
>>> there might have had a similar issue and could shed some light on what
>>> to do.
>>> Obviously, I can contact the customer and have them shut down their VMs,
>>> but that will potentially just delay this problem to be solved another
>>> day. Even if shutting down the VMs is ultimately the solution, I'd still
>>> like to understand what happened to cause this issue in the first place
>>> with the hopes of preventing it in the future.
>>>
>>> Here's some information about my setup:
>>> Cloudstack 4.8 Advanced Networking
>>> CentOS 7.2 and 7.4 Hosts
>>> Ceph RBD Primary Storage
>>> NFS Secondary Storage
>>> Instance in Question for Debug: i-532-1392-NSVLTN
>>>
>>> I have attached relevant debug logs to this email if anyone wishes to
>>> take a look. I think the most interesting error message that I have
>>> received is the following:
>>>
>>> 468390:2018-01-27 08:59:35,172 DEBUG [c.c.a.t.Request] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Seq 22-942378222027276319: Received: { Ans: , MgmtId: 14038012703634, via: 22(csh02c01z01.nsvltn.ena.net), Ver: v1, Flags: 110, { UnsupportedAnswer } }
>>> 468391:2018-01-27 08:59:35,172 WARN [c.c.a.m.AgentManagerImpl] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Unsupported Command: Unsupported command issued: com.cloud.agent.api.PrepareForMigrationCommand. Are you sure you got the right type of server?
>>> 468392:2018-01-27 08:59:35,179 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:22] is unreachable: Host 22: Unable to prepare for migration due to Unsupported command issued: com.cloud.agent.api.PrepareForMigrationCommand. Are you sure you got the right type of server?
>>> 468393:2018-01-27 08:59:35,179 INFO [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Rethrow exception com.cloud.exception.AgentUnavailableException: Resource [Host:22] is unreachable: Host 22: Unable to prepare for migration due to Unsupported command issued: com.cloud.agent.api.PrepareForMigrationCommand. Are you sure you got the right type of server?
>>>
>>> I've tracked this "Unsupported command" down in the CS 4.8 code to
>>> cloudstack/api/src/com/cloud/agent/api/Answer.java which is the generic
>>> answer class. I believe where the error is really being spawned from is
>>> cloudstack/engine/orchestration/src/com/cloud/vm/VirtualMachineManagerImpl.java.
>>> Specifically:
>>>
>>>     Answer pfma = null;
>>>     try {
>>>         pfma = _agentMgr.send(dstHostId, pfmc);
>>>         if (pfma == null || !pfma.getResult()) {
>>>             final String details = pfma != null ? pfma.getDetails() : "null answer returned";
>>>             final String msg = "Unable to prepare for migration due to " + details;
>>>             pfma = null;
>>>             throw new AgentUnavailableException(msg, dstHostId);
>>>         }
>>>
>>> The pfma returned must be in error or is never returned and therefore
>>> still null. That answer appears that it should be coming from the
>>> destination agent, but for the life of me I can't figure out what the
>>> root cause of this error is beyond, "Unsupported command issued". What
>>> command is unsupported? My guess is that it could be something wrong
>>> with the dxml that is generated and passed to the destination host, but
>>> I have as yet been unable to catch that dxml in debug.
>>>
>>> Any help or guidance is greatly appreciated.
>>>
>>> Thanks,
>>> David Mabry