Thank you, Kirk.
As a preface to my response: a virtual router is automatically created whenever 
I try to create my first virtual machine. I have not yet been able to add an 
instance successfully. The virtual router created when I make an instance is 
the VR I am having problems with.

In answering your questions,
1. The virtual router goes into the Starting state and never makes it to 
Running. It errors out and then goes from Starting to Stopping. I did see some 
suspicious lines in the log file:
2013-08-23 18:47:58,361 DEBUG [xen.resource.CitrixResourceBase] 
(DirectAgent-9:null) Trying to connect to 169.254.3.124
2013-08-23 18:47:59,178 DEBUG [xen.resource.CitrixResourceBase] 
(DirectAgent-9:null) Ping command port succeeded for vm r-11-VM
2013-08-23 18:47:59,561 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-9:null) Seq 1-585236521: Cancelling because one of the answers is 
false and it is stop on error.

Do you think the problem with my router not starting up is related to the "stop 
on error" part of those log lines? A few lines later, the log tells me the 
guru did not like the answers, so it is stopping the router:

2013-08-23 18:47:59,623 INFO  [cloud.vm.VirtualMachineManagerImpl] 
(Job-Executor-11:job-31) The guru did not like the answers so stopping 
VM[DomainRouter|r-11-VM]
2013-08-23 18:47:59,626 DEBUG [agent.transport.Request] 
(Job-Executor-11:job-31) Seq 1-585236524: Sending  { Cmd , MgmtId: 
166316981724, via: 1, Ver: v1, Flags: 100111, 
[{"StopCommand":{"isProxy":false,"vmName":"r-11-VM","wait":0}}] }
2013-08-23 18:47:59,627 DEBUG [agent.transport.Request] 
(Job-Executor-11:job-31) Seq 1-585236524: Executing:  { Cmd , MgmtId: 
166316981724, via: 1, Ver: v1, Flags: 100111, 
[{"StopCommand":{"isProxy":false,"vmName":"r-11-VM","wait":0}}] }
2013-08-23 18:47:59,627 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-22:null) Seq 1-585236524: Executing request
2013-08-23 18:47:59,774 DEBUG [xen.resource.CitrixResourceBase] 
(DirectAgent-22:null) 9. The VM r-11-VM is in Stopping state
2013-08-23 18:48:00,101 INFO  [xen.resource.CitrixResourceBase] 
(DirectAgent-22:null) Removed  network rules for vm r-11-VM

Later it says it is in stopped state and that stopping succeeded:
2013-08-23 18:48:08,312 DEBUG [xen.resource.CitrixResourceBase] 
(DirectAgent-22:null) 10. The VM r-11-VM is in Stopped state
2013-08-23 18:48:08,313 DEBUG [agent.manager.AgentManagerImpl] 
(Job-Executor-11:job-31) Details from executing class 
com.cloud.agent.api.StopCommand: Stop VM r-11-VM Succeed

Then it tells me the error was in finalizeStart:
2013-08-23 18:48:08,313 ERROR [cloud.vm.VirtualMachineManagerImpl] 
(Job-Executor-11:job-31) Failed to start instance VM[DomainRouter|r-11-VM]
com.cloud.utils.exception.ExecutionException: Unable to start 
VM[DomainRouter|r-11-VM] due to error in finalizeStart, not retrying
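
In the excerpt Ahmad quoted further down this thread, the "guru did not like 
the answers" line is preceded by a WARN from the same job that names the 
failing command. A minimal Python sketch for pulling such WARN/ERROR lines out 
of a copy of management-server.log could look like the following. The record 
layout is assumed from the excerpts in this thread, and the function names are 
made up for illustration; nothing here is part of CloudStack.

```python
import re

# Assumed record layout, taken from the excerpts in this thread:
#   <date> <time>,<ms> LEVEL [logger] (thread:job) message
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
RECORD_HEAD = re.compile(
    r"^(?P<ts>\S+ \S+)\s+(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^:)]+):(?P<job>[^)]+)\)\s*(?P<msg>.*)$"
)

# Two records from this thread, wrapped the way the mail client wrapped them.
MGMT_SAMPLE = """\
2013-08-23 17:04:53,983 WARN
 [network.router.VirtualNetworkApplianceManagerImpl]
(Job-Executor-1:job-18) Unable to get the template/scripts version of
router r-6-VM due to: getDomRVersionCmd failed
2013-08-23 17:04:53,984 INFO  [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-1:job-18) The guru did not like the answers so stopping
VM[DomainRouter|r-6-VM]
"""


def join_records(text):
    """Merge wrapped continuation lines back into one string per record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def warnings_before_guru_stop(text):
    """For each 'guru did not like the answers' record, collect the earlier
    WARN/ERROR messages logged by the same job."""
    parsed = []
    for record in join_records(text):
        match = RECORD_HEAD.match(record)
        if match:
            parsed.append(match.groupdict())
    results = []
    for index, record in enumerate(parsed):
        if "The guru did not like the answers" in record["msg"]:
            earlier = [
                p["msg"]
                for p in parsed[:index]
                if p["job"] == record["job"] and p["level"] in ("WARN", "ERROR")
            ]
            results.append((record["job"], earlier))
    return results


for job, messages in warnings_before_guru_stop(MGMT_SAMPLE):
    print(job, messages)
```

To run it against a real log, the text of the file would be passed in place of 
MGMT_SAMPLE. Lines logged by the DirectAgent threads carry "null" instead of a 
job id, so they would not be matched to a job by this sketch.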

Where can I see the finalizeStart method and the part of it that is failing? 

2. I checked /var/log/SMlog on the XenServer host and did not find any lines 
that contained "getDomRVersion". How would I know if any of the entries are 
related to the getDomRVersion script?
I did notice that when I tried to start the VR, the SMlog showed a failure 
after running router_proxy.sh with get_template_version.sh (shown in the log 
below):
[31327] 2013-08-24 16:14:20.346551      #### VMOPS enter  routerProxy ####
[31327] 2013-08-24 16:14:20.346675      ['/bin/bash', 
'/opt/xensource/bin/router_proxy.sh', 'get_template_version.sh', 
'169.254.3.168']
[31327] 2013-08-24 16:14:20.464558      FAILED in util.pread: (rc 255) stdout: 
'', stderr: ''
[31327] 2013-08-24 16:14:20.464766      routerProxy command 
get_template_version.sh 169.254.3.168 failed 
[31327] 2013-08-24 16:14:20.464872      #### VMOPS exit  routerProxy ####

3. I checked for the /root/.ssh/id_rsa.cloud file on the XenServer host and it 
did exist.

4. I tried forcing a reconnect through the UI, and that did not solve the 
problem.
5. I tried unmanaging and re-managing the cluster (I had to put my host in 
maintenance mode first).
6. I tried clearing the host tags.

None of this solved my problem. I did see some entries in the SMlog that made 
me wonder if vhd-util needs to be in my primary or secondary storage directory. 
To your knowledge, does it need to be? I thought vhd-util only needed to be in 
/usr/bin and /opt/xensource/bin. Is there somewhere else it needs to be?
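
To go through SMlog entries like the routerProxy failure under point 2 more 
systematically, a minimal Python sketch could list every router_proxy.sh call 
that is followed by a "FAILED in util.pread" entry from the same process id. 
The entry layout is assumed from the excerpt above, and the function names are 
made up for illustration.

```python
import re

# Assumed entry layout, taken from the SMlog excerpt in this message:
#   [pid] <date> <time>      message
ENTRY = re.compile(r"^\[(?P<pid>\d+)\] (?P<ts>\S+ \S+)\s+(?P<msg>.*)$")
PREAD_FAILURE = re.compile(r"FAILED in util\.pread: \(rc (\d+)\)")

# The excerpt from point 2, wrapped the way the mail client wrapped it.
SM_SAMPLE = """\
[31327] 2013-08-24 16:14:20.346551      #### VMOPS enter  routerProxy ####
[31327] 2013-08-24 16:14:20.346675      ['/bin/bash',
'/opt/xensource/bin/router_proxy.sh', 'get_template_version.sh',
'169.254.3.168']
[31327] 2013-08-24 16:14:20.464558      FAILED in util.pread: (rc 255) stdout:
'', stderr: ''
[31327] 2013-08-24 16:14:20.464766      routerProxy command
get_template_version.sh 169.254.3.168 failed
[31327] 2013-08-24 16:14:20.464872      #### VMOPS exit  routerProxy ####
"""


def join_entries(text):
    """Merge wrapped continuation lines back into one string per entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ENTRY.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def failed_router_proxy_calls(text):
    """Return (timestamp, command, rc) for each router_proxy.sh call that is
    followed by a util.pread failure from the same pid."""
    failures = []
    pending = {}  # pid -> most recent router_proxy.sh command line
    for entry in join_entries(text):
        match = ENTRY.match(entry)
        if not match:
            continue
        pid, msg = match.group("pid"), match.group("msg")
        if msg.startswith("[") and "router_proxy.sh" in msg:
            pending[pid] = msg
        failure = PREAD_FAILURE.search(msg)
        if failure and pid in pending:
            failures.append(
                (match.group("ts"), pending.pop(pid), int(failure.group(1)))
            )
    return failures


for timestamp, command, rc in failed_router_proxy_calls(SM_SAMPLE):
    print(timestamp, rc, command)
```

The command printed for each failure contains the same arguments that Kirk 
suggested looking up in SMlog before running the script manually with bash -x.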


I am starting to wonder if I should reinstall everything and use KVM with 
Libvirt. I really want to use Xen and CloudStack. This problem with my 
instances and the virtual router really is an impasse for my project.

Thank you for your help, Kirk.

Best,

Kent Johnson
University of Utah
Graduate Student
MSIS Program
-----Original Message-----
From: Kirk Kosinski [mailto:[email protected]] 
Sent: Friday, August 23, 2013 8:35 PM
To: [email protected]
Cc: Ahmad Emneina
Subject: Re: Cs 4.1.0 plus Xen 6.1 No Instances or VR's starting up

Does the virtual router start on XenServer and stop after a few seconds or 
minutes?  Or does it never start at all?

Check /var/log/SMlog on the XenServer host for any entries or
(especially) errors related to the getDomRVersion script.  If there are no 
useful errors, find the script arguments in SMlog and try running it manually 
with bash -x to see where it is failing.

One potential cause is a missing /root/.ssh/id_rsa.cloud on the XenServer host, 
so confirm it exists.  Besides that, some general steps that might clear up the 
problem include:
1. Force reconnect the host (CS API or UI).
2. Unmanage and re-manage the cluster (CS API or UI).
3. Unmanage cluster / clear host tags (xe host-param-clear uuid=host_id
param-name=tags) / re-manage cluster

Best regards,
Kirk


On 08/23/2013 04:24 PM, Ahmad Emneina wrote:
> looks like your management server reaches out to the router but barfs 
> out
> here:
> 2013-08-23 17:04:53,938 DEBUG [agent.transport.Request]
> (Job-Executor-1:job-18) Seq 1-1363673129: Received:  { Ans: , MgmtId:
> 166316981724, via: 1, Ver: v1, Flags: 110, { StartAnswer, 
> CheckSshAnswer, GetDomRVersionAnswer } }
> 2013-08-23 17:04:53,983 WARN
>  [network.router.VirtualNetworkApplianceManagerImpl]
> (Job-Executor-1:job-18) Unable to get the template/scripts version of 
> router r-6-VM due to: getDomRVersionCmd failed
> 2013-08-23 17:04:53,984 INFO  [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-1:job-18) The guru did not like the answers so stopping 
> VM[DomainRouter|r-6-VM]
> 2013-08-23 17:04:53,988 DEBUG [agent.transport.Request]
> (Job-Executor-1:job-18) Seq 1-1363673130: Sending  { Cmd , MgmtId:
> 166316981724, via: 1, Ver: v1, Flags: 100111, 
> [{"StopCommand":{"isProxy":false,"vmName":"r-6-VM","wait":0}}] }
> 
> so the good news is your network config looks good, i just wonder if 
> youre using the right template or if the system vm isnt patched properly.
> 
> 
> On Fri, Aug 23, 2013 at 4:16 PM, Kent Johnson <[email protected]> wrote:
> 
>> Thank you Marty and Ahmad.
>>
>> Here is my management-server.log log: http://pastebin.com/TnRpsB8j
>>
>> Here is my catalina.out log: http://pastebin.com/index
>>
>> Here is my xensource.log from my xenserver host:
>> http://pastebin.com/RC6LhXQ4
>>
>> Best,
>>
>> Kent Johnson
>>
>> -----Original Message-----
>> From: Ahmad Emneina [mailto:[email protected]]
>> Sent: Friday, August 23, 2013 5:00 PM
>> To: Cloudstack users mailing list
>> Subject: Re: Cs 4.1.0 plus Xen 6.1 No Instances or VR's starting up
>>
>> probably best to post your logs to pastebin and have us sift over them.
>> see if we can spot anything.
>>
>>
>> On Fri, Aug 23, 2013 at 3:53 PM, Kent Johnson <[email protected]>
>> wrote:
>>
>>> Can anyone point me in the right direction on solving my instance 
>>> startup problem? They won't start up, nor will my virtual router.
>>>
>>> I am using CloudStack 4.1.0 on CentOS for my Management Server and 
>>> XenServer 6.1 for my VM host.
>>>
>>> I installed CS as per the documentation and have successfully added 
>>> a zone, pod, cluster, host, primary, and secondary storage.
>>> My System VM's start up and run correctly. My systemVM template and 
>>> the default CentOS templates download correctly.
>>> I can create instances but they always try to start up and then fail 
>>> and end up in "Error" state. My Virtual Router tries to start up But 
>>> it always fails.
>>>
>>> Are there any gotchas or quick suggestions anyone could give me to 
>>> help me understand how to get my instances to run and possibly my
>> virtual router?
>>> I can provide more details from the logs if needed.
>>>
>>> Kent Johnson
>>> University of Utah
>>> Graduate Student
>>>
>>>
>>
> 
