Hello guys,
Could someone help me solve a problem with virtual routers on ACS 4.3?
I am using Ubuntu 12.04 for both the management and host servers.
I recently upgraded from ACS 4.2.1 following the release notes. During the
upgrade I added the new system VM template, and after the upgrade I restarted
all virtual routers. The process appeared to go well, in that there were no
errors.
The next day I noticed that I am no longer able to start new virtual routers or
restart networks. I can successfully start existing virtual routers that are
in the Stopped state, but I can't start a new virtual router. For instance, the
management server log shows the following when I try to restart an
existing network:
-------------------
2014-05-07 00:11:32,069 DEBUG [c.c.a.ApiServlet] (catalina-exec-5:ctx-f0be1010)
===START=== 192.168.169.52 -- GET
command=restartNetwork&id=131e86d0-8d0b-4e9a-964d-e102511b055a&cleanup=true&response=json&sessionkey=i9vBkmoEtC2L4tAjX%2BMQQ9NzZKw%3D&_=1399417892014
2014-05-07 00:11:32,106 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(catalina-exec-5:ctx-f0be1010 ctx-d6f35608) submit async job-4913, details:
AsyncJobVO {id:4913, userId: 3, accountId: 2, instanceType: None, instanceId: null,
cmd: org.apache.cloudstack.api.command.user.network.RestartNetworkCmd, cmdInfo:
{"id":"131e86d0-8d0b-4e9a-964d-e102511b055a","response":"json","cleanup":"true","sessionkey":"i9vBkmoEtC2L4tAjX+MQQ9NzZKw\u003d","cmdEventType":"NETWORK.RESTART","ctxUserId":"3","httpmethod":"GET","_":"1399417892014","ctxAccountId":"2","ctxStartEventId":"15168"},
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
initMsid: 238402986947280, completeMsid: null, lastUpdated: null, lastPolled: null,
created: null}
2014-05-07 00:11:32,107 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(Job-Executor-2:ctx-549fa81b) Add job-4913 into job monitoring
2014-05-07 00:11:32,107 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(Job-Executor-2:ctx-549fa81b) Executing AsyncJobVO {id:4913, userId: 3,
accountId: 2, instanceType: None, instanceId: null, cmd:
org.apache.cloudstack.api.command.user.network.RestartNetworkCmd, cmdInfo:
{"id":"131e86d0-8d0b-4e9a-964d-e102511b055a","response":"json","cleanup":"true","sessionkey":"i9vBkmoEtC2L4tAjX+MQQ9NzZKw\u003d","cmdEventType":"NETWORK.RESTART","ctxUserId":"3","httpmethod":"GET","_":"1399417892014","ctxAccountId":"2","ctxStartEventId":"15168"},
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
initMsid: 238402986947280, completeMsid: null, lastUpdated: null, lastPolled: null,
created: null}
2014-05-07 00:11:32,108 DEBUG [c.c.a.ApiServlet] (catalina-exec-5:ctx-f0be1010
ctx-d6f35608) ===END=== 192.168.169.52 -- GET
command=restartNetwork&id=131e86d0-8d0b-4e9a-964d-e102511b055a&cleanup=true&response=json&sessionkey=i9vBkmoEtC2L4tAjX%2BMQQ9NzZKw%3D&_=1399417892014
2014-05-07 00:11:32,130 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Restarting network 264...
2014-05-07 00:11:32,130 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Shutting down the network id=264 as
a part of network restart
2014-05-07 00:11:32,134 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 2 port forwarding rules
for network id=264 as a part of shutdownNetworkRules
2014-05-07 00:11:32,160 DEBUG [c.c.n.e.VirtualRouterElement]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need
to apply firewall rules on the backend; virtual router doesn't exist in the
network 264
2014-05-07 00:11:32,162 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 0 static nat rules for
network id=264 as a part of shutdownNetworkRules
2014-05-07 00:11:32,162 DEBUG [c.c.n.f.FirewallManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) There are no rules to forward to the
network elements
2014-05-07 00:11:32,164 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Revoking 0 Public load balancing
rules for network id=264
2014-05-07 00:11:32,164 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) There are no Load Balancing Rules to
forward to the network elements
2014-05-07 00:11:32,166 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Revoking 0 Internal load balancing
rules for network id=264
2014-05-07 00:11:32,166 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) There are no Load Balancing Rules to
forward to the network elements
2014-05-07 00:11:32,168 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 5 firewall ingress rules
for network id=264 as a part of shutdownNetworkRules
2014-05-07 00:11:32,186 DEBUG [c.c.n.e.VirtualRouterElement]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need
to apply firewall rules on the backend; virtual router doesn't exist in the
network 264
2014-05-07 00:11:32,188 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 1 firewall egress rules
for network id=264 as a part of shutdownNetworkRules
2014-05-07 00:11:32,192 DEBUG [c.c.n.f.FirewallManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) applying default firewall egress
rules
2014-05-07 00:11:32,208 DEBUG [c.c.n.e.VirtualRouterElement]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need
to apply firewall rules on the backend; virtual router doesn't exist in the
network 264
2014-05-07 00:11:32,222 DEBUG [c.c.n.e.VirtualRouterElement]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need
to apply firewall rules on the backend; virtual router doesn't exist in the
network 264
2014-05-07 00:11:32,224 DEBUG [c.c.n.r.RulesManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Found 0 static nat rules to apply
for network id 264
2014-05-07 00:11:32,251 DEBUG [c.c.n.e.VirtualRouterElement]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need
to associate ip addresses on the backend; virtual router doesn't exist in the
network 264
2014-05-07 00:11:32,253 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Sending network shutdown to
VirtualRouter
2014-05-07 00:11:32,253 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Implementing the network
Ntwk[264|Guest|8] elements and resources as a part of network restart
2014-05-07 00:11:32,257 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Asking VirtualRouter to implemenet
Ntwk[264|Guest|8]
2014-05-07 00:11:32,260 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Lock is acquired for network id 264
as a part of router startup in
Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))]
: Dest[Zone(1)-Pod(null)-Cluster(null)-Host(null)-Storage()]
2014-05-07 00:11:32,277 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Adding nic for Virtual Router in
Guest network Ntwk[264|Guest|8]
2014-05-07 00:11:32,277 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Adding nic for Virtual Router in
Control network
2014-05-07 00:11:32,281 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Found existing network configuration
for offering [Network Offering [3-Control-System-Control-Network]:
Ntwk[202|Control|3]
2014-05-07 00:11:32,281 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing lock for
Acct[06ee8d45-65f2-11e3-9bd1-d8d38559b2d0-system]
2014-05-07 00:11:32,282 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Adding nic for Virtual Router in
Public network
2014-05-07 00:11:32,287 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Found existing network configuration
for offering [Network Offering [1-Public-System-Public-Network]:
Ntwk[200|Public|1]
2014-05-07 00:11:32,287 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing lock for
Acct[06ee8d45-65f2-11e3-9bd1-d8d38559b2d0-system]
2014-05-07 00:11:32,300 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Allocating the VR i=831 in
datacenter com.cloud.dc.DataCenterVO$$EnhancerByCGLIB$$732fb519@1with the
hypervisor type KVM
2014-05-07 00:11:32,304 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) KVM won't support system vm, skip it
2014-05-07 00:11:32,305 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Lock is released for network id 264
as a part of router startup in
Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))]
: Dest[Zone(1)-Pod(null)-Cluster(null)-Host(null)-Storage()]
2014-05-07 00:11:32,305 WARN [o.a.c.e.o.NetworkOrchestrator]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Failed to implement network
Ntwk[264|Guest|8] elements and resources as a part of network restart due to
com.cloud.exception.ResourceUnavailableException: Resource [DataCenter:1] is
unreachable: Can't find at least one running router!
at com.cloud.network.element.VirtualRouterElement.implement(VirtualRouterElement.java:192)
at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.implementNetworkElementsAndResources(NetworkOrchestrator.java:1070)
at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.restartNetwork(NetworkOrchestrator.java:2387)
at com.cloud.network.NetworkServiceImpl.restartNetwork(NetworkServiceImpl.java:1847)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at com.sun.proxy.$Proxy199.restartNetwork(Unknown Source)
at org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNetworkCmd.java:92)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
2014-05-07 00:11:32,307 WARN [c.c.n.NetworkServiceImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Network id=264 failed to restart.
2014-05-07 00:11:32,311 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(Job-Executor-2:ctx-549fa81b) Complete async job-4913, jobStatus: FAILED,
resultCode: 530, result:
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed to restart network"}
2014-05-07 00:11:32,317 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(Job-Executor-2:ctx-549fa81b) Done executing
org.apache.cloudstack.api.command.user.network.RestartNetworkCmd for job-4913
2014-05-07 00:11:32,321 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(Job-Executor-2:ctx-549fa81b) Remove job-4913 from job monitoring
2014-05-07 00:11:34,215 DEBUG [c.c.s.StatsCollector]
(StatsCollector-1:ctx-d23e62b6) HostStatsCollector is running...
----------
From the logs, the following line looks very odd to me:
2014-05-07 00:11:32,304 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl]
(Job-Executor-2:ctx-549fa81b ctx-d6f35608) KVM won't support system vm, skip it
I'm not sure what this means or what to do with this information. I've downloaded
the system VM template for the KVM hypervisor, so why is it not supported?
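In case it helps anyone reproduce this, here is a rough sketch of how the interesting lines can be pulled out of a long management server log for a single job context. It is only an illustration: the function name `suspicious_entries` is made up, and it assumes entries start with a timestamp in the format shown above and that wrapped lines belong to the preceding entry.

```python
import re

# A log entry starts with a timestamp such as "2014-05-07 00:11:32,304 ".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
# WARN or ERROR level directly after the timestamp's milliseconds.
BAD_LEVEL = re.compile(r",\d{3} (WARN|ERROR) ")


def suspicious_entries(log_text, job_ctx):
    """Return entries for one job context that are WARN/ERROR or say 'skip it'.

    Hypothetical helper: wrapped lines are re-joined with a space, and stack
    trace frames (everything from the first "at" line on) are dropped.
    """
    entries = []
    current = None
    in_trace = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if ENTRY_START.match(line):
            if current is not None:
                entries.append(" ".join(current))
            current = [line]
            in_trace = False
        elif current is not None and line:
            if line == "at" or line.startswith("at "):
                in_trace = True  # the rest of this entry is stack trace
            if not in_trace:
                current.append(line)
    if current is not None:
        entries.append(" ".join(current))

    return [
        entry
        for entry in entries
        if job_ctx in entry
        and (BAD_LEVEL.search(entry) or "skip it" in entry)
    ]
```

For example, reading the log into a string and calling `suspicious_entries(text, "ctx-549fa81b")` should return the "KVM won't support system vm, skip it" entry together with the two WARN entries from the excerpt above. The log location (I assume /var/log/cloudstack/management/management-server.log) may differ on other installations.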
Anyway, I've also followed this guide
(http://cloud.kelceydamage.com/cloudfire/blog/2013/10/08/conquering-the-cloudstack-4-2-dragon-kvm/)
and completely recreated the system VM templates. Following the steps in
Method 2, I managed to install the new system VM template using the latest 4.3
template, and I successfully recreated the console proxy and SSVM VMs. Both VMs
show VM and Agent states as Up. I've tried destroying both VMs, and they
are recreated automatically without any issues. Also, the SSVM check script,
/usr/local/cloud/systemvm/ssvm-check.sh, does not show any errors. Everything looks
good, and the secondary storage is mountable and writable.
However, I am still unable to create new virtual routers. I still get the
same error and am not sure what to do.
Thanks for any help.
Andrei