Re: Emergency: Cloud NOT starting

Maurice Lawler Sat, 13 Apr 2013 10:45:40 -0700

Thank you.

The FSCK was already completed during boot up; it was forced. However, how can 
I access the VMs while they are in the Starting state to see if they need a FSCK?


The agent log is currently showing this:


2013-04-13 12:35:09,989 INFO  [cloud.agent.Agent] (AgentShutdownThread:null) 
Stopping the agent: Reason = sig.kill
2013-04-13 12:37:32,244 INFO  [utils.component.ComponentLocator] (main:null) 
Unable to find components.xml
2013-04-13 12:37:32,285 INFO  [utils.component.ComponentLocator] (main:null) 
Skipping configuration using components.xml
2013-04-13 12:37:32,285 INFO  [cloud.agent.AgentShell] (main:null) 
Implementation Version is 4.0.1.20130201075054
2013-04-13 12:37:32,286 INFO  [cloud.agent.AgentShell] (main:null) 
agent.properties found at /etc/cloud/agent/agent.properties
2013-04-13 12:37:32,287 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
to using properties file for storage
2013-04-13 12:37:32,289 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
to the constant time backoff algorithm
2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1
2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase] (main:null) 
Nics are not configured!
2013-04-13 12:37:32,420 ERROR [cloud.agent.AgentShell] (main:null) Unable to 
start agent: Private NIC is not configured
2013-04-13 12:42:30,653 INFO  [utils.component.ComponentLocator] (main:null) 
Unable to find components.xml
2013-04-13 12:42:30,654 INFO  [utils.component.ComponentLocator] (main:null) 
Skipping configuration using components.xml
2013-04-13 12:42:30,654 INFO  [cloud.agent.AgentShell] (main:null) 
Implementation Version is 4.0.1.20130201075054
2013-04-13 12:42:30,655 INFO  [cloud.agent.AgentShell] (main:null) 
agent.properties found at /etc/cloud/agent/agent.properties
2013-04-13 12:42:30,656 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
to using properties file for storage
2013-04-13 12:42:30,658 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
to the constant time backoff algorithm
2013-04-13 12:42:30,721 INFO  [cloud.agent.Agent] (main:null) id is 1
2013-04-13 12:42:30,820 INFO  [resource.virtualnetwork.VirtualRoutingResource] 
(main:null) VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
2013-04-13 12:42:32,094 INFO  [kvm.resource.LibvirtComputingResource] 
(main:null) No libvirt.vif.driver specififed. Defaults to BridgeVifDriver.
2013-04-13 12:42:32,147 INFO  [cloud.agent.Agent] (main:null) Agent [id = 1 : 
type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 : host = 
96.31.67.232 : port = 8250
2013-04-13 12:42:32,154 INFO  [utils.nio.NioClient] (Agent-Selector:null) 
Connecting to myipaddress:8250
2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] (Agent-Selector:null) SSL: 
Handshake done
2013-04-13 12:42:32,599 INFO  [cloud.serializer.GsonHelper] 
(Agent-Handler-1:null) Default Builder inited.
2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) 
Proccess agent startup answer, agent id = 1
2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Set 
agent id 1
2013-04-13 12:42:32,808 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) 
Startup Response Received: agent id = 1
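
Since the mail client wrapped the log lines above, the ERROR entries are easy to miss. Here is a minimal sketch for rejoining the wrapped lines and keeping only entries at one level. It assumes every entry starts with a timestamp like the ones above; the function names join_entries and entries_at_level are made up for illustration and are not part of CloudStack.

```python
import re

# Assumption: a log entry starts with a timestamp such as "2013-04-13 12:37:32,418".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s")


def join_entries(lines):
    """Rejoin mail-wrapped lines into one string per log entry."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if ENTRY_START.match(line) or not entries:
            entries.append(line)  # a timestamp begins a new entry
        else:
            entries[-1] += " " + line  # continuation of the previous entry
    return entries


def entries_at_level(entries, level="ERROR"):
    """Keep entries whose third whitespace-separated field is the level."""
    return [e for e in entries if e.split()[2:3] == [level]]
```

Run over the agent log excerpt above, this should leave just the two 12:37:32 entries: "Nics are not configured!" and "Unable to start agent: Private NIC is not configured".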


The management log says this: 

2013-04-13 12:43:28,952 DEBUG [cloud.network.NetworkManagerImpl] 
(secstorage-1:null) Lock is released for network id 201 as a part of network 
implement
2013-04-13 12:43:28,969 DEBUG [db.Transaction.Transaction] (secstorage-1:null) 
Rolling back the transaction: Time = 1 Name =  
-SystemVmLoadScanner$1.run:71-Executors$RunnableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecutor.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:679;
 called by 
-Transaction.rollback:887-DataCenterIpAddressDaoImpl.takeIpAddress:57-DatabaseCallback.intercept:34-DataCenterDaoImpl.allocatePrivateIpAddress:228-DatabaseCallback.intercept:34-PodBasedNetworkGuru.reserve:119-NetworkManagerImpl.prepareNic:2143-NetworkManagerImpl.prepare:2113-VirtualMachineManagerImpl.advanceStart:752-VirtualMachineManagerImpl.start:472-VirtualMachineManagerImpl.start:465-SecondaryStorageManagerImpl.startSecStorageVm:257
2013-04-13 12:43:28,970 INFO  [cloud.vm.VirtualMachineManagerImpl] 
(secstorage-1:null) Insufficient capacity
com.cloud.exception.InsufficientAddressCapacityException: Unable to get a 
management ip addressScope=interface com.cloud.dc.Pod; id=1
        at 
com.cloud.network.guru.PodBasedNetworkGuru.reserve(PodBasedNetworkGuru.java:121)
        at 
com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2143)
        at 
com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2113)
        at 
com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:752)
        at 
com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:472)
        at 
com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:465)
        at 
com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:257)
        at 
com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:684)
        at 
com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1310)
        at 
com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:119)
        at 
com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
        at 
com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:106)
        at 
com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:34)
        at 
com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:83)
        at com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:73)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at 
java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)
2013-04-13 12:43:28,973 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
(secstorage-1:null) Cleaning up resources for the vm 
VM[SecondaryStorageVm|s-588-VM] in Starting state
2013-04-13 12:43:28,975 DEBUG [agent.transport.Request] (secstorage-1:null) Seq 
1-751304715: Waiting for Seq 751304714 Scheduling:  { Cmd , MgmtId: 
219948120943996, via: 1, Ver: v1, Flags: 100111, 
[{"StopCommand":{"isProxy":false,"vmName":"s-588-VM","wait":0}}] }
2013-04-13 12:43:29,186 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterStatusMonitor-1:null) Found 0 routers.
2013-04-13 12:43:37,927 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-14:null) Ping from 1
2013-04-13 12:43:43,240 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-3:null) VmStatsCollector is running...
2013-04-13 12:43:43,323 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-3:null) StorageCollector is running...
2013-04-13 12:43:43,327 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-3:null) There is no secondary storage VM for secondary storage 
host nfs://96.31.67.232/secondary
2013-04-13 12:43:43,400 DEBUG [agent.transport.Request] (StatsCollector-3:null) 
Seq 1-751304716: Received:  { Ans: , MgmtId: 219948120943996, via: 1, Ver: v1, 
Flags: 10, { GetStorageStatsAnswer } }
2013-04-13 12:43:43,936 DEBUG [cloud.server.StatsCollector] 
(StatsCollector-3:null) HostStatsCollector is running...
2013-04-13 12:43:44,545 DEBUG [agent.transport.Request] (StatsCollector-3:null) 
Seq 1-751304717: Received:  { Ans: , MgmtId: 219948120943996, via: 1, Ver: v1, 
Flags: 10, { GetHostStatsAnswer } }
2013-04-13 12:43:58,231 DEBUG [cloud.server.ManagementServerImpl] 
(EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT 2013
2013-04-13 12:43:58,233 DEBUG [cloud.server.ManagementServerImpl] 
(EventChecker-1:null) Found 0 events to be purged
2013-04-13 12:43:58,235 DEBUG [cloud.server.ManagementServerImpl] 
(EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT 2013
2013-04-13 12:43:58,238 DEBUG [cloud.server.ManagementServerImpl] 
(EventChecker-1:null) Found 0 events to be purged
2013-04-13 12:43:59,186 DEBUG 
[network.router.VirtualNetworkApplianceManagerImpl] 
(RouterStatusMonitor-1:null) Found 0 routers.
[root@lunder agent]#
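
To find the exception buried in a DEBUG-heavy management log like the one above, a rough sketch along these lines may help. It assumes exceptions appear at the start of a line as a dotted Java class name ending in "Exception" followed by a colon, as in the excerpt; find_exceptions is a made-up helper name, not a CloudStack tool.

```python
import re

# Assumption: an exception line looks like "com.example.SomeException: message".
EXCEPTION_LINE = re.compile(r"^((?:[\w$]+\.)+[\w$]*Exception): ?(.*)")


def find_exceptions(lines):
    """Return (class name, first line of message) for each exception line."""
    found = []
    for raw in lines:
        match = EXCEPTION_LINE.match(raw.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found
```

On the excerpt above this should report com.cloud.exception.InsufficientAddressCapacityException with the message start "Unable to get a"; the rest of the message sits on the next wrapped line.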




On Apr 13, 2013, at 12:30 PM, Marcus Sorensen <[email protected]> wrote:

> Well, you've got something trying to start, because you have vnet
> interfaces. You need to look at your agent logs to see why the system VMs
> refuse to start. If the power went out it could be corruption; the system
> VMs may be waiting for you to fsck. It sounds like maybe the system was put
> into production without testing to make sure the host settings were
> persistent and would survive a reboot?
> 
> So 1) look at your agent logs. And 2) use vnc to look at whatever system
> VMs are running and see what state they are in. They will probably
> continually try to start and then shut down.
> On Apr 13, 2013 11:24 AM, "Maurice Lawler" <[email protected]> wrote:
> 
>> Greetings,
>> 
>> I'm in a terrible spot: nothing I have done will start my cloud.
>> None of my system VMs will start, which in turn does not permit the regular
>> OS VMs to start. I first suffered a power outage, then I manually
>> rebooted my server. Now, nothing is coming back online.
>> 
>> I was previously told that having cloud0 first is the cause of this. Even
>> after doing ifconfig cloud0 down, nothing seems to come back online.
>> 
>> I have gone as far as stopping iptables / ebtables along with
>> stopping/starting the network and the management console.
>> 
>> 
>> Checking the system VMs, they continue to remain in a 'starting' status.
>> 
>> [root@lunder ~]# service iptables status
>> iptables: Firewall is not running.
>> [root@lunder ~]# service ebtables status
>> # Generated by ebtables-save v1.0 on Sat Apr 13 12:21:04 CDT 2013
>> *nat
>> :PREROUTING ACCEPT
>> :OUTPUT ACCEPT
>> :POSTROUTING ACCEPT
>> 
>> [root@lunder ~]#
>> 
>> 
>> [root@lunder daoenix]# ifconfig
>> cloud0    Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
>>          inet addr:169.254.0.1  Bcast:169.254.255.255  Mask:255.255.0.0
>>          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:0
>>          RX bytes:0 (0.0 b)  TX bytes:28068 (27.4 KiB)
>> 
>> cloudbr0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
>>          inet addr:myipaddress  Bcast:9myipaddress Mask:255.255.255.224
>>          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:192832 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:11251 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:0
>>          RX bytes:11481135 (10.9 MiB)  TX bytes:25153331 (23.9 MiB)
>> 
>> eth0      Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
>>          inet6 addr: fe80::ca0a:a9ff:fe9e:2d7c/64 Scope:Link
>>          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>>          RX packets:199794 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:24157 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:14647159 (13.9 MiB)  TX bytes:25994485 (24.7 MiB)
>>          Memory:df6e0000-df700000
>> 
>> lo        Link encap:Local Loopback
>>          inet addr:127.0.0.1  Mask:255.0.0.0
>>          inet6 addr: ::1/128 Scope:Host
>>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>          RX packets:7850808 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:7850808 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:0
>>          RX bytes:1611132695 (1.5 GiB)  TX bytes:1611132695 (1.5 GiB)
>> 
>> virbr0    Link encap:Ethernet  HWaddr 52:54:00:D9:D9:9A
>>          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:0
>>          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>> 
>> vnet0     Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
>>          inet6 addr: fe80::fc00:a9ff:fefe:67/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:500
>>          RX bytes:0 (0.0 b)  TX bytes:5232 (5.1 KiB)
>> 
>> vnet1     Link encap:Ethernet  HWaddr FE:84:4C:00:00:01
>>          inet6 addr: fe80::fc84:4cff:fe00:1/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
>>          collisions:0 txqueuelen:500
>>          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)
>> 
>> vnet2     Link encap:Ethernet  HWaddr FE:2C:BC:00:00:05
>>          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
>>          collisions:0 txqueuelen:500
>>          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)
>> 
>> [root@lunder daoenix]#
>> 
>> 
>>
