Re: Can't launch VMs
Carlos, I think you found a bug in our upgrade scripts. Can you describe the scenario in a ticket?

On Sat, Oct 4, 2014 at 1:48 AM, Carlos Reátegui wrote:

> Hmm. I just checked another deployment I did with 4.4.0 using the same
> zone type and network offering and it has the broadcast_uri set to
> vlan://untagged.
>
> I also checked previous backups from older versions (4.10 and 4.2.1) and
> in all of those the broadcast_uri was blank. I am wondering if this is a
> bug in the upgrade scripts.
>
> I updated network 204 and set the broadcast_uri and now I am able to start
> new instances.
>
> thank you for your help!
>
> I am including my nics table just in case you spot something else out of
> the ordinary.
>
> mysql> select id, instance_id, network_id, netmask, gateway, ip4_address,
> broadcast_uri, mode, state, strategy from nics where network_id=204 order
> by ip4_address;
>
> +-----+-------------+------------+---------------+-------------+---------------+-----------------+------+--------------+-------------+
> | id  | instance_id | network_id | netmask       | gateway     | ip4_address   | broadcast_uri   | mode | state        | strategy    |
> +-----+-------------+------------+---------------+-------------+---------------+-----------------+------+--------------+-------------+
> |  60 |          52 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  64 |          53 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  68 |          54 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  72 |          55 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  76 |          56 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  30 |          21 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  32 |          23 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  43 |          34 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  47 |          38 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Reserved     | Start       |
> |   1 |           1 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  13 |           5 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  87 |          59 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Deallocating | Start       |
> | 116 |        NULL |        204 | NULL          | 172.30.45.1 | 172.30.45.101 | NULL            | NULL | Reserved     | PlaceHolder |
> | 120 |          91 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  23 |          14 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.102 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  21 |          12 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.103 | vlan://untagged | Dhcp | Deallocating | Start       |
> | 109 |          81 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.103 | vlan://untagged | Dhcp | Deallocating | Start       |
> |   4 |           2 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  84 |          58 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  89 |          60 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Deallocating | Start       |
> | 117 |          90 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  19 |          10 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.105 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  34 |          25 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.106 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  38 |          29 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.106 | vlan://untagged | Dhcp | Reserved     | Start       |
> |   8 |           3 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.107 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  95 |          64 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.107 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  27 |          18 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.108 | vlan://unta
Re: Can't launch VMs
Sure, but I am not sure whether the problem happened when I upgraded to 4.3.0 or to 4.3.1. In which version did broadcast_uri in the network table start requiring an entry for Guest networks, so that it can no longer be null?

On Oct 4, 2014, at 4:15 AM, Daan Hoogland wrote:

> Carlos, I think you found a bug in our upgrade scripts. Can you describe the
> scenario in a ticket?
>
> On Sat, Oct 4, 2014 at 1:48 AM, Carlos Reátegui wrote:
> > Hmm. I just checked another deployment I did with 4.4.0 using the same zone
> > type and network offering and it has the broadcast_uri set to vlan://untagged.
> >
> > I also checked previous backups from older versions (4.10 and 4.2.1) and in
> > all of those the broadcast_uri was blank. I am wondering if this is a bug in
> > the upgrade scripts.
> >
> > I updated network 204 and set the broadcast_uri and now I am able to start
> > new instances.
> >
> > thank you for your help!
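
For anyone hitting the same symptom, the workaround described in this thread (finding the Guest network whose broadcast_uri is blank and setting it to vlan://untagged) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for the real MySQL database, and the table name `networks`, the `traffic_type` column, the reduced schema, and all function names are hypothetical. Only the value `vlan://untagged` and network id 204 come from the thread. Back up the database before trying anything like this on a live deployment.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in table (hypothetical, reduced schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE networks ("
        "id INTEGER PRIMARY KEY, traffic_type TEXT, broadcast_uri TEXT)"
    )
    conn.executemany(
        "INSERT INTO networks VALUES (?, ?, ?)",
        [(204, "Guest", None), (205, "Guest", "vlan://untagged")],
    )
    conn.commit()
    return conn


def find_guest_networks_without_uri(conn):
    """Return ids of Guest networks whose broadcast_uri is NULL or empty."""
    cur = conn.execute(
        "SELECT id FROM networks "
        "WHERE traffic_type = 'Guest' "
        "AND (broadcast_uri IS NULL OR broadcast_uri = '') "
        "ORDER BY id"
    )
    return [row[0] for row in cur.fetchall()]


def set_broadcast_uri(conn, network_id, uri="vlan://untagged"):
    """Fill in broadcast_uri for one network, only if it is currently blank.

    Returns the number of rows changed (0 if the value was already set).
    """
    cur = conn.execute(
        "UPDATE networks SET broadcast_uri = ? "
        "WHERE id = ? AND (broadcast_uri IS NULL OR broadcast_uri = '')",
        (uri, network_id),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    demo = make_demo_db()
    for net_id in find_guest_networks_without_uri(demo):
        set_broadcast_uri(demo, net_id)
    print(find_guest_networks_without_uri(demo))
```

The update is guarded by the same blank check as the query, so rerunning it does not overwrite a broadcast_uri that is already set.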
Re: Cloudstack operations and Ceph RBD in degraded state
Hi Wido,

Is there a threshold for the "size" you just mentioned?

Thanks a lot.

2014-10-01 19:48 GMT+08:00 Wido den Hollander:

> On 10/01/2014 01:43 PM, Indra Pramana wrote:
> > Hi Wido,
> >
> > Can you elaborate more on what you mean by the size of our cluster? Is
> > it because the cluster size is too big, or too small?
> >
>
> I think it's probably because the Ceph cluster is too small. That causes
> too much stress on the other nodes during recovery.
>
> That in turn leads to libvirt and Qemu not being able to talk to Ceph,
> which leads to slow I/O.
>
> I've seen multiple occasions where Ceph clusters are recovering but
> CloudStack is still working just fine.
>
> Wido
>
> > Thank you.
> >
> > On Wed, Oct 1, 2014 at 4:28 PM, Wido den Hollander wrote:
> >
> >> On 10/01/2014 09:21 AM, Indra Pramana wrote:
> >>> Dear all,
> >>>
> >>> Anyone using CloudStack with Ceph RBD as primary storage? I am using
> >>> CloudStack 4.2.0 with KVM hypervisors and the latest stable version of
> >>> Ceph dumpling.
> >>>
> >>
> >> I am :)
> >>
> >>> Based on what I see, when the Ceph cluster is in a degraded state (not
> >>> active+clean), for example because one node is down and recovering,
> >>> it might affect CloudStack operations. For example:
> >>>
> >>> - A stopped VM cannot be started, because it says it cannot find a
> >>> suitable storage pool.
> >>>
> >>> - A disconnected host cannot be reconnected easily, even after restarting
> >>> the agent and libvirt on the agent side, and restarting the management
> >>> server on the server side. I need to keep on trying and suddenly it will
> >>> be connected/up by itself.
> >>>
> >>
> >> It really depends on the size of the cluster. It could be that the Ceph
> >> cluster is so busy with recovery that it can't process the I/O coming
> >> from CloudStack and thus stalls.
> >>
> >> This is not a Ceph or CloudStack problem, but probably the size of your
> >> cluster.
> >>
> >> Wido
> >>
> >>> Once Ceph has recovered and is back in the active+clean state, CloudStack
> >>> operations will be back to normal. Host agents will be up, and VMs can be
> >>> started.
> >>>
> >>> Anyone seeing similar behaviour?
> >>>
> >>> Looking forward to your reply, thank you.
> >>>
> >>> Cheers.
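
Since the thread reports that CloudStack operations return to normal once every placement group is back in active+clean, one possible way to avoid retrying VM starts too early is a small guard like the sketch below. It is only an illustration: the function name and the input format (a plain mapping from placement-group state name to count) are assumptions made here, and how those counts are collected from the cluster is deliberately left out.

```python
def all_pgs_active_clean(pgs_by_state):
    """Return True only if every placement group is exactly active+clean.

    pgs_by_state is a hypothetical mapping of state name -> PG count,
    for example {"active+clean": 190, "active+degraded": 2}.
    """
    total = sum(pgs_by_state.values())
    # An empty report is treated as "not clean" rather than as healthy.
    return total > 0 and pgs_by_state.get("active+clean", 0) == total


if __name__ == "__main__":
    recovering = {"active+clean": 190, "active+degraded": 2}
    healthy = {"active+clean": 192}
    print(all_pgs_active_clean(recovering))
    print(all_pgs_active_clean(healthy))
```

As Wido notes, a recovering cluster does not always stall CloudStack, so a guard like this is conservative: it may hold back operations that a larger cluster would have handled fine during recovery.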