Re: Can't launch VMs

2014-10-04 Thread Daan Hoogland
Carlos, I think you found a bug in our upgrade scripts. Can you describe
the scenario in a ticket?

On Sat, Oct 4, 2014 at 1:48 AM, Carlos Reátegui wrote:

> Hmm.  I just checked another deployment I did with 4.4.0 using the same
> zone type and network offering and it has the broadcast_uri set to
> vlan://untagged.
>
> I also checked previous backups from older versions (4.10 and 4.2.1) and
> in all of those the broadcast_uri was blank.  I am wondering if this is a
> bug in the upgrade scripts.
>
> I updated network 204 and set the broadcast_uri and now I am able to start
> new instances.
>
> thank you for your help!
>
> I am including my nics table just in case you spot something else out of
> the ordinary.
>
> mysql> select id, instance_id, network_id, netmask, gateway, ip4_address,
> broadcast_uri, mode, state, strategy from nics where network_id=204 order
> by ip4_address;
>
> +-----+-------------+------------+---------------+-------------+---------------+-----------------+------+--------------+-------------+
> | id  | instance_id | network_id | netmask       | gateway     | ip4_address   | broadcast_uri   | mode | state        | strategy    |
> +-----+-------------+------------+---------------+-------------+---------------+-----------------+------+--------------+-------------+
> |  60 |          52 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  64 |          53 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  68 |          54 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  72 |          55 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  76 |          56 |        204 | NULL          | NULL        | NULL          | NULL            | Dhcp | Deallocating | Start       |
> |  30 |          21 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  32 |          23 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  43 |          34 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  47 |          38 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.100 | vlan://untagged | Dhcp | Reserved     | Start       |
> |   1 |           1 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  13 |           5 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  87 |          59 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Deallocating | Start       |
> | 116 |        NULL |        204 | NULL          | 172.30.45.1 | 172.30.45.101 | NULL            | NULL | Reserved     | PlaceHolder |
> | 120 |          91 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.101 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  23 |          14 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.102 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  21 |          12 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.103 | vlan://untagged | Dhcp | Deallocating | Start       |
> | 109 |          81 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.103 | vlan://untagged | Dhcp | Deallocating | Start       |
> |   4 |           2 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  84 |          58 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  89 |          60 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Deallocating | Start       |
> | 117 |          90 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.104 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  19 |          10 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.105 | vlan://untagged | Dhcp | Reserved     | Start       |
> |  34 |          25 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.106 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  38 |          29 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.106 | vlan://untagged | Dhcp | Reserved     | Start       |
> |   8 |           3 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.107 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  95 |          64 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.107 | vlan://untagged | Dhcp | Deallocating | Start       |
> |  27 |          18 |        204 | 255.255.255.0 | 172.30.45.1 | 172.30.45.108 | vlan://unta

Re: Can't launch VMs

2014-10-04 Thread Carlos Reátegui
Sure, but I am not sure whether the problem happened when I upgraded to 4.3.0
or to 4.3.1.  In which version did the broadcast_uri column in the network
table start requiring a non-null entry for Guest networks?
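
For anyone who wants to check their own deployment for the same symptom, here
is a rough sketch of the lookup. It is only an illustration: it uses an
in-memory SQLite database as a stand-in for the real MySQL database, and the
table name `networks`, the `traffic_type` column, and the sample rows are
assumptions made for the example rather than output from a real system. Only
network 204 and the vlan://untagged value come from this thread.

```python
import sqlite3


def find_networks_missing_broadcast_uri(conn):
    """Return ids of Guest networks whose broadcast_uri is NULL or empty."""
    rows = conn.execute(
        "SELECT id FROM networks "
        "WHERE traffic_type = 'Guest' "
        "AND (broadcast_uri IS NULL OR broadcast_uri = '') "
        "ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Build a hypothetical, minimal stand-in for the network table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE networks ("
        "id INTEGER PRIMARY KEY, traffic_type TEXT, broadcast_uri TEXT)"
    )
    # Sample rows are invented for the demo, except that network 204
    # mirrors the blank broadcast_uri described in this thread.
    conn.executemany(
        "INSERT INTO networks VALUES (?, ?, ?)",
        [
            (200, "Public", "vlan://untagged"),
            (204, "Guest", None),
            (205, "Guest", "vlan://untagged"),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_networks_missing_broadcast_uri(demo))
```

Against a real installation the same SELECT would be run from the mysql
client after taking a backup; any column names should be verified against the
actual schema first.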


On Oct 4, 2014, at 4:15 AM, Daan Hoogland wrote:

> Carlos, I think you found a bug in our upgrade scripts. Can you describe the 
> scenario in a ticket?
> 
> On Sat, Oct 4, 2014 at 1:48 AM, Carlos Reátegui wrote:
> Hmm.  I just checked another deployment I did with 4.4.0 using the same zone 
> type and network offering and it has the broadcast_uri set to vlan://untagged.
> 
> I also checked previous backups from older versions (4.10 and 4.2.1) and in 
> all of those the broadcast_uri was blank.  I am wondering if this is a bug in 
> the upgrade scripts. 
> 
> I updated network 204 and set the broadcast_uri and now I am able to start 
> new instances.
> 
> thank you for your help!

Re: Cloudstack operations and Ceph RBD in degraded state

2014-10-04 Thread Guo Star
Hi Wido,

Is there a threshold for the "size" you just mentioned?

Thanks a lot.

2014-10-01 19:48 GMT+08:00 Wido den Hollander:

>
>
> On 10/01/2014 01:43 PM, Indra Pramana wrote:
> > Hi Wido,
> >
> > Can you elaborate more on what you mean by the size of our cluster? Is
> > it because the cluster size is too big, or too small?
> >
>
> I think it's probably because the Ceph cluster is too small. That causes
> too much stress on the other nodes during recovery.
>
> That again leads to libvirt and Qemu not being able to talk to Ceph
> which leads to slow I/O.
>
> I've seen multiple occasions where Ceph clusters are recovering but
> CloudStack is still working just fine.
>
> Wido
>
> > Thank you.
> >
> > On Wed, Oct 1, 2014 at 4:28 PM, Wido den Hollander wrote:
> >
> >>
> >>
> >> On 10/01/2014 09:21 AM, Indra Pramana wrote:
> >>> Dear all,
> >>>
> >>> Is anyone using CloudStack with Ceph RBD as primary storage? I am using
> >>> CloudStack 4.2.0 with KVM hypervisors and the latest stable version of
> >>> Ceph Dumpling.
> >>>
> >>
> >> I am :)
> >>
> >>> Based on what I see, when the Ceph cluster is in a degraded state (not
> >>> active+clean), for example because one node is down and recovery is in
> >>> progress, it might affect CloudStack operations. For example:
> >>>
> >>> - Stopped VM cannot be started, because it says cannot find suitable
> >>> storage pool.
> >>>
> >>> - Disconnected host cannot be reconnected easily, even after restarting
> >>> agent and libvirt on agent side, and restarting management server on
> >>> the server side. Need to keep on trying and suddenly it will be
> >>> connected/up by itself.
> >>>
> >>
> >> It really depends on the size of the cluster. It could be that the Ceph
> >> cluster is so busy with recovery that it can't process the I/O coming
> >> from CloudStack and thus stalls.
> >>
> >> This is not a Ceph or CloudStack problem, but probably the size of your
> >> cluster.
> >>
> >> Wido
> >>
> >>> Once Ceph has recovered and is back to active+clean state, CloudStack
> >>> operations will be back to normal. Host agents will be up, and VMs can
> >>> be started.
> >>>
> >>> Anyone seeing similar behaviour?
> >>>
> >>> Looking forward to your reply, thank you.
> >>>
> >>> Cheers.
> >>>
> >>
> >
>
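
No threshold is given anywhere in the thread above, but the reasoning about
cluster size can be illustrated with some back-of-the-envelope arithmetic. The
sketch below is a deliberately simplified model and not how Ceph actually
schedules recovery: it assumes equally sized nodes, evenly distributed data,
and that the data of the failed nodes is re-replicated evenly across the
survivors. The function name is made up for this example.

```python
def recovery_share_per_survivor(node_count, failed_nodes=1):
    """Extra data each surviving node absorbs, in units of one node's data.

    Simplified model: equal nodes, evenly spread data, even re-replication.
    """
    if failed_nodes < 1 or failed_nodes >= node_count:
        raise ValueError("need at least one failed and one surviving node")
    survivors = node_count - failed_nodes
    return failed_nodes / survivors


if __name__ == "__main__":
    # Compare a small cluster with progressively larger ones.
    for nodes in (3, 5, 10, 20):
        share = recovery_share_per_survivor(nodes)
        print(f"{nodes} nodes: each survivor takes on {share:.0%} of a node's data")
```

Under these assumptions, losing one node out of three makes each survivor
absorb half a node's worth of data while still serving client I/O, whereas in
a ten-node cluster each survivor absorbs roughly a ninth. That fits the
point made above that a small cluster puts more stress on the remaining
nodes during recovery.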