Re: Cloudstack operations and Ceph RBD in degraded state

Guo Star Sat, 04 Oct 2014 23:34:29 -0700

Hi Wido,

Is there a threshold for the "size" you just mentioned?


Thanks a lot.

2014-10-01 19:48 GMT+08:00 Wido den Hollander <[email protected]>:

>
>
> On 10/01/2014 01:43 PM, Indra Pramana wrote:
> > Hi Wido,
> >
> > Can you elaborate on what you mean by the size of our cluster? Is
> > it because the cluster is too big, or too small?
> >
>
> I think it's probably because the Ceph cluster is too small. That causes
> too much stress on the other nodes during recovery.
>
> That in turn leads to libvirt and QEMU not being able to talk to Ceph,
> which leads to slow I/O.
>
> I've seen multiple occasions where Ceph clusters are recovering but
> CloudStack is still working just fine.
>
> Wido
>
> > Thank you.
> >
> > On Wed, Oct 1, 2014 at 4:28 PM, Wido den Hollander <[email protected]> wrote:
> >
> >>
> >>
> >> On 10/01/2014 09:21 AM, Indra Pramana wrote:
> >>> Dear all,
> >>>
> >>> Is anyone using CloudStack with Ceph RBD as primary storage? I am using
> >>> CloudStack 4.2.0 with KVM hypervisors and the latest stable release of
> >>> Ceph Dumpling.
> >>>
> >>
> >> I am :)
> >>
> >>> Based on what I see, when the Ceph cluster is in a degraded state (not
> >>> active+clean), for example because one node is down and recovery is in
> >>> progress, it can affect CloudStack operations. For example:
> >>>
> >>> - A stopped VM cannot be started, because CloudStack says it cannot
> >>> find a suitable storage pool.
> >>>
> >>> - A disconnected host cannot be reconnected easily, even after
> >>> restarting the agent and libvirt on the agent side, and restarting the
> >>> management server on the server side. I need to keep trying until it
> >>> suddenly shows as connected/up by itself.
> >>>
> >>
> >> It really depends on the size of the cluster. It could be that the Ceph
> >> cluster is so busy with recovery that it can't process the I/O coming
> >> from CloudStack and thus stalls.
> >>
> >> This is not a Ceph or CloudStack problem, but probably a consequence of
> >> the size of your cluster.
> >>
> >> Wido
> >>
> >>> Once Ceph has recovered and is back in the active+clean state,
> >>> CloudStack operations return to normal. Host agents come up, and VMs
> >>> can be started.
> >>>
> >>> Anyone seeing similar behaviour?
> >>>
> >>> Looking forward to your reply, thank you.
> >>>
> >>> Cheers.
> >>>
> >>
> >
>
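
To put the point about cluster size into rough numbers, here is a minimal
sketch. It is not from the thread, and it does not give a threshold (none is
stated above). It assumes equally sized nodes, evenly distributed data, and
that a failed node's data is re-replicated evenly across the surviving nodes.
The function name is hypothetical, and the model ignores network and disk
limits as well as any recovery throttling.

```python
def extra_load_per_survivor(num_nodes: int, failed: int = 1) -> float:
    """Fraction of one node's worth of data that each surviving node
    must absorb during recovery (assumption: uniform redistribution)."""
    if failed < 0 or num_nodes <= failed:
        raise ValueError("need at least one surviving node")
    return failed / (num_nodes - failed)


# Compare a few hypothetical cluster sizes with one node down.
for n in (3, 5, 10, 20):
    share = extra_load_per_survivor(n)
    print(f"{n} nodes: each survivor absorbs about {share:.0%} of a node's data")
```

Under these assumptions, each survivor in a 3-node cluster takes on half a
node's worth of data on top of serving client I/O, while in a 20-node cluster
the share is roughly 5%. That is one way to read why recovery tends to hurt
small clusters more.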
