Hi friends,
Since deployment of our Ceph cluster we've been plagued by slow
metadata errors.
Namely, the cluster goes into HEALTH_WARN with a message similar to
this one:
2 MDSs report slow metadata IOs
1 MDSs report slow requests
1 slow ops, oldest one blocked for 32 sec, daemons [osd.22,osd.4] have
slow ops.
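For reference, the blocked operations behind a warning like this can usually be
inspected directly on the OSDs it names; a rough sketch using osd.22 from the
output above (the "ceph daemon" calls have to run on the node hosting that OSD):

ceph health detail                     # shows which daemons and ops are flagged
ceph daemon osd.22 dump_ops_in_flight  # ops currently pending on that OSD
ceph daemon osd.22 dump_historic_ops   # recently completed (including slow) ops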
I do appreciate your input anyway.
> This helped us a lot, the number of slow requests has decreased
> significantly.
>
> Regards,
> Eugen
>
>
> [1]
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/MK672ROJSW3X56PC2KWOK2
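The change discussed in that thread is the osd_op_queue_cut_off option; a
minimal sketch of applying and verifying it cluster-wide, assuming the
centralized config database is in use (the option is read at OSD start-up, so
OSDs typically need a restart before it takes effect):

ceph config set osd osd_op_queue_cut_off high   # the setting referenced in [1]
ceph config get osd osd_op_queue_cut_off        # verify: should print "high"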
16:39 +0200, Momčilo Medić wrote:
> Hi Eugen,
>
> On Mon, 2020-08-24 at 14:26 +, Eugen Block wrote:
> > Hi,
> >
> > there have been several threads about this topic [1], most likely
> > it's
> > the metadata operation during the cleanup that s
Hi Dave,
On Tue, 2020-08-25 at 15:25 +0100, david.neal wrote:
> Hi Momo,
>
> This can be caused by many things apart from the ceph sw.
>
> For example I saw this once with the MTU in openvswitch not fully
> matching on a few nodes. We realised this using ping between nodes.
> For a 9000 MTU:
>
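The usual check here is a non-fragmenting ping whose payload is the MTU minus
the 28 bytes of IP and ICMP headers (<other-node-ip> is a placeholder for a
peer node), e.g. for a 9000 MTU:

ping -M do -s 8972 <other-node-ip>   # 8972 + 28 header bytes = 9000

and for a standard 1500 MTU:

ping -M do -s 1472 <other-node-ip>   # 1472 + 28 header bytes = 1500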
Hey Eugen,
On Wed, 2020-08-26 at 09:29 +, Eugen Block wrote:
> Hi,
>
> > > root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue
> > > wpq
> > > root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue_cut_off
> > > high
>
> just to make sure, I referred to OSD not MDS settings.
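The same options can be checked on the OSD side, for example against osd.22
from the health warning above (the "ceph daemon" form runs on the node hosting
that OSD and shows the value the running daemon is actually using):

ceph config get osd.22 osd_op_queue
ceph config get osd.22 osd_op_queue_cut_off
ceph daemon osd.22 config get osd_op_queue_cut_off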
> > [...] that adding more disks would create better
> > parallelisation, that's why I'm asking about larger drives.
>
> I don't think larger drives would improve that, probably even the
> opposite, depending on the drives, of course. More drives should
>