On Fri, Nov 9, 2018 at 2:24 AM Kenneth Waegeman <kenneth.waege...@ugent.be>
wrote:

> Hi all,
>
> On Mimic 13.2.1, we are seeing blocked ops on cephfs after removing some
> snapshots:
>
> [root@osd001 ~]# ceph -s
>    cluster:
>      id:     92bfcf0a-1d39-43b3-b60f-44f01b630e47
>      health: HEALTH_WARN
>              5 slow ops, oldest one blocked for 1162 sec, mon.mds03 has
> slow ops
>
>    services:
>      mon: 3 daemons, quorum mds01,mds02,mds03
>      mgr: mds02(active), standbys: mds03, mds01
>      mds: ceph_fs-2/2/2 up  {0=mds03=up:active,1=mds01=up:active}, 1
> up:standby
>      osd: 544 osds: 544 up, 544 in
>
>    io:
>      client:   5.4 KiB/s wr, 0 op/s rd, 0 op/s wr
>
> [root@osd001 ~]# ceph health detail
> HEALTH_WARN 5 slow ops, oldest one blocked for 1327 sec, mon.mds03 has
> slow ops
> SLOW_OPS 5 slow ops, oldest one blocked for 1327 sec, mon.mds03 has slow
> ops
>
> [root@osd001 ~]# ceph -v
> ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic
> (stable)
>
> Is this a known issue?
>

It's not exactly a known issue, but from the output and story you've got
here it looks like either the OSDs are deleting the snapshot data so
aggressively that the MDS isn't getting replies back quickly enough, or
you have an overlarge CephFS directory that is taking a long time to
clean up. You should dump the ops in flight on the MDS and the MDS'
Objecter requests and see what specifically is taking so long.
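For example (a sketch; the daemon names come from the status output
above, and this assumes you run each command on the host where that
daemon's admin socket lives):

[root@mds03 ~]# ceph daemon mds.mds03 dump_ops_in_flight
[root@mds03 ~]# ceph daemon mds.mds03 objecter_requests
[root@mds01 ~]# ceph daemon mds.mds01 dump_ops_in_flight
[root@mds01 ~]# ceph daemon mds.mds01 objecter_requests

Since the warning is attributed to mon.mds03, dumping that monitor's
in-flight ops ("ceph daemon mon.mds03 ops" on the mds03 host) should
also show which operations are actually stuck.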
-Greg


>
> Cheers,
>
> Kenneth
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
