----- Original Message -----
From: "Butkeev Stas" <staer...@ya.ru>
To: ceph-us...@ceph.com, ceph-commun...@lists.ceph.com, supp...@ceph.com
Sent: Friday, 31 July, 2015 9:10:40 PM
Subject: [ceph-users] problem with RGW
> Hello everybody
>
> We have a ceph cluster that consists of 8 hosts with 12 osds per host. The disks are 2T SATA.
> In the osd.0 log:
>
> 2015-07-31 14:03:24.490774 7f2cd95c5700 0 log_channel(cluster) log [WRN] : 35 slow requests, 9 included below; oldest blocked for > 3003.952332 secs
> 2015-07-31 14:03:24.490782 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 960.179599 seconds old, received at 2015-07-31 13:47:24.311080: osd_op(client.67321.0:7856 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [writefull 0~0] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:24.490791 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 960.179357 seconds old, received at 2015-07-31 13:47:24.311323: osd_op(client.67321.0:7857 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [writefull 0~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:24.490794 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 960.167539 seconds old, received at 2015-07-31 13:47:24.323141: osd_op(client.67321.0:7858 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [write 524288~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:24.490797 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 960.155554 seconds old, received at 2015-07-31 13:47:24.335126: osd_op(client.67321.0:7859 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [write 1048576~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:24.490801 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 960.145867 seconds old, received at 2015-07-31 13:47:24.344813: osd_op(client.67321.0:7860 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [write 1572864~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:25.491062 7f2cd95c5700 0 log_channel(cluster) log [WRN] : 35 slow requests, 4 included below; oldest blocked for > 3004.952621 secs
> 2015-07-31 14:03:25.491078 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 961.140790 seconds old, received at 2015-07-31 13:47:24.350178: osd_op(client.67321.0:7861 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [write 2097152~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:25.491084 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 961.097870 seconds old, received at 2015-07-31 13:47:24.393098: osd_op(client.67321.0:7862 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [write 2621440~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:25.491089 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 961.093229 seconds old, received at 2015-07-31 13:47:24.397740: osd_op(client.67321.0:7863 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [write 3145728~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
> 2015-07-31 14:03:25.491095 7f2cd95c5700 0 log_channel(cluster) log [WRN] : slow request 961.002957 seconds old, received at 2015-07-31 13:47:24.488012: osd_op(client.67321.0:7864 default.34169.37__shadow_.AnULxoR-51Q7fGdIVVP92CPeptlQJIm_226 [write 3670016~524288] 26.f9af7c89 ack+ondisk+write+known_if_redirected e9467) currently no flag points reached
>
> How can I avoid these blocked requests? What is the root cause of this problem?

Do a "ceph pg dump" and look for the pgs in this state, ack+ondisk+write+known_if_redirected, then do a "ceph pg [pgid] query" and post the output here (if there aren't too many, otherwise a representative sample).
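To pick which pgs to query first, it may help to group the slow-request lines by object. The following is only a rough sketch: the regular expression is inferred from the log lines quoted in this thread and may not match other Ceph versions, the name summarize_slow_requests is made up for illustration, and the "26.f9af7c89" field is kept as an opaque string (it looks like a pool id plus an object hash rather than a final pgid; "ceph osd map <pool> <object>" should show the actual pg and acting set).

```python
import re
from collections import defaultdict

# Pattern inferred from the "slow request" lines quoted in this thread.
# Assumption: object names contain no spaces.
SLOW_RE = re.compile(
    r"slow request (?P<age>[\d.]+) seconds old, "
    r"received at (?P<received>\S+ \S+): "
    r"osd_op\((?P<client>\S+) (?P<obj>\S+) "
    r"\[(?P<op>[^\]]+)\] (?P<pg>\S+) (?P<flags>\S+) e(?P<epoch>\d+)\) "
    r"currently (?P<state>.+)$"
)


def summarize_slow_requests(lines):
    """Group slow requests by (object, pg field); track count, max age, clients."""
    summary = defaultdict(lambda: {"count": 0, "max_age": 0.0, "clients": set()})
    for line in lines:
        match = SLOW_RE.search(line.strip())
        if match is None:
            # Skips header lines such as "35 slow requests, 9 included below".
            continue
        entry = summary[(match["obj"], match["pg"])]
        entry["count"] += 1
        entry["max_age"] = max(entry["max_age"], float(match["age"]))
        # "client.67321.0:7856" -> "client.67321.0" (drop the per-op sequence id)
        entry["clients"].add(match["client"].split(":")[0])
    return dict(summary)


if __name__ == "__main__":
    import sys

    # Usage: python slow_requests.py < ceph-osd.0.log
    for (obj, pg), info in summarize_slow_requests(sys.stdin).items():
        print(obj, pg, info["count"], info["max_age"], sorted(info["clients"]))
```

On the lines quoted above, every request should land under the same client and the same _shadow_ object, which narrows the search to one object and the OSDs serving it.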
Also look carefully at the acting OSDs for these pgs and check the output of "ceph daemon /var/run/ceph/ceph-osd.NNN.asok dump_ops_in_flight". There could be problems with these OSDs slowing down the requests, including hardware problems, so check thoroughly.

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com