
I simply grepped for "slow request" in ceph.log. What exactly do you mean by "effective OSD"?

If I have this log line:
2017-01-11 [...] osd.16 [...] cluster [WRN] slow request 32.868141 seconds old, received at 2017-01-11 [...] ack+ondisk+write+known_if_redirected e12440) currently waiting for subops from 0,12

I assumed that osd.16 was the one causing problems. Now that you mention the subops: I only noticed them yesterday and haven't had time to investigate further yet. I'll have a look into the subops messages and report back.
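For reference, here is a rough sketch of how the grep could be refined so that the statistics count the OSDs named after "waiting for subops from" instead of the reporting OSD. It assumes the line layout shown in the example above; the function names are made up for illustration and are not part of any Ceph tooling. Lines without a subops list fall back to the reporting OSD. Note that with 3 replicas this still only yields candidates, since the log line alone does not say which of the listed OSDs is the slow one.

```python
import re
from collections import Counter

# Patterns are assumptions based on the single log line quoted above.
REPORTER_RE = re.compile(r"\bosd\.(\d+)\b")
SUBOPS_RE = re.compile(r"waiting for subops from (\d+(?:,\d+)*)")


def implicated_osds(line):
    """Return the OSD ids a 'slow request' line points at.

    If the line names subop targets, those are returned (candidates only);
    otherwise the reporting OSD is returned. Non-matching lines give [].
    """
    if "slow request" not in line:
        return []
    subops = SUBOPS_RE.search(line)
    if subops:
        return [int(osd) for osd in subops.group(1).split(",")]
    reporter = REPORTER_RE.search(line)
    return [int(reporter.group(1))] if reporter else []


def count_implicated(lines):
    """Tally how often each OSD is implicated across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(implicated_osds(line))
    return counts


# The example line from above, elisions kept as they are.
sample_line = (
    "2017-01-11 [...] osd.16 [...] cluster [WRN] slow request 32.868141 "
    "seconds old, received at 2017-01-11 [...] "
    "ack+ondisk+write+known_if_redirected e12440) "
    "currently waiting for subops from 0,12"
)

print(implicated_osds(sample_line))  # [0, 12], not 16
```

To run it over the whole log, something like `count_implicated(open("ceph.log"))` should do; the path is whatever your cluster log is called.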


Quoting Burkhard Linke <burkhard.li...@computational.bio.uni-giessen.de>:


Just for clarity:

Did you parse the slow request messages and use the effective OSD in the statistics? Some messages may refer to other OSDs, e.g. "waiting for sub op on OSD X,Y". In that case the reporting OSD is not the root cause; one of the mentioned OSDs is. (I'm currently not aware of a method to determine which of the two OSDs is the cause in the case of 3 replicas.)



ceph-users mailing list

Eugen Block                             voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG      fax     : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg                         e-mail  : ebl...@nde.ag

  Chairwoman of the Supervisory Board: Angelika Mozdzen
 Registered office and court of registration: Hamburg, HRB 90934
              Executive Board: Jens-U. Mozdzen
                VAT ID no. DE 814 013 983
