It is still "querying", six days in now. I have not tried any scrubbing options; I'll try them just to see. My next idea was to clobber osd.8, the one it is supposedly "querying".
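
Before clobbering anything, I'll dump the peering state to see exactly why it's blocked on osd.8. Roughly what I plan to run, with 2.5 standing in for the actual stuck pgid:

ceph pg 2.5 query      # recovery_state should show probing_osds / peering_blocked_by
ceph health detail     # lists the stuck PGs and the reason

If I do end up clobbering osd.8, I assume the usual removal sequence applies (sketching from memory, so double-check before running):

ceph osd out 8
service ceph stop osd.8
ceph osd crush remove osd.8
ceph auth del osd.8
ceph osd rm 8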



   I ran into this problem too.  I don't know what I did to fix it.

   I tried ceph pg scrub <pgid>, ceph pg deep-scrub <pgid>, and ceph
   osd scrub <osdid>.  None of them had an immediate effect.  In the
   end, it finally cleared several days later in the middle of the
   night, and I can't even say exactly what cleared it.  A different
   OSD got kicked out, then rejoined.  While everything was moving
   from degraded to active+clean, it finally finished probing.
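
   If you want to keep an eye on it in the meantime, the stuck PGs
   show up with (sketching from memory; adjust as needed):

   ceph pg dump_stuck inactive   # PGs not active (e.g. down+peering)
   ceph pg dump_stuck unclean    # PGs not yet active+clean
   ceph -w                       # watch recovery events as they happen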

   If it's still happening tomorrow, I'd try to find a Geek on IRC
   Duty (http://ceph.com/help/community/).


   On 5/3/14 09:43, Kevin Horan wrote:
    Craig,
        Thanks for your response. I have already marked osd.6 as lost,
    as you suggested. The problem is that it is still querying osd.8,
    which is not lost. I don't know why it is stuck there. It has been
    querying osd.8 for 4 days now.
        I also tried deleting the broken RBD image, but the operation
    just hangs.
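        If it never un-sticks, my understanding (which may well be
    wrong) is that the last resort for a PG whose data is truly gone
    is to re-create it empty, which destroys whatever it held:

    ceph pg force_create_pg 2.5    # 2.5 standing in for the stuck pgid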

    Kevin


        On 5/1/14 10:11, Kevin Horan wrote:
        Here is how I got into this state. I have only 6 OSDs total,
        3 on one host (vashti) and 3 on another (zadok). I set the
        noout flag so I could reboot zadok. Zadok was down for 2
        minutes. When it came back up, Ceph began recovering the
        objects that had not been replicated yet. Before recovery
        finished, osd.6, on vashti, died (IO errors on disk, whole
        drive unrecoverable). Since osd.6 had objects that had not yet
        had a chance to replicate to any OSD on zadok, they were lost.
        I cannot recover anything further from osd.6.
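
        For reference, the maintenance sequence was just the usual
        noout dance (reconstructed from memory):

        ceph osd set noout     # keep OSDs from being marked out
        reboot                 # on zadok
        ceph osd unset noout   # after its OSDs rejoin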


        I'm pretty far out of my element here, but if osd.6 is gone,
        it might help to mark it lost:
        ceph osd lost 6

        I had similar issues when I lost some PGs.  I don't think
        that it actually fixed my issue, but marking OSDs as lost did
        help Ceph move forward.


        You could also try deleting the broken RBD image, and see if
        that helps.
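
        Something like this, though I'm guessing at your pool and
        image names:

        rbd -p rbd rm brokenimage

        If that hangs too, it's probably the same stuck PG blocking
        the delete.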


--
   Craig Lewis
   Senior Systems Engineer, Central Desktop

