Hey JC,

Thank you very much for your mail!
I will provide the information tomorrow when I am at work again. Hope that we will find a solution :)
- Mehmet

On 24 August 2016 16:58:58 CEST, LOPEZ Jean-Charles <jelo...@redhat.com> wrote:
>Hi Mehmet,
>
>I'm just seeing your message and have read the thread that goes with it.
>
>Can you please provide me with a copy of the ceph.conf file on the MON and OSD side, assuming it's identical, and if the ceph.conf file is different on the client side (the VM side), can you please provide me with a copy of that as well.
>
>Can you also provide me, as attached txt files, with the
>output of your pg query of the pg 0.223,
>output of ceph -s,
>output of ceph df,
>output of ceph osd df,
>output of ceph osd dump | grep pool,
>output of ceph osd crush rule dump.
>
>Thank you, and I'll see if I can get something to ease your pain.
>
>As a remark: assuming the size parameter of the rbd pool is set to 3, the number of PGs in your cluster should be higher.
>
>If we manage to move forward and get it fixed, we will repost the changes we made to your configuration to the mailing list.
>
>Regards
>JC
>
>
>> On Aug 24, 2016, at 06:41, Mehmet <c...@elchaka.de> wrote:
>>
>> Hello guys,
>>
>> the issue still exists :(
>>
>> If we run a "ceph pg deep-scrub 0.223", nearly all VMs stop for a while (blocked requests).
>>
>> - We already replaced the OSDs (SAS disks - journal on NVMe).
>> - Removed OSDs so that the acting set for pg 0.223 has changed.
>> - Checked the filesystem on the acting OSDs.
>> - Changed the tunables back from jewel to default.
>> - Changed the tunables again from default to jewel.
>> - Did a deep-scrub on all OSDs (ceph osd deep-scrub osd.<id>) - only when a deep-scrub on pg 0.223 runs do we get blocked requests.
>>
>> The deep-scrub on pg 0.223 always took 13-15 minutes to finish. It does not matter which OSDs are in the acting set for this pg.
>>
>> So I don't have any idea what could be causing this.
>>
>> As long as "ceph osd set nodeep-scrub" is set - so that no deep-scrub on 0.223 is running - the cluster is fine!
>>
>> Could this be a bug?
>>
>> ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
>> Kernel: 4.4.0-31-generic #50-Ubuntu
>>
>> Any ideas?
>> - Mehmet
>>
>>
>>
>> On 2016-08-02 17:57, c wrote:
>>> On 2016-08-02 13:30, c wrote:
>>>> Hello guys,
>>>> this time without the original acting set osd.4, 16 and 28. The issue still exists...
>>>> [...]
>>>>>>>> For the record, this ONLY happens with this PG and no others that share the same OSDs, right?
>>>>>>> Yes, right.
>>>> [...]
>>>>>>>> When doing the deep-scrub, monitor (atop, etc.) all 3 nodes and see if a particular OSD (HDD) stands out, as I would expect it to.
>>>>>>> Now I logged all disks via atop every 2 seconds while the deep-scrub was running ( atop -w osdXX_atop 2 ).
>>>>>>> As you expected, all disks were 100% busy - with a constant 150 MB (osd.4), 130 MB (osd.28) and 170 MB (osd.16)...
>>>>>>> - osd.4 (/dev/sdf): http://slexy.org/view/s21emd2u6j
>>>>>>> - osd.16 (/dev/sdm): http://slexy.org/view/s20vukWz5E
>>>>>>> - osd.28 (/dev/sdh): http://slexy.org/view/s20YX0lzZY
>>>>>>> [...]
>>>>>>> But what is causing this? A deep-scrub on all other disks - same model and ordered at the same time - does not seem to have this issue.
>>>> [...]
>>>>>>> Next week, I will do this:
>>>>>>> 1.1 Remove osd.4 completely from Ceph - again (the current primary for PG 0.223)
>>>> osd.4 is now removed completely.
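[Inline note from today: for completeness, each OSD was taken out following what I understand to be the standard removal procedure; the lines below are only a rough sketch for osd.4, not a transcript of my exact session, and the systemd unit name is how it looks on my Ubuntu 16.04 nodes.]

# ceph osd out 4
  (wait until recovery/backfill has finished and "ceph -s" reports HEALTH_OK again)
# systemctl stop ceph-osd@4
  (on the host that carries osd.4)
# ceph osd crush remove osd.4
# ceph auth del osd.4
# ceph osd rm 4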
>>>> The primary OSD for the PG is now osd.9:
>>>> # ceph pg map 0.223
>>>> osdmap e8671 pg 0.223 (0.223) -> up [9,16,28] acting [9,16,28]
>>>>>>> 1.2 xfs_repair -n /dev/sdf1 (osd.4): to see possible errors
>>>> xfs_repair did not find/show any errors.
>>>>>>> 1.3 ceph pg deep-scrub 0.223
>>>>>>> - Log with ceph tell osd.4,16,28 injectargs "--debug_osd 5/5"
>>>> Because osd.9 is now the primary, I have set debug_osd on it too:
>>>> ceph tell osd.9 injectargs "--debug_osd 5/5"
>>>> and ran the deep-scrub on 0.223 (and again nearly all of my VMs stopped working for a while).
>>>> Start @ 15:33:27
>>>> End @ 15:48:31
>>>> The "ceph.log":
>>>> - http://slexy.org/view/s2WbdApDLz
>>>> The related log files (OSDs 9, 16 and 28) and the atop logs for those OSDs:
>>>> Log file - osd.9 (/dev/sdk)
>>>> - ceph-osd.9.log: http://slexy.org/view/s2kXeLMQyw
>>>> - atop log: http://slexy.org/view/s21wJG2qr8
>>>> Log file - osd.16 (/dev/sdh)
>>>> - ceph-osd.16.log: http://slexy.org/view/s20D6WhD4d
>>>> - atop log: http://slexy.org/view/s2iMjer8rC
>>>> Log file - osd.28 (/dev/sdm)
>>>> - ceph-osd.28.log: http://slexy.org/view/s21dmXoEo7
>>>> - atop log: http://slexy.org/view/s2gJqzu3uG
>>>>>>> 2.1 Remove osd.16 completely from Ceph
>>>> osd.16 is now removed completely - replaced with osd.17 within the acting set.
>>>> # ceph pg map 0.223
>>>> osdmap e9017 pg 0.223 (0.223) -> up [9,17,28] acting [9,17,28]
>>>>>>> 2.2 xfs_repair -n /dev/sdh1
>>>> xfs_repair did not find/show any errors.
>>>>>>> 2.3 ceph pg deep-scrub 0.223
>>>>>>> - Log with ceph tell osd.9,17,28 injectargs "--debug_osd 5/5"
>>>> and ran the deep-scrub on 0.223 (and again nearly all of my VMs stopped working for a while).
>>>> Start @ 2016-08-02 10:02:44
>>>> End @ 2016-08-02 10:17:22
>>>> The "ceph.log": http://slexy.org/view/s2ED5LvuV2
>>>> Log file - osd.9 (/dev/sdk)
>>>> - ceph-osd.9.log: http://slexy.org/view/s21z9JmwSu
>>>> - atop log: http://slexy.org/view/s20XjFZFEL
>>>> Log file - osd.17 (/dev/sdi)
>>>> - ceph-osd.17.log: http://slexy.org/view/s202fpcZS9
>>>> - atop log: http://slexy.org/view/s2TxeR1JSz
>>>> Log file - osd.28 (/dev/sdm)
>>>> - ceph-osd.28.log: http://slexy.org/view/s2eCUyC7xV
>>>> - atop log: http://slexy.org/view/s21AfebBqK
>>>>>>> 3.1 Remove osd.28 completely from Ceph
>>>> Now osd.28 is also removed completely from Ceph - replaced with osd.23.
>>>> # ceph pg map 0.223
>>>> osdmap e9363 pg 0.223 (0.223) -> up [9,17,23] acting [9,17,23]
>>>>>>> 3.2 xfs_repair -n /dev/sdm1
>>>> As expected: xfs_repair did not find/show any errors.
>>>>>>> 3.3 ceph pg deep-scrub 0.223
>>>>>>> - Log with ceph tell osd.9,17,23 injectargs "--debug_osd 5/5"
>>>> ... again nearly all of my VMs stopped working for a while...
>>>> All "original" OSDs (4, 16, 28) that were in the acting set when I wrote my first e-mail to this mailing list are now removed. But the issue still exists with different OSDs (9, 17, 23) in the acting set, while the questionable PG 0.223 is still the same!
>>>> Suspecting that the "tunables" could be the cause, I have now changed them back to "default" via " ceph osd crush tunables default ".
>>>> This will take a while... then I will do " ceph pg deep-scrub 0.223 " again (without OSDs 4, 16, 28)...
>>> Really, I do not know what's going on here.
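[Inline note from today: each of the test runs above was triggered and watched roughly like this; the grep pattern is just what I look for in the cluster log, nothing official, and 0/5 is what I believe the default debug_osd level to be.]

# ceph tell osd.9 injectargs '--debug_osd 5/5'
  (repeated for the other OSDs in the acting set)
# ceph pg deep-scrub 0.223
# ceph -w | grep -E '0\.223|slow request'
  (to see the deep-scrub start/end messages for the PG and any slow/blocked request warnings)
# ceph tell osd.9 injectargs '--debug_osd 0/5'
  (afterwards, back to the default level)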
>>> Ceph finished its recovery to the "default" tunables, but the issue still exists! :*(
>>> The acting set has changed again:
>>> # ceph pg map 0.223
>>> osdmap e11230 pg 0.223 (0.223) -> up [9,11,20] acting [9,11,20]
>>> But when I start " ceph pg deep-scrub 0.223 ", again nearly all of my VMs stop working for a while!
>>> Does anyone have an idea where I should look to find the cause of this?
>>> It seems that every time it is the primary OSD of the acting set of PG 0.223 (*4*,16,28; *9*,17,23 or *9*,11,20) that leads to "currently waiting for subops from 9,X", and the deep-scrub always takes nearly 15 minutes to finish.
>>> My output from " ceph pg 0.223 query ":
>>> - http://slexy.org/view/s21d6qUqnV
>>> Mehmet
>>>> For the record: although nearly all disks are busy, I have no slow/blocked requests, and I have been watching the log files for nearly 20 minutes now...
>>>> Your help is really appreciated!
>>>> - Mehmet
>
>
>
>JC Lopez
>S. Technical Instructor, Global Storage Consulting Practice
>Red Hat, Inc.
>jelo...@redhat.com
>+1 408-680-6959
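PS: Tomorrow I will collect the outputs you asked for roughly like this (the file names are just my own choice for the attachments), plus the ceph.conf from the MON/OSD nodes and, if it differs, from the client/VM side:

# ceph pg 0.223 query          > pg_0.223_query.txt
# ceph -s                      > ceph_s.txt
# ceph df                      > ceph_df.txt
# ceph osd df                  > ceph_osd_df.txt
# ceph osd dump | grep pool    > ceph_osd_dump_pool.txt
# ceph osd crush rule dump     > ceph_osd_crush_rule_dump.txt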