I have restarted each host using init scripts. Is there another way?
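
Concretely, on each host I ran something along these lines (the daemon ids
are examples; the exact service names depend on the distro and the Ceph
release):

    # sysvinit script, restarting the local Ceph daemons on this host
    service ceph restart mon
    service ceph restart osd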

2016-03-03 21:51 GMT+01:00 Dimitar Boichev <dimitar.boic...@axsmarine.com>:

> But the whole cluster, or what?
>
> Regards.
>
> *Dimitar Boichev*
> SysAdmin Team Lead
> AXSMarine Sofia
> Phone: +359 889 22 55 42
> Skype: dimitar.boichev.axsmarine
> E-mail: dimitar.boic...@axsmarine.com
>
> On Mar 3, 2016, at 22:47, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> I used the init scripts to restart.
>
> *From: *Dimitar Boichev
> *Sent: *Thursday, 3 March 2016 21:44
> *To: *Mario Giammarco
> *Cc: *Oliver Dzombic; ceph-users@lists.ceph.com
> *Subject: *Re: [ceph-users] Help: pool not responding
>
> I see a lot of people (including myself) ending up with PGs that are stuck
> in the “creating” state when you force create them.
>
> How did you restart Ceph?
> Mine were created fine after I restarted the monitor nodes after a minor
> version upgrade.
> Did you do it monitors first, OSDs second, and so on?
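>
> A rough sketch of the order I mean (the sysvinit service names are
> examples and depend on your distro and Ceph release):
>
>     # 1. restart the monitor on each monitor node, one node at a time
>     service ceph restart mon
>     # 2. then restart the OSDs on each OSD node
>     service ceph restart osd
>     # 3. finally check whether the force-created PGs leave "creating"
>     ceph -s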
>
> Regards.
>
>
> On Mar 3, 2016, at 13:13, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> I have tried "force create". It says "creating", but in the end the problem
> persists.
> I have restarted Ceph as usual.
> I am evaluating Ceph and I am shocked, because it seemed a very robust
> filesystem, and now, because of a glitch, I have an entire pool blocked and
> there is no simple procedure to force a recovery.
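>
> For reference, what I ran was along these lines (the PG id 2.3f here is
> only an example, not one of my real PGs):
>
>     # force re-creation of an incomplete PG
>     ceph pg force_create_pg 2.3f
>     # inspect why the PG still does not become active
>     ceph pg 2.3f query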
>
> 2016-03-02 18:31 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de>:
>
>> Hi,
>>
>> I could also not find a delete command, only a create.
>>
>> I found this here; it's basically your situation:
>>
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/032412.html
>>
>> --
>> Mit freundlichen Gruessen / Best regards
>>
>> Oliver Dzombic
>> IP-Interactive
>>
>> mailto:i...@ip-interactive.de
>>
>> Anschrift:
>>
>> IP Interactive UG ( haftungsbeschraenkt )
>> Zum Sonnenberg 1-3
>> 63571 Gelnhausen
>>
>> HRB 93402 beim Amtsgericht Hanau
>> Geschäftsführung: Oliver Dzombic
>>
>> Steuer Nr.: 35 236 3622 1
>> UST ID: DE274086107
>>
>>
>> On 02.03.2016 at 18:28, Mario Giammarco wrote:
>> > Thanks for the info, even if it is bad news.
>> > Anyway, I am reading the docs again and I do not see a way to delete PGs.
>> > How can I remove them?
>> > Thanks,
>> > Mario
>> >
>> > 2016-03-02 17:59 GMT+01:00 Oliver Dzombic <i...@ip-interactive.de>:
>> >
>> >     Hi,
>> >
>> >     As I see your situation, somehow these 4 PGs got lost.
>> >
>> >     They will not recover, because they are incomplete, so there is no
>> >     data from which they could be recovered.
>> >
>> >     So all that is left is to delete these PGs.
>> >
>> >     Since all 3 OSDs are in and up, it does not seem like you can somehow
>> >     regain access to these lost PGs.
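>> >
>> >     If it really comes to deleting them, the usual tool is
>> >     ceph-objectstore-tool, run while the OSD is stopped. A rough sketch
>> >     (the OSD number, paths and PG id are only examples, and removing a
>> >     PG permanently discards its data, so keep the export):
>> >
>> >         # stop the OSD that holds a copy of the PG
>> >         service ceph stop osd.0
>> >         # keep a backup copy of the PG before removing it
>> >         ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
>> >             --journal-path /var/lib/ceph/osd/ceph-0/journal \
>> >             --pgid 2.3f --op export --file /root/pg-2.3f.export
>> >         # then remove the broken PG copy from that OSD
>> >         ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
>> >             --journal-path /var/lib/ceph/osd/ceph-0/journal \
>> >             --pgid 2.3f --op remove
>> >         service ceph start osd.0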
>> >
>> >
>> >     On 02.03.2016 at 17:45, Mario Giammarco wrote:
>> >     >
>> >     >
>> >     > Here it is:
>> >     >
>> >     >  cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
>> >     >      health HEALTH_WARN
>> >     >             4 pgs incomplete
>> >     >             4 pgs stuck inactive
>> >     >             4 pgs stuck unclean
>> >     >             1 requests are blocked > 32 sec
>> >     >      monmap e8: 3 mons at
>> >     > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
>> >     >             election epoch 840, quorum 0,1,2 0,1,2
>> >     >      osdmap e2405: 3 osds: 3 up, 3 in
>> >     >       pgmap v5904430: 288 pgs, 4 pools, 391 GB data, 100 kobjects
>> >     >             1090 GB used, 4481 GB / 5571 GB avail
>> >     >                  284 active+clean
>> >     >                    4 incomplete
>> >     >   client io 4008 B/s rd, 446 kB/s wr, 23 op/s
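>> >     >
>> >     > To see exactly which PGs those are and which OSDs they map to, I am
>> >     > using commands along these lines:
>> >     >
>> >     >     # list the incomplete PGs and the OSDs they were last acting on
>> >     >     ceph health detail
>> >     >     ceph pg dump_stuck inactive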
>> >     >
>> >     >
>> >     > 2016-03-02 9:31 GMT+01:00 Shinobu Kinjo <ski...@redhat.com>:
>> >     >
>> >     >     Is "ceph -s" still showing you same output?
>> >     >
>> >     >     >     cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
>> >     >     >      health HEALTH_WARN
>> >     >     >             4 pgs incomplete
>> >     >     >             4 pgs stuck inactive
>> >     >     >             4 pgs stuck unclean
>> >     >     >      monmap e8: 3 mons at
>> >     >     > {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
>> >     >     >             election epoch 832, quorum 0,1,2 0,1,2
>> >     >     >      osdmap e2400: 3 osds: 3 up, 3 in
>> >     >     >       pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100 kobjects
>> >     >     >             1090 GB used, 4481 GB / 5571 GB avail
>> >     >     >                  284 active+clean
>> >     >     >                    4 incomplete
>> >     >
>> >     >     Cheers,
>> >     >     S
>> >     >
>> >     >     ----- Original Message -----
>> >     >     From: "Mario Giammarco" <mgiamma...@gmail.com>
>> >     >     To: "Lionel Bouton" <lionel-subscript...@bouton.name>
>> >     >     Cc: "Shinobu Kinjo" <ski...@redhat.com>, ceph-users@lists.ceph.com
>> >     >     Sent: Wednesday, March 2, 2016 4:27:15 PM
>> >     >     Subject: Re: [ceph-users] Help: pool not responding
>> >     >
>> >     >     I tried to set min_size=1, but unfortunately nothing has changed.
>> >     >     Thanks for the idea.
>> >     >
>> >     >     2016-02-29 22:56 GMT+01:00 Lionel Bouton
>> >     >     <lionel-subscript...@bouton.name>:
>> >     >
>> >     >     > On 29/02/2016 at 22:50, Shinobu Kinjo wrote:
>> >     >     >
>> >     >     > the fact that they are optimized for benchmarks and certainly
>> >     >     > not Ceph OSD usage patterns (with or without internal journal).
>> >     >     >
>> >     >     > Are you assuming that SSHD is causing the issue?
>> >     >     > If you could elaborate on this more, it would be helpful.
>> >     >     >
>> >     >     >
>> >     >     > Probably not (unless they reveal themselves extremely unreliable
>> >     >     > with Ceph OSD usage patterns, which would be surprising to me).
>> >     >     >
>> >     >     > For incomplete PGs the documentation seems good enough for what
>> >     >     > should be done:
>> >     >     >
>> >     >     > http://docs.ceph.com/docs/master/rados/operations/pg-states/
>> >     >     >
>> >     >     > The relevant text:
>> >     >     >
>> >     >     > *Incomplete* Ceph detects that a placement group is missing
>> >     >     > information about writes that may have occurred, or does not
>> >     >     > have any healthy copies. If you see this state, try to start
>> >     >     > any failed OSDs that may contain the needed information or
>> >     >     > temporarily adjust min_size to allow recovery.
>> >     >     >
>> >     >     > We don't have the full history, but the most probable cause of
>> >     >     > these incomplete PGs is that min_size is set to 2 or 3 and at
>> >     >     > some point the 4 incomplete PGs didn't have as many replicas as
>> >     >     > the min_size value. So if setting min_size to 2 isn't enough,
>> >     >     > setting it to 1 should unfreeze them.
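>> >     >     >
>> >     >     > A minimal sketch of what I mean (the pool name "rbd" is only an
>> >     >     > example):
>> >     >     >
>> >     >     >     # check the current value
>> >     >     >     ceph osd pool get rbd min_size
>> >     >     >     # temporarily allow the PGs to go active with fewer replicas
>> >     >     >     ceph osd pool set rbd min_size 1
>> >     >     >     # restore the original value once everything is active+clean
>> >     >     >     ceph osd pool set rbd min_size 2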
>> >     >     >
>> >     >     > Lionel
>> >     >     >
>> >     >
>> >     >
>> >     >
>> >     >
>> >     >
>> >
>> >
>>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
