Hi,

Are you sure all OSDs have been updated to 0.94.7? Those messages should
only be printed by 0.94.6 OSDs trying to handle messages from a 0.94.7
ceph-mon.
Also, see the thread about the 0.94.7 release -- I mentioned a workaround
there.
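
One rough way to check for stragglers is to tally the versions the OSDs report. The sketch below assumes the output of `ceph tell 'osd.*' version` contains strings of the form "ceph version X.Y.Z (...)"; the helper name `summarize_versions` is made up for illustration.

```shell
#!/bin/sh
# Count how many daemons report each Ceph version.
# Reads text on stdin that contains strings such as
# "ceph version 0.94.7 (...)" and prints "<version> <count>" per line.
# summarize_versions is a hypothetical helper, not part of Ceph.
summarize_versions() {
    grep -o 'ceph version [0-9][0-9.]*' |
        awk '{ n[$3]++ } END { for (v in n) print v, n[v] }' |
        sort
}

# Intended use on a node with an admin keyring (not executed here):
#   ceph tell 'osd.*' version | summarize_versions
```

If more than one version shows up in the result, the OSDs still on the older one would be the first place to look.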

--
Dan

On Thu, Jun 2, 2016 at 11:29 AM, Romero Junior <r.jun...@global.leaseweb.com> wrote:

> Guys,
>
> After the update to 0.94.7 (from 0.94.6), every time I replace a broken OSD
> (1 out of 300) I get flooded by "[WRN] failed to encode map eXXX with
> expected crc", and the number of blocked requests (> 32 secs) increases
> drastically, consequently killing all radosgw sessions.
>
> Nothing changed in our cluster except the version update. Before that we
> never had any issues like this; the cluster was able to handle disk
> replacements quite well.
>
> The procedure used for the OSD replacement is the following:
>
> Removing the dead disk:
>
> ceph osd out <id>
> ceph osd crush remove osd.<id>
> --> here the problem starts
> ceph osd rm <id>
> ceph auth del osd.<id>
>
> Adding a new OSD:
>
> ceph-deploy disk zap <node>:/dev/<disk>
> ceph-deploy --overwrite-conf osd prepare <node>:<disk>:/dev/<journal partition>
> ceph-deploy --overwrite-conf osd activate <node>:<disk>:/dev/<journal partition>
>
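
The removal steps quoted above can be wrapped in a small dry-run helper so the sequence can be reviewed before anything touches the cluster. This is only a sketch: `remove_osd` and the `RUN` switch are hypothetical names, and the commands are exactly the four from the quoted procedure.

```shell
#!/bin/sh
# Dry-run sketch of the quoted OSD removal steps.
# RUN is a hypothetical switch: when unset, each command is only echoed;
# set RUN to an empty string (RUN= remove_osd 12) to really execute them.
remove_osd() {
    id=$1
    run=${RUN-echo}
    $run ceph osd out "$id" &&
    $run ceph osd crush remove "osd.$id" &&   # the warnings reportedly start here
    $run ceph osd rm "$id" &&
    $run ceph auth del "osd.$id"
}
```

Calling `remove_osd 12` with `RUN` unset prints the four commands for osd.12 without running them.
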
>
> Warning flood messages:
>
>     cluster xxx
>      health HEALTH_WARN
>             97 pgs backfill
>             12 pgs backfilling
>             3 pgs peering
>             2 pgs stuck inactive
>             112 pgs stuck unclean
>             242 requests are blocked > 32 sec
>             recovery 111320/18148458 objects misplaced (0.613%)
>      monmap e1: 3 mons at {mon001=xxx:6789/0,mon002=xxx:6789/0,mon003=xxx:6789/0}
>             election epoch 526, quorum 0,1,2 mon001,mon002,mon003
>      osdmap e134086: 296 osds: 296 up, 296 in; 108 remapped pgs
>       pgmap v12721457: 18368 pgs, 15 pools, 17163 GB data, 5889 kobjects
>             55811 GB used, 397 TB / 451 TB avail
>             111320/18148458 objects misplaced (0.613%)
>                18254 active+clean
>                   97 active+remapped+wait_backfill
>                   12 active+remapped+backfilling
>                    3 peering
>                    1 active+clean+scrubbing+deep
>                    1 active+clean+scrubbing
> recovery io 42311 kB/s, 15 objects/s
>   client io 5205 B/s rd, 6 op/s
>
> 2016-06-02 11:22:56.319615 osd.43 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.320236 osd.21 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.320862 osd.60 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.322256 osd.21 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.322833 osd.60 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.324521 osd.21 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.324533 osd.60 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.326382 osd.21 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.326716 osd.60 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.328460 osd.60 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.328500 osd.21 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.330503 osd.60 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.330517 osd.43 [WRN] failed to encode map e134066 with expected crc
> 2016-06-02 11:22:56.330671 osd.21 [WRN] failed to encode map e134066 with expected crc
>
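
In the excerpt above only osd.21, osd.43 and osd.60 appear, so it may help to tally which OSDs emit the warning. A minimal sketch, assuming each log entry is on one line in the "date time osd.N [WRN] ..." layout shown above; `count_crc_warnings` is a made-up helper name.

```shell
#!/bin/sh
# Tally "failed to encode map" warnings per OSD.
# Reads cluster log lines on stdin; the third whitespace-separated field
# is assumed to be the daemon name (e.g. osd.21).
# Prints "<osd> <count>" per line, sorted by name.
count_crc_warnings() {
    awk '/\[WRN\] failed to encode map/ { n[$3]++ }
         END { for (o in n) print o, n[o] }' |
        sort
}
```

Feeding it a saved copy of the warning flood gives a short list of OSDs whose versions are worth checking first.
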
>
>
>
> Kind regards,
>
> Romero Junior
> DevOps Infra Engineer
> LeaseWeb Global Services B.V.
>
> T: +31 20 316 0230
> M: +31 6 2115 9310
> E: r.jun...@global.leaseweb.com
> W: www.leaseweb.com
> Luttenbergweg 8,  1101 EC Amsterdam,  Netherlands
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>