You have two crush rules? One is ssd and the other is hdd? Can you show the output of:

ceph osd dump | grep pool
ceph osd crush dump
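
In case it is useful, a quick sketch of how to tie pools to rules (the pool name "rbd" below is only a placeholder for your own pool names):

# which crush_ruleset each pool points at (jewel still calls the field crush_ruleset)
ceph osd dump | grep pool
ceph osd pool get rbd crush_ruleset

# list the configured crush rules and dump their steps,
# e.g. to see whether a rule picks root ssd or root default
ceph osd crush rule ls
ceph osd crush rule dump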
At 2017-07-28 17:47:48, "Nikola Ciprich" <nikola.cipr...@linuxbox.cz> wrote:
>
>On Fri, Jul 28, 2017 at 05:43:14PM +0800, linghucongsong wrote:
>>
>> It looks like the OSDs in your cluster are not all the same size.
>>
>> Can you show the ceph osd df output?
>
>you're right, they're not.. here's the output:
>
>[root@v1b ~]# ceph osd df tree
>ID  WEIGHT   REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
> -2  1.55995        -  1706G   883G   805G 51.78 2.55   0 root ssd
> -9  0.39999        -   393G   221G   171G 56.30 2.78   0     host v1c-ssd
> 10  0.39999  1.00000   393G   221G   171G 56.30 2.78  98         osd.10
>-10  0.59998        -   683G   275G   389G 40.39 1.99   0     host v1a-ssd
>  5  0.29999  1.00000   338G   151G   187G 44.77 2.21  65         osd.5
> 26  0.29999  1.00000   344G   124G   202G 36.07 1.78  52         osd.26
>-11  0.34000        -   338G   219G   119G 64.68 3.19   0     host v1b-ssd
> 13  0.34000  1.00000   338G   219G   119G 64.68 3.19  96         osd.13
> -7  0.21999        -   290G   166G   123G 57.43 2.83   0     host v1d-ssd
> 19  0.21999  1.00000   290G   166G   123G 57.43 2.83  73         osd.19
> -1 39.29982        - 43658G  8312G 34787G 19.04 0.94   0 root default
> -4 11.89995        - 12806G  2422G 10197G 18.92 0.93   0     host v1a
>  6  1.59999  1.00000  1833G   358G  1475G 19.53 0.96 366         osd.6
>  8  1.79999  1.00000  1833G   313G  1519G 17.11 0.84 370         osd.8
>  2  1.59999  1.00000  1833G   320G  1513G 17.46 0.86 331         osd.2
>  0  1.70000  1.00000  1804G   431G  1373G 23.90 1.18 359         osd.0
>  4  1.59999  1.00000  1833G   294G  1539G 16.07 0.79 360         osd.4
> 25  3.59999  1.00000  3667G   704G  2776G 19.22 0.95 745         osd.25
> -5 10.39995        - 10914G  2154G  8573G 19.74 0.97   0     host v1b
>  1  1.59999  1.00000  1804G   350G  1454G 19.42 0.96 409         osd.1
>  3  1.79999  1.00000  1804G   360G  1444G 19.98 0.99 412         osd.3
>  9  1.59999  1.00000  1804G   331G  1473G 18.37 0.91 363         osd.9
> 11  1.79999  1.00000  1833G   367G  1465G 20.06 0.99 415         osd.11
> 24  3.59999  1.00000  3667G   744G  2736G 20.30 1.00 834         osd.24
> -6  7.79996        -  9051G  1769G  7282G 19.54 0.96   0     host v1c
> 14  1.59999  1.00000  1804G   370G  1433G 20.54 1.01 442         osd.14
> 15  1.79999  1.00000  1833G   383G  1450G 20.92 1.03 447         osd.15
> 16  1.39999  1.00000  1804G   295G  1508G 16.38 0.81 355         osd.16
> 18  1.39999  1.00000  1804G   366G  1438G 20.29 1.00 381         osd.18
> 17  1.59999  1.00000  1804G   353G  1451G 19.57 0.97 429         osd.17
> -3  9.19997        - 10885G  1965G  8733G 18.06 0.89   0     host v1d-sata
> 12  1.39999  1.00000  1804G   348G  1455G 19.32 0.95 365         osd.12
> 20  1.39999  1.00000  1804G   335G  1468G 18.60 0.92 371         osd.20
> 21  3.59999  1.00000  3667G   695G  2785G 18.97 0.94 871         osd.21
> 22  1.39999  1.00000  1804G   281G  1522G 15.63 0.77 326         osd.22
> 23  1.39999  1.00000  1804G   303G  1500G 16.83 0.83 321         osd.23
>              TOTAL  45365G  9195G 35592G 20.27
>MIN/MAX VAR: 0.77/3.19  STDDEV: 14.69
>
>apart from replacing OSDs, how can I help it?
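
Regarding "how can I help it": one thing that is sometimes done with mixed-size OSDs like the ones under your ssd root is to lower the override weight of the fullest OSDs so data spreads more evenly. Only a sketch: the threshold 120 and the weight 0.85 below are just example values, osd.13 is picked only because it shows the highest %USE in the output above, and any reweight will of course trigger additional data movement:

# dry-run, then reweight OSDs whose utilization is above 120% of the cluster average
ceph osd test-reweight-by-utilization 120
ceph osd reweight-by-utilization 120

# or nudge a single overfull OSD down by hand (override weight range is 0.0-1.0)
ceph osd reweight 13 0.85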
>
>>
>> At 2017-07-28 17:24:29, "Nikola Ciprich" <nikola.cipr...@linuxbox.cz> wrote:
>> >I forgot to add that the OSD daemons really seem to be idle, no disk
>> >activity, no CPU usage.. it just looks to me like some kind of
>> >deadlock, as if they were waiting for each other..
>> >
>> >and so I've been trying to get the last 1.5% of misplaced / degraded PGs
>> >recovered for almost a week..
>> >
>> >On Fri, Jul 28, 2017 at 10:56:02AM +0200, Nikola Ciprich wrote:
>> >> Hi,
>> >>
>> >> I'm trying to find the reason for strange recovery issues I'm seeing on
>> >> our cluster..
>> >>
>> >> it's a mostly idle, 4-node cluster with 26 OSDs evenly distributed
>> >> across the nodes, running jewel 10.2.9.
>> >>
>> >> the problem is that after some disk replacements and data moves, recovery
>> >> is progressing extremely slowly.. PGs seem to be stuck in the
>> >> active+recovering+degraded state:
>> >>
>> >> [root@v1d ~]# ceph -s
>> >>     cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33
>> >>      health HEALTH_WARN
>> >>             159 pgs backfill_wait
>> >>             4 pgs backfilling
>> >>             259 pgs degraded
>> >>             12 pgs recovering
>> >>             113 pgs recovery_wait
>> >>             215 pgs stuck degraded
>> >>             266 pgs stuck unclean
>> >>             140 pgs stuck undersized
>> >>             151 pgs undersized
>> >>             recovery 37788/2327775 objects degraded (1.623%)
>> >>             recovery 23854/2327775 objects misplaced (1.025%)
>> >>             noout,noin flag(s) set
>> >>      monmap e21: 3 mons at {v1a=10.0.0.1:6789/0,v1b=10.0.0.2:6789/0,v1c=10.0.0.3:6789/0}
>> >>             election epoch 6160, quorum 0,1,2 v1a,v1b,v1c
>> >>       fsmap e817: 1/1/1 up {0=v1a=up:active}, 1 up:standby
>> >>      osdmap e76002: 26 osds: 26 up, 26 in; 185 remapped pgs
>> >>             flags noout,noin,sortbitwise,require_jewel_osds
>> >>       pgmap v80995844: 3200 pgs, 4 pools, 2876 GB data, 757 kobjects
>> >>             9215 GB used, 35572 GB / 45365 GB avail
>> >>             37788/2327775 objects degraded (1.623%)
>> >>             23854/2327775 objects misplaced (1.025%)
>> >>                 2912 active+clean
>> >>                  130 active+undersized+degraded+remapped+wait_backfill
>> >>                   97 active+recovery_wait+degraded
>> >>                   29 active+remapped+wait_backfill
>> >>                   12 active+recovery_wait+undersized+degraded+remapped
>> >>                    6 active+recovering+degraded
>> >>                    5 active+recovering+undersized+degraded+remapped
>> >>                    4 active+undersized+degraded+remapped+backfilling
>> >>                    4 active+recovery_wait+degraded+remapped
>> >>                    1 active+recovering+degraded+remapped
>> >>   client io 2026 B/s rd, 146 kB/s wr, 9 op/s rd, 21 op/s wr
>> >>
>> >> when I restart the affected OSDs, it bumps the recovery, but then other
>> >> PGs get stuck.. All OSDs have been restarted multiple times, none are
>> >> even close to nearfull, and I just can't find what I'm doing wrong..
>> >>
>> >> possibly related OSD options:
>> >>
>> >> osd max backfills = 4
>> >> osd recovery max active = 15
>> >> debug osd = 0/0
>> >> osd op threads = 4
>> >> osd backfill scan min = 4
>> >> osd backfill scan max = 16
>> >>
>> >> Any hints would be greatly appreciated
>> >>
>> >> thanks
>> >>
>> >> nik
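
As a side note on tracking down why individual PGs sit in recovering: querying one of the stuck PGs usually shows what it is waiting for. A sketch only; the pgid 3.1f below is a placeholder, take a real one from the dump:

# list the stuck PGs and pick one
ceph health detail
ceph pg dump_stuck unclean

# the recovery_state section near the end of the output shows what the PG is blocked on
ceph pg 3.1f query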
>
>--
>-------------------------------------
>Ing. Nikola CIPRICH
>LinuxBox.cz, s.r.o.
>28.rijna 168, 709 00 Ostrava
>
>tel.: +420 591 166 214
>fax: +420 596 621 273
>mobil: +420 777 093 799
>www.linuxbox.cz
>
>mobil servis: +420 737 238 656
>email servis: ser...@linuxbox.cz
>-------------------------------------
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com