[ceph-users] Blocked requests problem
Hello,

I have a Ceph cluster with the specifications below:
3 x Monitor node
6 x Storage Node (6 disks per Storage Node, 6TB SATA disks, all disks have SSD journals)
Separate public and private networks. All NICs are 10Gbit/s
osd pool default size = 3
osd pool default min size = 2

Ceph version is Jewel 10.2.6.

The cluster is active and a lot of virtual machines are running on it (Linux and Windows VMs, database clusters, web servers, etc.).

During normal use, the cluster slowly went into a state of blocked requests, and the number of blocked requests keeps increasing. All OSDs seem healthy; benchmarks, iowait and network tests all succeed.

Yesterday, 08:00:
$ ceph health detail
HEALTH_WARN 3 requests are blocked > 32 sec; 3 osds have slow requests
1 ops are blocked > 134218 sec on osd.31
1 ops are blocked > 134218 sec on osd.3
1 ops are blocked > 8388.61 sec on osd.29
3 osds have slow requests

Today, 16:05:
$ ceph health detail
HEALTH_WARN 32 requests are blocked > 32 sec; 3 osds have slow requests
1 ops are blocked > 134218 sec on osd.31
1 ops are blocked > 134218 sec on osd.3
16 ops are blocked > 134218 sec on osd.29
11 ops are blocked > 67108.9 sec on osd.29
2 ops are blocked > 16777.2 sec on osd.29
1 ops are blocked > 8388.61 sec on osd.29
3 osds have slow requests

$ ceph pg dump | grep scrub
dumped all in format plain
pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
20.1e 25183 0 0 0 0 98332537930 3066 3066 active+clean+scrubbing 2017-08-21 04:55:13.354379 6930'23908781 6930:20905696 [29,31,3] 29 [29,31,3] 29 6712'22950171 2017-08-20 04:46:59.208792 6712'22950171 2017-08-20 04:46:59.208792

The active scrub does not finish (it has been running for about 24 hours), and I have not restarted any OSD in the meantime.

I'm thinking of setting the noscrub, nodeep-scrub, norebalance, nobackfill and norecover flags and then restarting OSDs 3, 29 and 31 (a command sketch follows below). Would this solve my problem? Or does anyone have another suggestion?

Thanks,
Ramazan
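This is roughly how that flag-and-restart plan could look as commands; a minimal sketch, assuming the OSDs are managed by systemd on the storage nodes (adjust to your init system):

# Pause scrubbing and data movement cluster-wide
ceph osd set noscrub
ceph osd set nodeep-scrub
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover

# Restart the affected OSD daemons on their respective storage nodes
systemctl restart ceph-osd@3
systemctl restart ceph-osd@29
systemctl restart ceph-osd@31

# Watch the cluster settle before removing the flags again
ceph -w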
Re: [ceph-users] Blocked requests problem
Hi Ranjan,

Thanks for your reply. I did set the noscrub and nodeep-scrub flags, but the active scrubbing operation still isn't completing. The scrub is stuck on the same PG (20.1e).

$ ceph pg dump | grep scrub
dumped all in format plain
pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
20.1e 25189 0 0 0 0 98359116362 3048 3048 active+clean+scrubbing 2017-08-21 04:55:13.354379 6930'2393 6930:20949058 [29,31,3] 29 [29,31,3] 29 6712'22950171 2017-08-20 04:46:59.208792 6712'22950171 2017-08-20 04:46:59.208792

$ ceph -s
    cluster
     health HEALTH_WARN
            33 requests are blocked > 32 sec
            noscrub,nodeep-scrub flag(s) set
     monmap e9: 3 mons at {ceph-mon01=**:6789/0,ceph-mon02=**:6789/0,ceph-mon03=**:6789/0}
            election epoch 84, quorum 0,1,2 ceph-mon01,ceph-mon02,ceph-mon03
     osdmap e6930: 36 osds: 36 up, 36 in
            flags noscrub,nodeep-scrub,sortbitwise,require_jewel_osds
      pgmap v17667617: 1408 pgs, 5 pools, 24779 GB data, 6494 kobjects
            70497 GB used, 127 TB / 196 TB avail
                1407 active+clean
                   1 active+clean+scrubbing

Thanks,
Ramazan

> On 22 Aug 2017, at 18:52, Ranjan Ghosh wrote:
>
> Hi Ramazan,
>
> I'm no Ceph expert, but what I can say from my experience using Ceph is:
>
> 1) During "scrubbing", Ceph can be extremely slow. This is probably where your "blocked requests" are coming from. BTW: Perhaps you can even find out which processes are currently blocking with: ps aux | grep "D". You might even want to kill some of those and/or shut down services in order to relieve some stress from the machine until it recovers.
>
> 2) I usually have the following in my ceph.conf. This lets the scrubbing only run between midnight and 6 AM (hopefully the time of least demand; adjust as necessary) - and with the lowest priority.
>
> #Reduce impact of scrub.
> osd_disk_thread_ioprio_priority = 7
> osd_disk_thread_ioprio_class = "idle"
> osd_scrub_end_hour = 6
>
> 3) The scrubbing begin and end hours will always work. The low-priority mode, however, works (AFAIK!) only with the CFQ I/O scheduler. Show your current scheduler like this (replace sda with your device):
>
> cat /sys/block/sda/queue/scheduler
>
> You can also echo to this file to set a different scheduler.
>
> With these settings you can perhaps alleviate the problem to the point that the scrubbing runs over many nights until it finishes. Again, AFAIK, it doesn't have to finish in one night. It will continue the next night and so on.
>
> The Ceph experts say scrubbing is important. I don't know why, but I just believe them. They've built this complex stuff after all :-)
>
> Thus, you can use "noscrub"/"nodeep-scrub" to quickly get a hung server back to work, but you should not let it run like this forever and a day.
>
> Hope this helps at least a bit.
>
> BR,
>
> Ranjan
>
> Am 22.08.2017 um 15:20 schrieb Ramazan Terzi:
>> Hello,
>>
>> I have a Ceph cluster with the specifications below:
>> 3 x Monitor node
>> 6 x Storage Node (6 disks per Storage Node, 6TB SATA disks, all disks have SSD journals)
>> Separate public and private networks. All NICs are 10Gbit/s
>> osd pool default size = 3
>> osd pool default min size = 2
>>
>> Ceph version is Jewel 10.2.6.
>>
>> The cluster is active and a lot of virtual machines are running on it (Linux and Windows VMs, database clusters, web servers, etc.).
>>
>> During normal use, the cluster slowly went into a state of blocked requests, and the number of blocked requests keeps increasing. All OSDs seem healthy; benchmarks, iowait and network tests all succeed.
>>
>> Yesterday, 08:00:
>> $ ceph health detail
>> HEALTH_WARN 3 requests are blocked > 32 sec; 3 osds have slow requests
>> 1 ops are blocked > 134218 sec on osd.31
>> 1 ops are blocked > 134218 sec on osd.3
>> 1 ops are blocked > 8388.61 sec on osd.29
>> 3 osds have slow requests
>>
>> Today, 16:05:
>> $ ceph health detail
>> HEALTH_WARN 32 requests are blocked > 32 sec; 3 osds have slow requests
>> 1 ops are blocked > 134218 sec on osd.31
>> 1 ops are blocked > 134218 sec on osd.3
>> 16 ops are blocked > 134218 sec on osd.29
>> 11 ops are blocked > 67108.9 sec on osd.29
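The runtime side of Ranjan's points 2) and 3) can also be tried without editing ceph.conf or restarting anything. A minimal sketch, assuming sda is one of the OSD data disks and that injecting the options cluster-wide is acceptable (the midnight start of the scrub window comes from osd_scrub_begin_hour, which defaults to 0):

# Show the current I/O scheduler; the idle ioprio class only takes effect under cfq
cat /sys/block/sda/queue/scheduler

# Switch to cfq if the kernel offers it (immediate, not persistent across reboots)
echo cfq > /sys/block/sda/queue/scheduler

# Apply the scrub ioprio settings at runtime on all OSDs
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'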
Re: [ceph-users] Blocked requests problem
Finally, the problem is solved. First, I set the noscrub, nodeep-scrub, norebalance, nobackfill, norecover, noup and nodown flags. Then I restarted the OSD which had the problem. When the OSD daemon started, the blocked requests increased (up to 100) and some misplaced PGs appeared. Then I unset the flags in this order: noup, nodown, norecover, nobackfill, norebalance. In a little while, all misplaced PGs were repaired. Then I unset the noscrub and nodeep-scrub flags. And finally: HEALTH_OK.

Thanks for your help,
Ramazan

> On 22 Aug 2017, at 20:46, Ranjan Ghosh wrote:
>
> Hm. That's quite weird. On our cluster, when I set "noscrub", "nodeep-scrub", scrubbing will always stop pretty quickly (a few minutes). I wonder why this doesn't happen on your cluster. When exactly did you set the flags? Perhaps it just needs some more time... Or there might be a disk problem that keeps the scrubbing from ever finishing. Perhaps it's really a good idea, just like you proposed, to shut down the corresponding OSDs. But that's just my thoughts. Perhaps some Ceph pro can shed some light on the possible reasons why a scrubbing might get stuck and how to resolve this.
>
> Am 22.08.2017 um 18:58 schrieb Ramazan Terzi:
>> Hi Ranjan,
>>
>> Thanks for your reply. I did set the noscrub and nodeep-scrub flags, but the active scrubbing operation still isn't completing. The scrub is stuck on the same PG (20.1e).
>>
>> $ ceph pg dump | grep scrub
>> dumped all in format plain
>> pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
>> 20.1e 25189 0 0 0 0 98359116362 3048 3048 active+clean+scrubbing 2017-08-21 04:55:13.354379 6930'2393 6930:20949058 [29,31,3] 29 [29,31,3] 29 6712'22950171 2017-08-20 04:46:59.208792 6712'22950171 2017-08-20 04:46:59.208792
>>
>> $ ceph -s
>>     cluster
>>      health HEALTH_WARN
>>             33 requests are blocked > 32 sec
>>             noscrub,nodeep-scrub flag(s) set
>>      monmap e9: 3 mons at {ceph-mon01=**:6789/0,ceph-mon02=**:6789/0,ceph-mon03=**:6789/0}
>>             election epoch 84, quorum 0,1,2 ceph-mon01,ceph-mon02,ceph-mon03
>>      osdmap e6930: 36 osds: 36 up, 36 in
>>             flags noscrub,nodeep-scrub,sortbitwise,require_jewel_osds
>>       pgmap v17667617: 1408 pgs, 5 pools, 24779 GB data, 6494 kobjects
>>             70497 GB used, 127 TB / 196 TB avail
>>                 1407 active+clean
>>                    1 active+clean+scrubbing
>>
>> Thanks,
>> Ramazan
>>
>>> On 22 Aug 2017, at 18:52, Ranjan Ghosh wrote:
>>>
>>> Hi Ramazan,
>>>
>>> I'm no Ceph expert, but what I can say from my experience using Ceph is:
>>>
>>> 1) During "scrubbing", Ceph can be extremely slow. This is probably where your "blocked requests" are coming from. BTW: Perhaps you can even find out which processes are currently blocking with: ps aux | grep "D". You might even want to kill some of those and/or shut down services in order to relieve some stress from the machine until it recovers.
>>>
>>> 2) I usually have the following in my ceph.conf. This lets the scrubbing only run between midnight and 6 AM (hopefully the time of least demand; adjust as necessary) - and with the lowest priority.
>>>
>>> #Reduce impact of scrub.
>>> osd_disk_thread_ioprio_priority = 7
>>> osd_disk_thread_ioprio_class = "idle"
>>> osd_scrub_end_hour = 6
>>>
>>> 3) The scrubbing begin and end hours will always work. The low-priority mode, however, works (AFAIK!) only with the CFQ I/O scheduler. Show your current scheduler like this (replace sda with your device):
>>>
>>> cat /sys/block/sda/queue/scheduler
>>>
>>> You can also echo to this file to set a different scheduler.
>>>
>>> With these settings you can perhaps alleviate the problem to the point that the scrubbing runs over many nights until it finishes. Again, AFAIK, it doesn't have to finish in one night. It will continue the next night and so on.
>>>
>>> The Ceph experts say scrubbing is important. I don't know why, but I just
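For the record, the unset sequence described above written out as commands; this is only a sketch of what the message describes, not official guidance, and it assumes the problematic OSD has already been restarted with all of the flags in place:

# Let the restarted OSD be marked up again and allow map changes
ceph osd unset noup
ceph osd unset nodown

# Allow recovery and backfill so the misplaced PGs get repaired
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset norebalance

# Once ceph -s shows all PGs active+clean again, re-enable scrubbing
ceph osd unset nodeep-scrub
ceph osd unset noscrub

# Confirm
ceph health detail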
[ceph-users] Adding New OSD Problem
Hello,

I have a Ceph cluster with the specifications below:
3 x Monitor node
6 x Storage Node (6 disks per Storage Node, 6TB SATA disks, all disks have SSD journals)
Separate public and private networks. All NICs are 10Gbit/s
osd pool default size = 3
osd pool default min size = 2

Ceph version is Jewel 10.2.6.

Current health status:
    cluster
     health HEALTH_OK
     monmap e9: 3 mons at {ceph-mon01=xxx:6789/0,ceph-mon02=xxx:6789/0,ceph-mon03=xxx:6789/0}
            election epoch 84, quorum 0,1,2 ceph-mon01,ceph-mon02,ceph-mon03
     osdmap e1512: 36 osds: 36 up, 36 in
            flags sortbitwise,require_jewel_osds
      pgmap v7698673: 1408 pgs, 5 pools, 37365 GB data, 9436 kobjects
            83871 GB used, 114 TB / 196 TB avail
                1408 active+clean

The cluster is active and a lot of virtual machines are running on it (Linux and Windows VMs, database clusters, web servers, etc.).

When I try to add a new storage node with one disk, I run into huge problems. With the new OSD the crushmap is updated and the cluster goes into recovery mode, which is expected. But after a while some running VMs become unmanageable and the servers become unresponsive one by one. The recovery process was going to take about 20 hours, so I removed the new OSD; the recovery then completed and everything went back to normal.

Health status with the new OSD added:
    cluster
     health HEALTH_WARN
            91 pgs backfill_wait
            1 pgs backfilling
            28 pgs degraded
            28 pgs recovery_wait
            28 pgs stuck degraded
            recovery 2195/18486602 objects degraded (0.012%)
            recovery 1279784/18486602 objects misplaced (6.923%)
     monmap e9: 3 mons at {ceph-mon01=xxx:6789/0,ceph-mon02=xxx:6789/0,ceph-mon03=xxx:6789/0}
            election epoch 84, quorum 0,1,2 ceph-mon01,ceph-mon02,ceph-mon03
     osdmap e1512: 37 osds: 37 up, 37 in
            flags sortbitwise,require_jewel_osds
      pgmap v7698673: 1408 pgs, 5 pools, 37365 GB data, 9436 kobjects
            83871 GB used, 114 TB / 201 TB avail
            2195/18486602 objects degraded (0.012%)
            1279784/18486602 objects misplaced (6.923%)
                1286 active+clean
                  91 active+remapped+wait_backfill
                  28 active+recovery_wait+degraded
                   2 active+clean+scrubbing+deep
                   1 active+remapped+backfilling
recovery io 430 MB/s, 119 objects/s
  client io 36174 B/s rd, 5567 kB/s wr, 5 op/s rd, 700 op/s wr

Some Ceph config parameters:
osd_max_backfills = 1
osd_backfill_full_ratio = 0.85
osd_recovery_max_active = 3
osd_recovery_threads = 1

How can I add new OSDs safely?

Best regards,
Ramazan
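One commonly used way to keep the impact low is to throttle recovery at runtime and bring the new disk in with a small CRUSH weight that is raised step by step instead of all at once. A rough sketch under those assumptions; osd.36 and the weights are placeholders, not values taken from the cluster above:

# In ceph.conf on the new node, before creating the OSD, so it joins with weight 0:
# [osd]
# osd crush initial weight = 0

# Throttle backfill/recovery cluster-wide for the duration of the operation
ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'

# After the new OSD (assumed here to be osd.36) is up with CRUSH weight 0,
# raise its weight in small steps and let backfill finish between steps
ceph osd crush reweight osd.36 1.0
ceph osd crush reweight osd.36 2.5
ceph osd crush reweight osd.36 5.5   # roughly the full weight of a 6TB disk

The step size and pacing are a trade-off between total rebalance time and client impact; smaller steps move less data at a time but take longer overall.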