Hey Mark / Community,

This is the sequence of changes that seems to have fixed the Ceph problem:
1# Upgrading the disk controller firmware from 6.34 to 6.64 (latest)
2# Rebooting all nodes so that the new firmware takes effect

Read and write operations are now normal, as are system load and CPU utilization.

A few rough command sketches for checking the controller firmware and cache, the NICs, scrub flags, per-pool PG counts and node load during a benchmark are appended below the quoted thread.

- Vickey -

On Wed, Sep 2, 2015 at 11:28 PM, Vickey Singh <vickey.singh22...@gmail.com> wrote:

> Thank You Mark, please see my response below.
>
> On Wed, Sep 2, 2015 at 5:23 PM, Mark Nelson <mnel...@redhat.com> wrote:
>
>> On 09/02/2015 08:51 AM, Vickey Singh wrote:
>>
>>> Hello Ceph Experts
>>>
>>> I have a strange problem: when I am reading from or writing to a Ceph
>>> pool, it is not writing properly. Please notice "cur MB/s", which keeps
>>> going up and down.
>>>
>>> --- Ceph Hammer 0.94.2
>>> --- CentOS 6, kernel 2.6
>>> --- Ceph cluster is healthy
>>
>> You might find that CentOS 7 gives you better performance. In some cases
>> we were seeing nearly 2X.
>
> Wooo 2X, I would definitely plan for an upgrade. Thanks.
>
>>> One interesting thing is that whenever I start a rados bench command for
>>> read or write, CPU idle % goes down to ~10 and system load increases
>>> like anything.
>>>
>>> Hardware
>>>
>>> HP SL4540
>>
>> Please make sure the controller is on the newest firmware. There used to
>> be a bug that would cause sequential write performance to bottleneck when
>> writeback cache was enabled on the RAID controller.
>
> Last month I upgraded the firmware for this hardware, so I hope it is up
> to date.
>
>>> 32-core CPU
>>> 196G memory
>>> 10G network
>>
>> Be sure to check the network too. We've seen a lot of cases where folks
>> have been burned by one of the NICs acting funky.
>
> At first glance the interfaces look good and they are pushing data nicely
> (whatever they are getting).
>
>>> I don't think hardware is the problem.
>>>
>>> Please give me clues / pointers on how I should troubleshoot this problem.
>>>
>>> # rados bench -p glance-test 60 write
>>>  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
>>>  Object prefix: benchmark_data_pouta-s01.pouta.csc.fi_2173350
>>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>      0      0        0        0         0        0          -         0
>>>      1     16       20        4     15.99       16    0.12308   0.10001
>>>      2     16       37       21   41.9841       68    1.79104  0.827021
>>>      3     16       68       52   69.3122      124   0.084304  0.854829
>>>      4     16      114       98   97.9746      184    0.12285  0.614507
>>>      5     16      188      172   137.568      296   0.210669  0.449784
>>>      6     16      248      232   154.634      240   0.090418  0.390647
>>>      7     16      305      289    165.11      228   0.069769  0.347957
>>>      8     16      331      315   157.471      104   0.026247    0.3345
>>>      9     16      361      345   153.306      120   0.082861  0.320711
>>>     10     16      380      364   145.575       76   0.027964  0.310004
>>>     11     16      393      377   137.067       52    3.73332  0.393318
>>>     12     16      448      432   143.971      220   0.334664  0.415606
>>>     13     16      476      460   141.508      112   0.271096  0.406574
>>>     14     16      497      481   137.399       84   0.257794  0.412006
>>>     15     16      507      491   130.906       40    1.49351  0.428057
>>>     16     16      529      513   115.042       88   0.399384   0.48009
>>>     17     16      533      517   94.6286       16    5.50641  0.507804
>>>     18     16      537      521    83.405       16    4.42682  0.549951
>>>     19     16      538      522    80.349        4    11.2052  0.570363
>>> 2015-09-02 09:26:18.398641 min lat: 0.023851 max lat: 11.2052 avg lat: 0.570363
>>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>     20     16      538      522   77.3611        0          -  0.570363
>>>     21     16      540      524   74.8825        4    8.88847  0.591767
>>>     22     16      542      526   72.5748        8    1.41627  0.593555
>>>     23     16      543      527   70.2873        4     8.0856  0.607771
>>>     24     16      555      539   69.5674       48   0.145199  0.781685
>>>     25     16      560      544   68.0177       20     1.4342  0.787017
>>>     26     16      564      548   66.4241       16   0.451905   0.78765
>>>     27     16      566      550   64.7055        8   0.611129  0.787898
>>>     28     16      570      554   63.3138       16    2.51086  0.797067
>>>     29     16      570      554   61.5549        0          -  0.797067
>>>     30     16      572      556   60.1071        4    7.71382  0.830697
>>>     31     16      577      561   59.0515       20    23.3501  0.916368
>>>     32     16      590      574   58.8705       52   0.336684  0.956958
>>>     33     16      591      575   57.4986        4    1.92811  0.958647
>>>     34     16      591      575   56.0961        0          -  0.958647
>>>     35     16      591      575   54.7603        0          -  0.958647
>>>     36     16      597      581   54.0447        8   0.187351   1.00313
>>>     37     16      625      609   52.8394      112    2.12256   1.09256
>>>     38     16      631      615    52.227       24    1.57413   1.10206
>>>     39     16      638      622   51.7232       28    4.41663   1.15086
>>> 2015-09-02 09:26:40.510623 min lat: 0.023851 max lat: 27.6704 avg lat: 1.15657
>>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>     40     16      652      636   51.8102       56   0.113345   1.15657
>>>     41     16      682      666   53.1443      120   0.041251   1.17813
>>>     42     16      685      669   52.3395       12   0.501285   1.17421
>>>     43     15      690      675   51.7955       24    2.26605   1.18357
>>>     44     16      728      712   53.6062      148   0.589826   1.17478
>>>     45     16      728      712   52.6158        0          -   1.17478
>>>     46     16      728      712   51.6613        0          -   1.17478
>>>     47     16      728      712   50.7407        0          -   1.17478
>>>     48     16      772      756   52.9332       44   0.234811    1.1946
>>>     49     16      835      819   56.3577      252    5.67087   1.12063
>>>     50     16      890      874   59.1252      220   0.230806   1.06778
>>>     51     16      896      880   58.5409       24   0.382471   1.06121
>>>     52     16      896      880   57.5832        0          -   1.06121
>>>     53     16      896      880   56.6562        0          -   1.06121
>>>     54     16      896      880   55.7587        0          -   1.06121
>>>     55     16      897      881   54.9515        1    4.88333   1.06554
>>>     56     16      897      881   54.1077        0          -   1.06554
>>>     57     16      897      881   53.2894        0          -   1.06554
>>>     58     16      897      881   51.9335        0          -   1.06554
>>>     59     16      897      881   51.1792        0          -   1.06554
>>> 2015-09-02 09:27:01.267301 min lat: 0.01405 max lat: 27.6704 avg lat: 1.06554
>>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>     60     16      897      881   50.4445        0          -   1.06554
>>>
>>>     cluster 98d89661-f616-49eb-9ccf-84d720e179c0
>>>      health HEALTH_OK
>>>      monmap e3: 3 mons at {s01=10.100.50.1:6789/0,s02=10.100.50.2:6789/0,s03=10.100.50.3:6789/0}, election epoch 666, quorum 0,1,2 s01,s02,s03
>>>      osdmap e121039: 240 osds: 240 up, 240 in
>>>       pgmap v850698: 7232 pgs, 31 pools, 439 GB data, 43090 kobjects
>>>             2635 GB used, 867 TB / 870 TB avail
>>>                 7226 active+clean
>>>                    6 active+clean+scrubbing+deep
>>
>> Note the last line there. You'll likely want to try your test again when
>> scrubbing is complete. Also, you may want to try this script:
>
> Yeah, I have tried it a few times when the cluster is perfectly healthy
> (not doing scrubbing / repairs).
>
>> https://github.com/ceph/cbt/blob/master/tools/readpgdump.py
>>
>> You can invoke it like:
>>
>> ceph pg dump | ./readpgdump.py
>>
>> That will give you a bunch of information about the pools on your
>> system. I'm a little concerned about how many PGs your glance-test pool
>> may have given your totals above.
>
> Thanks for the link, I will do that and also run rados bench against the
> other pools (where the PG count is higher).
>
> Now here are some of my observations:
>
> 1# When the cluster is not doing anything, HEALTH_OK, with no background
> scrubbing / repairing, and all system resources (CPU/MEM/NET) mostly idle,
> I start rados bench (write / rand / seq). Suddenly, after a few seconds:
>
> --- rados bench output drops from ~500 MB/s to a few tens of MB/s
> --- At the same time the CPUs are ~90% busy and system load jumps up
>
> Once rados bench completes:
>
> --- After a few minutes the system resources become idle again
>
> 2# Sometimes some PGs become unclean for a few minutes while rados bench
> runs, and then they quickly become active+clean again.
>
> I am out of clues, so any help from the community that points me in the
> right direction would be appreciated.
>
> - Vickey -
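Since the fix turned out to be the Smart Array firmware, here is roughly how the controller firmware and cache state can be verified before and after an upgrade. This is only a sketch: it assumes HP's hpssacli/ssacli utility is installed and that the controller sits in slot 0, so adjust the tool name and slot for your generation of hardware.

# controller model, firmware version, cache and battery status
ssacli ctrl all show status
ssacli ctrl slot=0 show detail | grep -iE 'firmware|cache|battery'

# per-logical-drive caching settings (the old firmware bug bit hardest with writeback cache enabled)
ssacli ctrl slot=0 logicaldrive all show detail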
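On Mark's point about a single NIC acting funky: a quick way to rule the network in or out is to check the link speed and watch the error/drop counters on every node while the benchmark runs. A small sketch, assuming the 10G interface is called eth0 (substitute the real interface name):

# negotiated speed / duplex and link state
ethtool eth0

# kernel-level RX/TX byte, error and drop counters
ip -s link show dev eth0

# driver-specific counters; steadily rising error/drop/miss counts point at a bad NIC, cable or switch port
ethtool -S eth0 | grep -iE 'err|drop|miss'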
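To make sure deep-scrub is not competing with a benchmark run, scrubbing can be switched off temporarily with the standard OSD flags and re-enabled afterwards. These are cluster-wide flags, so remember to unset them:

# stop new scrubs for the duration of the test
ceph osd set noscrub
ceph osd set nodeep-scrub

# wait until no PGs report scrubbing, then benchmark
ceph -s
rados bench -p glance-test 60 write

# re-enable scrubbing afterwards
ceph osd unset noscrub
ceph osd unset nodeep-scrub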
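On the PG-count concern for the glance-test pool, the per-pool pg_num can also be checked directly, alongside the readpgdump.py output:

# PG count of the benchmark pool
ceph osd pool get glance-test pg_num

# pg_num / pgp_num and replica size for every pool
ceph osd dump | grep '^pool'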
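Finally, for the "CPU ~90% busy" symptom, it helps to see whether that time is user, system or iowait, since heavy iowait points back at the disks/controller rather than at Ceph itself. A rough way to watch it, assuming the sysstat package (sar/iostat) is installed on the OSD nodes:

# from a client: generate load
rados bench -p glance-test 60 write

# in parallel on each OSD node: CPU breakdown every 5 seconds (watch %iowait and %system)
sar -u 5

# per-disk latency and utilisation (high await / %util on the OSD disks implicates the controller or disks)
iostat -x 5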
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com