Hello guys,

We have a fresh Luminous cluster, version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), installed using ceph-ansible.
The cluster has 6 nodes (Intel server board S2600WTTR), each with 64G of memory and an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (32 cores). Each server holds 16 * 1.6TB Dell SSD drives (SSDSC2BB016T7R), for a total of 96 OSDs and 3 mons. The main usage is RBDs for our OpenStack environment (Ocata).

We're at the beginning of our production tests, and the OSDs already look too busy even though we generate almost no iops at this stage. Every ceph-osd process is using roughly 35-50% CPU and I can't figure out why they are so busy:

top - 07:41:55 up 49 days,  2:54,  2 users,  load average: 6.85, 6.40, 6.37
Tasks: 518 total,   1 running, 517 sleeping,   0 stopped,   0 zombie
%Cpu(s): 14.8 us,  4.3 sy,  0.0 ni, 80.3 id,  0.0 wa,  0.0 hi,  0.6 si,  0.0 st
KiB Mem : 65853584 total, 23953788 free, 40342680 used,  1557116 buff/cache
KiB Swap:  3997692 total,  3997692 free,        0 used. 18020584 avail Mem

  PID USER  PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
36713 ceph  20   0 3869588 2.826g  28896 S  47.2  4.5   6079:20 ceph-osd
53981 ceph  20   0 3998732 2.666g  28628 S  45.8  4.2   5939:28 ceph-osd
55879 ceph  20   0 3707004 2.286g  28844 S  44.2  3.6   5854:29 ceph-osd
46026 ceph  20   0 3631136 1.930g  29100 S  43.2  3.1   6008:50 ceph-osd
39021 ceph  20   0 4091452 2.698g  28936 S  42.9  4.3   5687:39 ceph-osd
47210 ceph  20   0 3598572 1.871g  29092 S  42.9  3.0   5759:19 ceph-osd
52763 ceph  20   0 3843216 2.410g  28896 S  42.2  3.8   5540:11 ceph-osd
49317 ceph  20   0 3794760 2.142g  28932 S  41.5  3.4   5872:24 ceph-osd
42653 ceph  20   0 3915476 2.489g  28840 S  41.2  4.0   5605:13 ceph-osd
41560 ceph  20   0 3460900 1.801g  28660 S  38.5  2.9   5128:01 ceph-osd
50675 ceph  20   0 3590288 1.827g  28840 S  37.9  2.9   5196:58 ceph-osd
37897 ceph  20   0 4034180 2.814g  29000 S  34.9  4.5   4789:10 ceph-osd
50237 ceph  20   0 3379780 1.930g  28892 S  34.6  3.1   4846:36 ceph-osd
48608 ceph  20   0 3893684 2.721g  28880 S  33.9  4.3   4752:43 ceph-osd
40323 ceph  20   0 4227864 2.959g  28800 S  33.6  4.7   4712:36 ceph-osd
44638 ceph  20   0 3656780 2.437g  28896 S  33.2  3.9   4793:58 ceph-osd
61639 ceph  20   0  527512 114300  20988 S   2.7  0.2   2722:03 ceph-mgr
31586 ceph  20   0  765672 304140  21816 S   0.7  0.5 409:06.09 ceph-mon
   68 root  20   0       0      0      0 S   0.3  0.0   3:09.69 ksoftirqd/12

strace doesn't show anything suspicious:

root@ecprdbcph10-opens:~# strace -p 36713
strace: Process 36713 attached
futex(0x563343c56764, FUTEX_WAIT_PRIVATE, 1, NULL

Ceph logs don't reveal anything either. Is this "normal" behavior in Luminous? Looking through older threads, the only related one I could find is about time gaps, which is not our case.
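One thing I realize about the strace above: ceph-osd is heavily multithreaded, and strace -p attaches only to the main thread, which just parks on that futex, so the worker threads actually burning the CPU are never traced. As a next step I intend to follow all threads and to profile the process directly; a rough sketch of what I have in mind (PID 36713 is just the busiest OSD from the top output above):

    # follow all threads and print a syscall time/count summary on Ctrl-C
    strace -f -c -p 36713

    # sample where the CPU time is actually being spent (needs the perf tool)
    perf top -p 36713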
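The OSD admin sockets should also show whether the daemons are doing real work or just spinning. A few invocations I plan to run on the OSD hosts (osd.0 is only a placeholder here; substitute any of the busy OSD ids):

    # internal op counters and latencies for one OSD
    ceph daemon osd.0 perf dump

    # the most recent expensive operations that OSD has seen
    ceph daemon osd.0 dump_historic_ops

    # commit/apply latency for every OSD in the cluster
    ceph osd perf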
Thanks,
Alon

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com