[ceph-users] Re: benchmark Ceph

2020-09-14 Thread rainning
What is your Ceph version? From the test results you posted, your environment's performance is okay given your setup, but there are definitely many things that can be tuned to get you better numbers. I normally use top, iostat, pidstat, vmstat, dstat, iperf3, blktrace, netmon, ceph admin
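A minimal sketch of how those tools might be run alongside a benchmark; the hostnames, intervals, and device choices are placeholders, not from this thread:

    # on the OSD node: per-device utilisation and latency, refreshed every second
    iostat -x 1
    # per-process disk I/O of the OSD daemons
    pidstat -d 1
    # raw network throughput between client and OSD node
    iperf3 -s                      # run on the OSD node first
    iperf3 -c <osd-node-ip> -t 30  # run on the client; <osd-node-ip> is a placeholder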

[ceph-users] Re: benchmark Ceph

2020-09-14 Thread rainning
Can you post the fio results with the ioengine set to libaio? From what you posted, it seems to me that the read test hit the cache, and the write performance was not good: the latency was too high (~35.4ms) even though numjobs and iodepth were both 1. Did you monitor system stats on both sides (VM/Compu
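For reference, a hedged example of the kind of fio invocation being asked about; the device path, runtime, and job name are placeholders:

    # 4K random write, libaio, numjobs=1, iodepth=1, direct I/O to bypass the page cache
    fio --name=rbd-test --filename=/dev/vdb --ioengine=libaio --direct=1 \
        --rw=randwrite --bs=4k --numjobs=1 --iodepth=1 \
        --runtime=60 --time_based --group_reporting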

[ceph-users] Re: Nautilus slow using "ceph tell osd.* bench"

2020-08-06 Thread rainning
Hi Jim, when you do reweighting, rebalancing will be triggered. How did you set it back? Did you set it back immediately, or wait for the rebalancing to complete? I tried both on my cluster and couldn't see osd bench change significantly like yours did (actually no change at all); however, my cluster is 12.2.12, no
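A sketch of the sequence being discussed, with osd.0 and a temporary weight of 0.9 purely as placeholders:

    ceph osd reweight 0 0.9   # lower the weight; rebalancing/backfill starts
    ceph -s                   # either revert immediately, or watch until active+clean
    ceph osd reweight 0 1.0   # set the weight back
    ceph tell osd.0 bench     # re-run osd bench and compare with the earlier numbers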

[ceph-users] Re: Nautilus slow using "ceph tell osd.* bench"

2020-08-05 Thread rainning
Hi Jim, did you check system stats (e.g. iostat, top, etc.) on both OSDs when you ran osd bench? Those might give you some clues. Moreover, did you compare both OSDs' configurations? -- Original -- From:
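Two hedged ways the configurations of both OSDs could be compared; osd.0/osd.1 are placeholders, and the daemon commands must be run on the node hosting each OSD:

    # settings that differ from the defaults, via the admin socket
    ceph daemon osd.0 config diff
    ceph daemon osd.1 config diff
    # or dump the full running configuration of each and diff the results
    ceph daemon osd.0 config show > osd0.conf
    ceph daemon osd.1 config show > osd1.conf
    diff osd0.conf osd1.conf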

[ceph-users] Fw:Re: "ceph daemon osd.x ops" shows different number from "ceph osd status "

2020-07-20 Thread rainning
Aha, thanks very much for pointing that out, Anthony! Just a summary of the screenshot pasted in my previous email: based on my understanding, "ceph daemon osd.x ops" or "ceph daemon osd.x dump_ops_in_flight" shows the ops currently being processed in osd.x. I also noticed that there is anothe
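The two admin-socket commands mentioned above, shown as a sketch with osd.0 as a placeholder; both must be run on the host where that OSD is running:

    ceph daemon osd.0 ops                  # ops currently being processed by osd.0
    ceph daemon osd.0 dump_ops_in_flight   # equivalent view of the in-flight ops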

[ceph-users] "ceph daemon osd.x ops" shows different number from "ceph osd status "

2020-07-20 Thread rainning
"ceph daemon osd.x ops" shows ops currently in flight, the number is different from "ceph osd status

[ceph-users] high commit_latency and apply_latency

2020-07-16 Thread rainning
We have a cluster with very low load; however, "ceph osd perf" shows high commit_latency and apply_latency. root@stor-mgt01:~# ceph -s   cluster:     id: 3d1ec789-829d-4e0f-b707-9363356a68f1     health: HEALTH_WARN     application not enabled on 3 pool(s)     services:     mon: 3 d
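For context, a hedged example of the command in question; the latency columns are reported in milliseconds:

    ceph osd perf               # per-OSD commit_latency(ms) and apply_latency(ms)
    watch -n 5 'ceph osd perf'  # sample repeatedly to see whether the high values persist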

[ceph-users] Re: osd bench with or without a separate WAL device deployed

2020-07-15 Thread rainning
72360.338257,     "iops": 2773.626104 } root@stor-mgt01:~# ceph tell osd.30 bench 98304000 32768 {     "bytes_written": 98304000,     "blocksize": 32768,     "elapsed_sec": 2.908703,     "bytes_per_sec": 33796507.598640,     "iops": 1031.387561 } root@stor-mgt01:~# ceph tell osd.30 bench 49152000 16384 {     "bytes_written": 49152000,     "blocksize": 16384,     "elapsed_sec": 3.907744,     "bytes_per_sec": 12578102.861185,     "iops": 767.706473 } -- Original -- From: "rainning"

[ceph-users] Re: osd bench with or without a separate WAL device deployed

2020-07-15 Thread rainning
Hi Zhenshi, I did try with a bigger block size. Interestingly, the OSD whose 4KB osd bench was lower performed slightly better in the 4MB osd bench. Let me try some other bigger block sizes, e.g. 16K, 64K, 128K, 1M, etc., to see if there is any pattern (see the sketch below). Moreover, I did compare the two SSDs; they respec
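A sketch of that block-size sweep, using the same total-bytes/block-size argument order, OSD id, and total size as the outputs earlier in the thread; the loop itself is only illustrative:

    # ceph tell osd.N bench <total_bytes> <block_size>
    for bs in 16384 65536 131072 1048576; do
        ceph tell osd.30 bench 98304000 $bs
    done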

[ceph-users] Re: osd bench with or without a separate WAL device deployed

2020-07-15 Thread rainning
One more thing: it seems that the WAL does have more impact on small writes.  ---Original--- From: "Zhenshi Zhou"

[ceph-users] Re: osd bench with or without a separate WAL device deployed

2020-07-15 Thread rainning
Hi Zhenshi, thanks very much for the reply. Yes, I know it is odd that BlueStore is deployed with only a separate db device but not a WAL device. The cluster was deployed in k8s using Rook, and I was told it was because the Rook version we used didn't support that. Moreover, the comparison was made on
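Outside of Rook, a separate WAL device can be requested at OSD creation time; a hedged ceph-volume sketch with placeholder device paths:

    # data on HDD, RocksDB on one fast partition, WAL on another (all paths are placeholders)
    ceph-volume lvm create --bluestore --data /dev/sdb \
        --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2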

[ceph-users] osd bench with or without a separate WAL device deployed

2020-07-15 Thread rainning
Hi all, I am wondering whether any performance comparison has been done on osd bench with and without a separate WAL device deployed, given that a separate db device is deployed on SSD in both cases. The reason I am asking is that we have two clusters, and the OSDs in one hav
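One way to confirm how each cluster's OSDs are laid out is the OSD metadata, which records whether BlueFS has dedicated db/WAL devices; osd.0 is a placeholder:

    ceph osd metadata 0 | grep -E 'bluefs_dedicated_(db|wal)'
    #   "bluefs_dedicated_db": "1"   -> separate db device present
    #   "bluefs_dedicated_wal": "0"  -> no separate WAL device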