I generally do a 1M sequential write to fill up the device. The block size doesn't matter for correctness here, but a bigger block size fills the device faster, which is why people use it.
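
(For reference, a minimal preconditioning job along those lines; this is only a sketch, and the section name and /dev/rbd0 path are placeholders, not something from this thread.)

[precondition]
; fill the whole device once with 1M sequential writes, bypassing the page cache
filename=/dev/rbd0
rw=write
bs=1M
direct=1
size=100%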

From: V Plus [mailto:v.plussh...@gmail.com]
Sent: Sunday, December 11, 2016 7:03 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph performance is too good (impossible..)...

Thanks!

One more question: what do you mean by "bigger"?
Do you mean a bigger block size (say, if I run a read test with bs=4K, do I need to first write the rbd with bs>4K)? Or a size that is big enough to cover the area where the test will be executed?


On Sun, Dec 11, 2016 at 9:54 PM, Somnath Roy <somnath....@sandisk.com> wrote:
A block needs to be written before it is read, otherwise you will get funny results. For example, in the case of flash (depending on how the FW is implemented), it will mostly return 0 if a block has not been written. I have seen some flash FW that is really inefficient at manufacturing this data (say, zeros) for unwritten blocks, and some that is really fast.
So, to get predictable results you should always be reading blocks that have been written. If, say, half of a device has been written and you are doing full-device random reads, you will get unpredictable/spiky/imbalanced results.
The same applies to rbd; consider it a storage device and the behavior will be similar. So, it is always recommended to precondition (fill up) an rbd image with a big-block sequential write before you run any synthetic test on it. For the filestore backend, the added advantage of preconditioning the rbd is that the files in the filesystem will be created beforehand.
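
(As an aside, not from the thread: one way to see how much of an image has actually been allocated, and therefore whether it has been preconditioned, is rbd du; the pool/image name below is a placeholder.)

rbd du rbd/image1    # USED far below PROVISIONED means most blocks were never written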

Thanks & Regards
Somnath

From: V Plus [mailto:v.plussh...@gmail.com]
Sent: Sunday, December 11, 2016 6:01 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph performance is too good (impossible..)...

Thanks Somnath!
As you recommended, I executed:
dd if=/dev/zero bs=1M count=4096 of=/dev/rbd0
dd if=/dev/zero bs=1M count=4096 of=/dev/rbd1
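
(A variant of those commands, shown here only as a sketch: without a count, dd writes until the block device is full, and oflag=direct keeps the fill itself out of the host page cache.)

sudo dd if=/dev/zero of=/dev/rbd0 bs=1M oflag=direct
sudo dd if=/dev/zero of=/dev/rbd1 bs=1M oflag=direct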

Then the output results look more reasonable!
Could you tell me why?

Btw, the purpose of my run is to test the performance of rbd in Ceph. Does my case mean that before every test, I have to "initialize" all the images?

Great thanks!!

On Sun, Dec 11, 2016 at 8:47 PM, Somnath Roy <somnath....@sandisk.com> wrote:
Fill up the image with big writes (say, 1M) before reading and you should see sane throughput.

Thanks & Regards
Somnath
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of V Plus
Sent: Sunday, December 11, 2016 5:44 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph performance is too good (impossible..)...

Hi Guys,
We have a Ceph cluster with 6 machines (6 OSDs per host).
1. I created 2 images in Ceph and mapped them to another host A (outside the Ceph cluster). On host A, I got /dev/rbd0 and /dev/rbd1.
2. I started two fio jobs to perform a READ test on rbd0 and rbd1 (the fio job descriptions can be found below):
"sudo fio fioA.job -output a.txt & sudo fio fioB.job -output b.txt & wait"
3. After the test, in a.txt we got bw=1162.7MB/s, and in b.txt we got bw=3579.6MB/s.
The results do NOT make sense because there is only one NIC on host A, and its 
limit is 10 Gbps (1.25GB/s).

I suspect it is because of caching.
But I am sure that in /etc/ceph/ceph.conf on host A, I already added:
[client]
rbd cache = false
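
(Sketch only, not from the thread: a common extra guard against host-side caching is to drop the kernel page cache on host A before each run; the fio jobs' direct=1 should already bypass it.)

sync
echo 3 | sudo tee /proc/sys/vm/drop_caches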

Could anyone give me a hint about what is missing, and why?
Thank you very much.

fioA.job:
[A]
direct=1
group_reporting=1
unified_rw_reporting=1
size=100%
time_based=1
filename=/dev/rbd0
rw=read
bs=4MB
numjobs=16
ramp_time=10
runtime=20

fioB.job:
[B]
direct=1
group_reporting=1
unified_rw_reporting=1
size=100%
time_based=1
filename=/dev/rbd1
rw=read
bs=4MB
numjobs=16
ramp_time=10
runtime=20
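
(For reference, a sketch of how the two jobs could be combined into one fio job file, so a single invocation drives both devices and, with group_reporting, their bandwidth is summed in one report; the section names are arbitrary.)

[global]
; shared settings for both devices
direct=1
unified_rw_reporting=1
group_reporting=1
size=100%
time_based=1
rw=read
bs=4M
numjobs=16
ramp_time=10
runtime=20

[A]
filename=/dev/rbd0

[B]
filename=/dev/rbd1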

Thanks...