Hmm, that is strange. If you have already cleared the caches on both the
client and the OSD nodes, the data should be coming straight from the
disks. Let's wait for others' ideas.
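
One quick thing to check on the client while the read test is running is
whether the page cache is growing again; a minimal sketch (the interval and
fields are only examples):

    # watch the 'cache' column (in MB) during the fio run
    vmstat -S M 1
    # or sample the page cache counters directly
    grep -E '^(Buffers|Cached)' /proc/meminfo

If the cache column climbs during the run, the reads are going through the
client page cache (buffered I/O), so repeated blocks can be served from
memory instead of the OSDs.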

2015-02-03 11:44 GMT+08:00 Bruce McFarland <bruce.mcfarl...@taec.toshiba.com>:
> Yes, I'm using the kernel rbd in Ubuntu 14.04, which makes calls into
> libceph.
>
> root@essperf3:/etc/ceph# lsmod | grep rbd
> rbd                    63707  1
> libceph               225026  1 rbd
> root@essperf3:/etc/ceph#
>
> I'm doing raw device IO with either fio or vdbench (preferred tool) and
> there is no filesystem on top of /dev/rbd1. Yes, I did invalidate the
> kernel pages by writing to drop_caches, and I've also allocated huge pages
> at the maximum allowable based on free memory. The huge page allocation
> should minimize any system caching. I have a relatively small storage pool
> since this is a development environment: there is only ~4TB total and the
> rbd image is 3TB. On my lab system with 320TB I don't see this problem,
> since the data set is orders of magnitude larger than the available system
> cache.
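>
> A minimal sketch of the kind of 4K random-read job I mean (these are
> illustrative parameters, not my exact job file):
>
>     [rbd-4k-randread]
>     filename=/dev/rbd1
>     rw=randread
>     bs=4k
>     ; direct=1 uses O_DIRECT so the client page cache is bypassed
>     direct=1
>     ioengine=libaio
>     iodepth=32
>     runtime=300
>     time_based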
>
> Maybe I should try testing after removing DIMMs from the client system to
> physically limit what the kernel can cache.
>
> -----Original Message-----
> From: Nicheal [mailto:zay11...@gmail.com]
> Sent: Monday, February 02, 2015 7:35 PM
> To: Bruce McFarland
> Cc: ceph-us...@ceph.com; Prashanth Nednoor
> Subject: Re: [ceph-users] RBD caching on 4K reads???
>
> It seems you are using the kernel rbd, so rbd_cache does not apply; it is
> only designed for librbd. The kernel rbd client goes directly through the
> system page cache. You said that you have already run something like
> echo 3 > /proc/sys/vm/drop_caches to invalidate all pages cached in the
> kernel. Do you test /dev/rbd1 through a filesystem such as ext4 or xfs?
> If so, and you run a tool like fio first with a write test and file_size =
> 10G, then fio creates a 10G file, but one that can contain lots of holes.
> Your read test may then read those holes, and the filesystem can tell that
> they contain nothing, so there is no need to access the physical disk to
> get the data. You can check the fiemap of the file to see whether it
> contains holes, or just remove the file and let it be recreated for the
> read test.
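>
> For example, something like this would show whether the test file is
> sparse (the path and file name are only examples):
>
>     # print the extent map; holes show up as gaps between extents
>     filefrag -v /mnt/test/fio-test.0.0
>     # or compare allocated size vs. apparent size
>     du -h /mnt/test/fio-test.0.0
>     du -h --apparent-size /mnt/test/fio-test.0.0
>
> If the allocated size is much smaller than the apparent size, the file is
> mostly holes and the reads never touch the OSDs.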
>
> Ning Yao
>
> 2015-01-31 4:51 GMT+08:00 Bruce McFarland <bruce.mcfarl...@taec.toshiba.com>:
>> I have a cluster and have created a rbd device - /dev/rbd1. It shows
>> up as expected with ‘rbd --image test info’ and rbd showmapped. I have
>> been looking at cluster performance with the usual Linux block device
>> tools – fio and vdbench. When I look at writes and large block
>> sequential reads I’m seeing what I’d expect with performance limited
>> by either my cluster interconnect bandwidth or the backend device
>> throughput speeds – 1 GE frontend and cluster network and 7200rpm SATA
>> OSDs with 1 SSD/osd for journal. Everything looks good EXCEPT 4K
>> random reads. There is caching occurring somewhere in my system that I 
>> haven’t been able to detect and suppress - yet.
>>
>>
>>
>> I’ve set ‘rbd_cache=false’ in the [client] section of ceph.conf on the
>> client, monitor, and storage nodes. I’ve flushed the system caches on the
>> client and storage nodes before each test run (vm.drop_caches=3) and set
>> the huge pages to the maximum available to consume free system memory so
>> that it can’t be used for system cache. I’ve also disabled read-ahead on
>> all of the HDD/OSDs.
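>>
>> Roughly, the settings I mean are along these lines (device names are
>> placeholders, not my exact commands):
>>
>>     # ceph.conf on client, monitor, and storage nodes
>>     [client]
>>         rbd cache = false
>>
>>     # on client and storage nodes before each run
>>     sync; echo 3 > /proc/sys/vm/drop_caches
>>
>>     # per OSD data disk, disable read-ahead (sdX is a placeholder)
>>     echo 0 > /sys/block/sdX/queue/read_ahead_kb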
>>
>>
>>
>> When I run a 4K random read workload on the client, the most I could
>> expect would be ~100 iops/osd x the number of OSDs. I’m seeing an order
>> of magnitude more than that, AND iostat on the storage nodes shows no
>> read activity on the OSD disks.
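>>
>> As a concrete example (the OSD count here is only illustrative): with 10
>> OSDs at ~100 iops each, the ceiling would be ~1,000 4K random read iops,
>> so an order of magnitude more would be ~10,000 iops coming from somewhere
>> other than the spindles.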
>>
>>
>>
>> Any ideas on what I’ve overlooked? There appears to be some read-ahead
>> caching that I’ve missed.
>>
>>
>>
>> Thanks,
>>
>> Bruce
>>
>>