On Wed, Feb 8, 2017 at 3:05 PM, Ahmed Khuraidah <abushi...@gmail.com> wrote:
> Hi Shinobu,
>
> I am using SUSE packages from their latest SUSE Enterprise Storage 4 and
> am following their documentation (method of deployment: ceph-deploy).
> But I was able to reproduce this issue on Ubuntu 14.04 with the Ceph
> community repositories (also latest Jewel and ceph-deploy) as well.

Community Ceph packages are running on the ubuntu box, right? If so, please
run `ceph -v` on the ubuntu box. And please also send us the same details
for the issue you hit on the suse box.

> On Wed, Feb 8, 2017 at 3:03 AM, Shinobu Kinjo <ski...@redhat.com> wrote:
>
>> Are you using opensource Ceph packages or suse ones?
>>
>> On Sat, Feb 4, 2017 at 3:54 PM, Ahmed Khuraidah <abushi...@gmail.com> wrote:
>>
>>> I have opened a ticket on http://tracker.ceph.com/:
>>>
>>> http://tracker.ceph.com/issues/18816
>>>
>>> My client and server kernels are the same, here is the info:
>>> # lsb_release -a
>>> LSB Version:    n/a
>>> Distributor ID: SUSE
>>> Description:    SUSE Linux Enterprise Server 12 SP2
>>> Release:        12.2
>>> Codename:       n/a
>>> # uname -a
>>> Linux cephnode 4.4.38-93-default #1 SMP Wed Dec 14 12:59:43 UTC 2016
>>> (2d3e9d4) x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> Thanks
>>>
>>> On Fri, Feb 3, 2017 at 1:59 PM, John Spray <jsp...@redhat.com> wrote:
>>>
>>>> On Fri, Feb 3, 2017 at 8:07 AM, Ahmed Khuraidah <abushi...@gmail.com> wrote:
>>>> > Thank you guys,
>>>> >
>>>> > I tried to add the option "exec_prerun=echo 3 > /proc/sys/vm/drop_caches"
>>>> > as well as "exec_prerun=echo 3 | sudo tee /proc/sys/vm/drop_caches", but
>>>> > although FIO reports that the command was executed, there is no change.
>>>> >
>>>> > But I also hit another, very strange behaviour. If I run my FIO test
>>>> > (the 3G file case) twice, the first run creates the file and reports a
>>>> > lot of IOPS as described already, but if I drop the cache before the
>>>> > second run (as root: echo 3 > /proc/sys/vm/drop_caches), I end up with
>>>> > a broken MDS:
>>>> >
>>>> > --- begin dump of recent events ---
>>>> >      0> 2017-02-03 02:34:41.974639 7f7e8ec5e700 -1 *** Caught signal
>>>> > (Aborted) **
>>>> >  in thread 7f7e8ec5e700 thread_name:ms_dispatch
>>>> >
>>>> >  ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734)
>>>> >  1: (()+0x5142a2) [0x557c51e092a2]
>>>> >  2: (()+0x10b00) [0x7f7e95df2b00]
>>>> >  3: (gsignal()+0x37) [0x7f7e93ccb8d7]
>>>> >  4: (abort()+0x13a) [0x7f7e93ccccaa]
>>>> >  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>> > const*)+0x265) [0x557c51f133d5]
>>>> >  6: (MutationImpl::~MutationImpl()+0x28e) [0x557c51bb9e1e]
>>>> >  7: (std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x39)
>>>> > [0x557c51b2ccf9]
>>>> >  8: (Locker::check_inode_max_size(CInode*, bool, bool, unsigned long, bool,
>>>> > unsigned long, utime_t)+0x9a7) [0x557c51ca2757]
>>>> >  9: (Locker::remove_client_cap(CInode*, client_t)+0xb1) [0x557c51ca38f1]
>>>> >  10: (Locker::_do_cap_release(client_t, inodeno_t, unsigned long, unsigned
>>>> > int, unsigned int)+0x90d) [0x557c51ca424d]
>>>> >  11: (Locker::handle_client_cap_release(MClientCapRelease*)+0x1cc)
>>>> > [0x557c51ca449c]
>>>> >  12: (MDSRank::handle_deferrable_message(Message*)+0xc1c) [0x557c51b33d3c]
>>>> >  13: (MDSRank::_dispatch(Message*, bool)+0x1e1) [0x557c51b3c991]
>>>> >  14: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x557c51b3dae5]
>>>> >  15: (MDSDaemon::ms_dispatch(Message*)+0xc3) [0x557c51b25703]
>>>> >  16: (DispatchQueue::entry()+0x78b) [0x557c5200d06b]
>>>> >  17: (DispatchQueue::DispatchThread::entry()+0xd) [0x557c51ee5dcd]
>>>> >  18: (()+0x8734) [0x7f7e95dea734]
>>>> >  19: (clone()+0x6d) [0x7f7e93d80d3d]
>>>> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
>>>> > to interpret this.
>>>>
>>>> Oops! Please could you open a ticket on tracker.ceph.com, with this
>>>> backtrace, the client versions, any non-default config settings, and
>>>> the series of operations that led up to it.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> > On Thu, Feb 2, 2017 at 9:30 PM, Shinobu Kinjo <ski...@redhat.com> wrote:
>>>> >>
>>>> >> You may want to add this to your FIO recipe:
>>>> >>
>>>> >> * exec_prerun=echo 3 > /proc/sys/vm/drop_caches
>>>> >>
>>>> >> Regards,
>>>> >>
>>>> >> On Fri, Feb 3, 2017 at 12:36 AM, Wido den Hollander <w...@42on.com> wrote:
>>>> >> >
>>>> >> >> On 2 February 2017 at 15:35, Ahmed Khuraidah <abushi...@gmail.com> wrote:
>>>> >> >>
>>>> >> >> Hi all,
>>>> >> >>
>>>> >> >> I am still confused about my CephFS sandbox.
>>>> >> >>
>>>> >> >> When I run a simple FIO test against a single file of 3G, I get far
>>>> >> >> too many IOPS:
>>>> >> >>
>>>> >> >> cephnode:~ # fio payloadrandread64k3G
>>>> >> >> test: (g=0): rw=randread, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio,
>>>> >> >> iodepth=2
>>>> >> >> fio-2.13
>>>> >> >> Starting 1 process
>>>> >> >> test: Laying out IO file(s) (1 file(s) / 3072MB)
>>>> >> >> Jobs: 1 (f=1): [r(1)] [100.0% done] [277.8MB/0KB/0KB /s] [4444/0/0
>>>> >> >> iops] [eta 00m:00s]
>>>> >> >> test: (groupid=0, jobs=1): err= 0: pid=3714: Thu Feb 2 07:07:01 2017
>>>> >> >>   read : io=3072.0MB, bw=181101KB/s, iops=2829, runt= 17370msec
>>>> >> >>     slat (usec): min=4, max=386, avg=12.49, stdev= 6.90
>>>> >> >>     clat (usec): min=202, max=5673.5K, avg=690.81, stdev=361
>>>> >> >>
>>>> >> >> But if I change the file size to 320G, it looks like I bypass the cache:
>>>> >> >>
>>>> >> >> cephnode:~ # fio payloadrandread64k320G
>>>> >> >> test: (g=0): rw=randread, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio,
>>>> >> >> iodepth=2
>>>> >> >> fio-2.13
>>>> >> >> Starting 1 process
>>>> >> >> Jobs: 1 (f=1): [r(1)] [100.0% done] [4740KB/0KB/0KB /s] [74/0/0 iops]
>>>> >> >> [eta 00m:00s]
>>>> >> >> test: (groupid=0, jobs=1): err= 0: pid=3624: Thu Feb 2 06:51:09 2017
>>>> >> >>   read : io=3410.9MB, bw=11641KB/s, iops=181, runt=300033msec
>>>> >> >>     slat (usec): min=4, max=442, avg=14.43, stdev=10.07
>>>> >> >>     clat (usec): min=98, max=286265, avg=10976.32, stdev=14904.82
>>>> >> >>
>>>> >> >> For the random write test this behaviour does not show up; both sizes
>>>> >> >> give almost the same results - around 100 IOPS.
>>>> >> >>
>>>> >> >> So my question: could somebody please clarify where this caching
>>>> >> >> likely happens and how to manage it?
>>>> >> >
>>>> >> > The page cache of your kernel. The kernel will cache the file in memory
>>>> >> > and perform read operations from there.
>>>> >> >
>>>> >> > Best way is to reboot your client between test runs. Although you can
>>>> >> > drop kernel caches, I always reboot to make sure nothing is cached locally.
>>>> >> >
>>>> >> > Wido
>>>> >> >
>>>> >> >> P.S.
>>>> >> >> This is the latest SLES/Jewel based one-node setup, which has:
>>>> >> >> 1 MON, 1 MDS (both data and metadata pools on a SATA drive) and 1 OSD
>>>> >> >> (XFS on SATA and journal on SSD).
>>>> >> >> My FIO config file:
>>>> >> >> direct=1
>>>> >> >> buffered=0
>>>> >> >> ioengine=libaio
>>>> >> >> iodepth=2
>>>> >> >> runtime=300
>>>> >> >>
>>>> >> >> Thanks
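For anyone trying to reproduce this, a minimal fio job built from the
settings quoted above might look like the following. This is only a sketch:
the job name, the /mnt/cephfs directory and size=3G are assumptions, not
taken from Ahmed's actual payloadrandread64k3G file, and the exec_prerun
line (Shinobu's suggestion) only takes effect if fio itself runs as root.

[randread64k3G]
; assumed location of the test file: a CephFS kernel mount at /mnt/cephfs
directory=/mnt/cephfs
size=3G
rw=randread
bs=64k
direct=1
ioengine=libaio
iodepth=2
runtime=300
; drop the page cache right before the job starts (requires running fio as root)
exec_prerun=echo 3 > /proc/sys/vm/drop_caches

As Wido notes, the more reliable check is still to drop caches by hand
(sync; echo 3 > /proc/sys/vm/drop_caches as root) or to reboot the client
between runs, and then compare the 3G and 320G numbers again.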
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com