Hi, Nikola.

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg19152.html
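
If this is the tcmalloc thread-cache issue Somnath mentions below, the usual
workaround (a sketch only -- the file location assumes a sysconfig-style init,
and 128 MB is just the commonly quoted value; adjust for your distribution) is
to raise the tcmalloc thread cache before the OSDs start:

    # /etc/sysconfig/ceph (example path, sourced by the init script on RHEL-style systems)
    TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728

and then restart the OSDs, or alternatively build with jemalloc.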

2015-04-27 14:17 GMT+03:00 Nikola Ciprich <nikola.cipr...@linuxbox.cz>:

> Hello Somnath,
> > Thanks for the perf data. It seems innocuous. I am not seeing a single
> > tcmalloc trace; are you running with tcmalloc, by the way?
>
> according to ldd, it seems I have it linked in, yes:
> [root@vfnphav1a ~]# ldd /usr/bin/ceph-osd
> .
> .
> libtcmalloc.so.4 => /usr/lib64/libtcmalloc.so.4 (0x00007f7a3756e000)
> .
> .
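>
> (To double-check it is also loaded into the running process, not just linked
> against the binary, something along these lines should work as well -- this
> assumes a single ceph-osd per node, so pgrep returns one pid:
>   grep tcmalloc /proc/$(pgrep -x ceph-osd)/maps
> )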
>
>
> > What about my other question: does the performance of the slow volume
> > increase if you stop IO on the other volume?
> I don't have any other ceph users; actually, the whole cluster is idle.
>
> > Are you using the default ceph.conf? You probably want to try different
> > osd_op_num_shards (maybe = 10, based on your OSD server config) and
> > osd_op_num_threads_per_shard (maybe = 1). Also, you may want to see the
> > effect of setting osd_enable_op_tracker = false.
>
> I guess I'm using pretty much default settings; the few changes are probably
> not much related:
>
> [osd]
> osd crush update on start = false
>
> [client]
> rbd cache = true
> rbd cache writethrough until flush = true
>
> [mon]
> debug paxos = 0
>
>
>
> I now tried setting
> throttler perf counter = false
> osd enable op tracker = false
> osd_op_num_threads_per_shard = 1
> osd_op_num_shards = 10
>
> and restarting all ceph servers, but it seems to make no big difference.
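>
> (To confirm the new values actually took effect after the restart, they can
> be read back through the admin socket, for example:
>   ceph daemon osd.0 config show | grep -E 'op_num|op_tracker'
> osd.0 is just an example id; use whatever OSD ids are local to the node.)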
>
>
> >
> > Are you seeing similar resource consumption on both servers while IO is
> > going on?
> Yes, on all three nodes ceph-osd seems to be consuming lots of CPU during the
> benchmark.
>
> >
> > Need some information about your client: are the volumes exposed with krbd,
> > or running in a librbd environment? If krbd and on the same physical box, I
> > hope you mapped the images with 'noshare' enabled.
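> >
> > (For krbd that would be something along the lines of
> >   rbd map ssd3r/<image> -o noshare
> > -- the image name is just a placeholder; check rbd map --help and your
> > kernel version for exact option support.)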
>
> I'm using fio with the rbd engine (librbd), so I guess no krbd-related stuff
> is in use here?
>
>
> >
> > Too many questions :-)  But this may give some indication of what is going
> > on there.
> :-) Hopefully my answers are not too confusing; I'm still pretty new to
> ceph.
>
> BR
>
> nik
>
>
> >
> > Thanks & Regards
> > Somnath
> >
> > -----Original Message-----
> > From: Nikola Ciprich [mailto:nikola.cipr...@linuxbox.cz]
> > Sent: Sunday, April 26, 2015 7:32 AM
> > To: Somnath Roy
> > Cc: ceph-users@lists.ceph.com; n...@linuxbox.cz
> > Subject: Re: [ceph-users] very different performance on two volumes in the same pool
> >
> > Hello Somnath,
> >
> > On Fri, Apr 24, 2015 at 04:23:19PM +0000, Somnath Roy wrote:
> > > This could again be because of the tcmalloc issue I reported earlier.
> > >
> > > Two things to observe.
> > >
> > > 1. Is the performance improving if you stop IO on the other volume? If so,
> > > it could be a different issue.
> > There is no other IO; only CephFS is mounted, but nothing is using it.
> >
> > >
> > > 2. Run perf top on the OSD node and see if tcmalloc traces are popping
> > > up.
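> > >
> > > (Something like perf top -g -p <ceph-osd pid> also gives call chains,
> > > which makes tcmalloc paths easier to spot; the pid placeholder is whatever
> > > your OSD runs as.)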
> >
> > I don't see anything special:
> >
> >   3.34%  libc-2.12.so                  [.] _int_malloc
> >   2.87%  libc-2.12.so                  [.] _int_free
> >   2.79%  [vdso]                        [.] __vdso_gettimeofday
> >   2.67%  libsoftokn3.so                [.] 0x000000000001fad9
> >   2.34%  libfreeblpriv3.so             [.] 0x00000000000355e6
> >   2.33%  libpthread-2.12.so            [.] pthread_mutex_unlock
> >   2.19%  libpthread-2.12.so            [.] pthread_mutex_lock
> >   1.80%  libc-2.12.so                  [.] malloc
> >   1.43%  [kernel]                      [k] do_raw_spin_lock
> >   1.42%  libc-2.12.so                  [.] memcpy
> >   1.23%  [kernel]                      [k] __switch_to
> >   1.19%  [kernel]                      [k] acpi_processor_ffh_cstate_enter
> >   1.09%  libc-2.12.so                  [.] malloc_consolidate
> >   1.08%  [kernel]                      [k] __schedule
> >   1.05%  libtcmalloc.so.4.1.0          [.] 0x0000000000017e6f
> >   0.98%  libc-2.12.so                  [.] vfprintf
> >   0.83%  libstdc++.so.6.0.13           [.] std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char,
> >   0.76%  libstdc++.so.6.0.13           [.] 0x000000000008092a
> >   0.73%  libc-2.12.so                  [.] __memset_sse2
> >   0.72%  libc-2.12.so                  [.] __strlen_sse42
> >   0.70%  libstdc++.so.6.0.13           [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long)
> >   0.68%  libpthread-2.12.so            [.] pthread_mutex_trylock
> >   0.67%  librados.so.2.0.0             [.] ceph_crc32c_sctp
> >   0.63%  libpython2.6.so.1.0           [.] 0x000000000007d823
> >   0.55%  libnss3.so                    [.] 0x0000000000056d2a
> >   0.52%  libc-2.12.so                  [.] free
> >   0.50%  libstdc++.so.6.0.13           [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
> >
> > should I check anything else?
> > BR
> > nik
> >
> >
> > >
> > > Thanks & Regards
> > > Somnath
> > >
> > > -----Original Message-----
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Nikola Ciprich
> > > Sent: Friday, April 24, 2015 7:10 AM
> > > To: ceph-users@lists.ceph.com
> > > Cc: n...@linuxbox.cz
> > > Subject: [ceph-users] very different performance on two volumes in the same pool
> > >
> > > Hello,
> > >
> > > I'm trying to solve a somewhat mysterious situation:
> > >
> > > I've got a 3-node Ceph cluster and a pool made of 3 OSDs (one on each
> > > node); the OSDs are 1 TB SSD drives.
> > >
> > > The pool has 3 replicas set. I'm measuring random IO performance using fio:
> > >
> > > fio --randrepeat=1 --ioengine=rbd --direct=1 --gtod_reduce=1 --name=test \
> > >     --pool=ssd3r --rbdname=${rbdname} --invalidate=1 --bs=4k --iodepth=64 \
> > >     --readwrite=randread --output=randio.log
> > >
> > > It gives very nice performance of ~186K IOPS for random reads.
> > >
> > > The problem is, I've got one volume on which it gives only ~20K IOPS and I
> > > can't figure out why. It was created using python, so I first suspected it
> > > could be similar to the missing-layering problem I was consulting about
> > > here a few days ago, but when I tried to reproduce that, I got ~180K IOPS
> > > even for other volumes created using python.
> > >
> > > So only this one volume is problematic; the others are fine. Since there
> > > is only one SSD in each box and I'm using 3 replicas, there should not be
> > > any difference in the physical storage used between volumes.
> > >
> > > I'm using hammer (0.94.1) and fio 2.2.6.
> > >
> > > here's RBD info:
> > >
> > > "slow" volume:
> > >
> > > [root@vfnphav1a fio]# rbd info ssd3r/vmtst23-6
> > > rbd image 'vmtst23-6':
> > >     size 30720 MB in 7680 objects
> > >     order 22 (4096 kB objects)
> > >     block_name_prefix: rbd_data.1376d82ae8944a
> > >     format: 2
> > >     features:
> > >     flags:
> > >
> > > "fast" volume:
> > > [root@vfnphav1a fio]# rbd info ssd3r/vmtst23-7
> > > rbd image 'vmtst23-7':
> > >     size 30720 MB in 7680 objects
> > >     order 22 (4096 kB objects)
> > >     block_name_prefix: rbd_data.13d01d2ae8944a
> > >     format: 2
> > >     features:
> > >     flags:
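> > >
> > > (One thing that could be compared, just as a sanity check that the two
> > > images really use the same physical storage in the same way, is how many
> > > backing objects each one actually has; the prefixes are the
> > > block_name_prefix values above:
> > >   rados -p ssd3r ls | grep -c rbd_data.1376d82ae8944a
> > >   rados -p ssd3r ls | grep -c rbd_data.13d01d2ae8944a
> > > )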
> > >
> > > Any idea what could be wrong here?
> > >
> > > thanks a lot in advance!
> > >
> > > BR
> > >
> > > nik
> > >
> > > --
> > > -------------------------------------
> > > Ing. Nikola CIPRICH
> > > LinuxBox.cz, s.r.o.
> > > 28.rijna 168, 709 00 Ostrava
> > >
> > > tel.:   +420 591 166 214
> > > fax:    +420 596 621 273
> > > mobil:  +420 777 093 799
> > > www.linuxbox.cz
> > >
> > > mobil servis: +420 737 238 656
> > > email servis: ser...@linuxbox.cz
> > > -------------------------------------
> > >
> > >
> >
> > --
> > -------------------------------------
> > Ing. Nikola CIPRICH
> > LinuxBox.cz, s.r.o.
> > 28. rijna 168, 709 00 Ostrava
> >
> > tel.:   +420 591 166 214
> > fax:    +420 596 621 273
> > mobil:  +420 777 093 799
> >
> > www.linuxbox.cz
> >
> > mobil servis: +420 737 238 656
> > email servis: ser...@linuxbox.cz
> > -------------------------------------
> >
>
> --
> -------------------------------------
> Ing. Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28.rijna 168, 709 00 Ostrava
>
> tel.:   +420 591 166 214
> fax:    +420 596 621 273
> mobil:  +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz
> -------------------------------------
>
>
>


-- 
Best regards, Irek Nurgayazovich Fasikhov
Mobile: +79229045757
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
