Hi,

Currently we do not have a separate cluster network. Our setup is:
- 3 OSD nodes with 1Gbps links. Each node is running a single OSD daemon, although we plan to increase the number of OSDs per host.
- 3 virtual machines, also with 1Gbps links, where each VM is running one monitor daemon (two of them are running a metadata server too).
- The two clients used for testing purposes are also VMs.
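For reference, if we did add a separate cluster network later, it would be declared in ceph.conf along these lines (the subnets below are only placeholders, not our actual addressing):

```ini
[global]
# client <-> daemon traffic
public network = 192.168.1.0/24
# OSD replication and heartbeat traffic, on its own interface
cluster network = 192.168.2.0/24
```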
In each run of the fio tool we perform the following steps (all of them on the client):
1. Create a 1GB rbd image within a pool and map this image to a block device.
2. Create an ext4 filesystem on this block device.
3. Unmap the device from the client.
4. Drop caches before testing (echo 3 | tee /proc/sys/vm/drop_caches && sync).
5. Run the fio test, setting the pool and name of the rbd image. The block size is changed in each run.
6. Remove the image from the pool.

Thanks in advance!

On Wed, Oct 5, 2016 at 2:57 PM, Will.Boege <will.bo...@target.com> wrote:

> What does your network setup look like? Do you have a separate cluster
> network?
>
> Can you explain how you are performing the FIO test? Are you mounting a
> volume through krbd and testing that from a different server?
>
> On Oct 5, 2016, at 3:11 AM, Mario Rodríguez Molins <mariorodrig...@tuenti.com> wrote:
>
> Hello,
>
> We are setting up a new Ceph cluster and running some benchmarks on it.
> At this moment, our cluster consists of:
> - 3 nodes for OSDs. In our current configuration, one daemon per node.
> - 3 nodes for monitors (MON). Two of these nodes also run a metadata
> server (MDS).
>
> Benchmarks are performed both with the tools that ceph/rados provides
> and with the fio benchmark tool.
> Our benchmark tests are based on this tutorial:
> http://tracker.ceph.com/projects/ceph/wiki/Benchmark_Ceph_Cluster_Performance
>
> Using the fio benchmark tool, we are having some issues.
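The per-run steps described above can be sketched as a shell function. This is a sketch only: the pool and image names match the fio snippet later in this thread, and the rbd/mkfs/fio invocations assume a working Ceph client configuration and root privileges.

```shell
#!/bin/sh
# Sketch of one benchmark run as described in steps 1-6 above.
# Assumes: Ceph client tools installed, root, and a pool named "scbench".

run_one() {
    pool=scbench
    image=image01
    bs=$1          # fio block size for this run, e.g. 4k

    # 1. create a 1GB rbd image and map it to a block device
    rbd create "${pool}/${image}" --size 1024
    dev=$(rbd map "${pool}/${image}")

    # 2. create an ext4 filesystem on the mapped device
    mkfs.ext4 -q "${dev}"

    # 3. unmap the device from the client
    rbd unmap "${dev}"

    # 4. drop caches before testing
    echo 3 | tee /proc/sys/vm/drop_caches && sync

    # 5. run fio against the image through librbd
    fio --ioengine=rbd --clientname=admin \
        --pool="${pool}" --rbdname="${image}" \
        --bs="${bs}" --iodepth=32 --rw=read \
        --name=rbd_iodepth32 --output-format=json

    # 6. remove the image from the pool
    rbd rm "${pool}/${image}"
}

# run only where the Ceph client tools are actually available
command -v rbd >/dev/null 2>&1 && run_one 4k || true
```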
> After some executions, the fio process gets stuck in a futex_wait_queue_me call:
>
> # cat /proc/14413/stack
> [<ffffffffa7af6622>] futex_wait_queue_me+0xd2/0x140
> [<ffffffffa7af74bf>] futex_wait+0xff/0x260
> [<ffffffffa7aa3a6d>] wake_up_q+0x2d/0x60
> [<ffffffffa7af7d11>] futex_requeue+0x2c1/0x930
> [<ffffffffa7af8fd1>] do_futex+0x2b1/0xb20
> [<ffffffffa7badfb1>] handle_mm_fault+0x14e1/0x1cd0
> [<ffffffffa7aa48e8>] wake_up_new_task+0x108/0x1a0
> [<ffffffffa7af98c3>] SyS_futex+0x83/0x180
> [<ffffffffa7a63981>] __do_page_fault+0x221/0x510
> [<ffffffffa7fda736>] system_call_fast_compare_end+0xc/0x96
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> The logs of the osd and mon daemons do not show any information or errors
> about what the problem could be.
>
> Running strace against the fio process shows the following:
>
> [pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632809, {1475609725, 98199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> [pid 14416] gettimeofday({1475609725, 98347}, NULL) = 0
> [pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
> [pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125063, 345690227}) = 0
> [pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632811, {1475609725, 348199000}, ffffffff <unfinished ...>
> [pid 14429] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
> [pid 14429] clock_gettime(CLOCK_REALTIME, {1475609725, 127563261}) = 0
> [pid 14429] futex(0x7cefc8, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid 14429] futex(0x7cf01c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 79103, {1475609727, 127563261}, ffffffff <unfinished ...>
> [pid 14416] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
> [pid 14416] gettimeofday({1475609725, 348403}, NULL) = 0
> [pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
> [pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125063, 595788486}) = 0
> [pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632813, {1475609725, 598199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> [pid 14416] gettimeofday({1475609725, 598360}, NULL) = 0
> [pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
> [pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125063, 845712817}) = 0
> [pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632815, {1475609725, 848199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> [pid 14416] gettimeofday({1475609725, 848353}, NULL) = 0
> [pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
> [pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125064, 95705677}) = 0
> [pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632817, {1475609726, 98199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> [pid 14416] gettimeofday({1475609726, 98359}, NULL) = 0
> [pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
> [pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125064, 345711731}) = 0
> [pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632819, {1475609726, 348199000}, ffffffff <unfinished ...>
> [pid 14418] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
> [pid 14418] futex(0x7c1f08, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid 14418] clock_gettime(CLOCK_REALTIME, {1475609726, 103526543}) = 0
> [pid 14418] futex(0x7c1f5c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 31641, {1475609731, 103526543}, ffffffff <unfinished ...>
> [pid 14419] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
>
> ....
>
> [pid 14423] clock_gettime(CLOCK_REALTIME, {1475609728, 730557149}) = 0
> [pid 14423] clock_gettime(CLOCK_REALTIME, {1475609728, 730727417}) = 0
> [pid 14423] futex(0x7c8c34, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x7c8b60, 15902 <unfinished ...>
> [pid 14425] <... futex resumed> ) = 0
> [pid 14425] futex(0x7c8b60, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> [pid 14423] <... futex resumed> ) = 1
> [pid 14423] futex(0x7c8b60, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> [pid 14425] <... futex resumed> ) = 0
> [pid 14425] futex(0x7c8b60, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid 14425] clock_gettime(CLOCK_REALTIME, {1475609728, 731160249}) = 0
> [pid 14425] sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"\16", 1}, {"\200\4\364W\271\236\224+", 8}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 9
> [pid 14425] futex(0x7c8c34, FUTEX_WAIT_PRIVATE, 15903, NULL <unfinished ...>
> [pid 14423] <... futex resumed> ) = 1
> [pid 14423] clock_gettime(CLOCK_REALTIME, {1475609728, 731811246}) = 0
> [pid 14423] futex(0x775430, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid 14423] futex(0x775494, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 15823, {1475609738, 731811246}, ffffffff <unfinished ...>
> [pid 14426] <... restart_syscall resumed> ) = 1
> [pid 14426] recvfrom(3, "\17\200\4\364W\271\236\224+", 4096, MSG_DONTWAIT, NULL, NULL) = 9
> [pid 14426] clock_gettime(CLOCK_REALTIME, {1475609728, 732608460}) = 0
> [pid 14426] poll([{fd=3, events=POLLIN|0x2000}], 1, 900000 <unfinished ...>
> [pid 14417] <... futex resumed> ) = 0
> [pid 14417] futex(0x771e28, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid 14417] futex(0x771eac, FUTEX_WAIT_PRIVATE, 32223, NULL <unfinished ...>
> [pid 14416] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
>
> This issue has appeared on both of our clients.
> The two clients are running Debian Jessie, each one with a different kernel:
> - kernel 3.16.7-ckt25-2+deb8u3
> - kernel 4.7.2-1~bpo8+1
> The following combinations of package versions have been used on both clients:
> - Ceph cluster 10.2.2 & FIO 2.1.11-2
> - Ceph cluster 10.2.3 & FIO 2.1.11-2
> - Ceph cluster 10.2.3 & FIO 2.14
>
> We launch the fio tool varying settings such as block size and operation
> type. This is a simplified snippet of the shell script used:
>
> for operation in read write randread randwrite; do
>     for rbd in 4K 64K 1M 4M; do
>         for bs in 4k 64k 1M 4M; do
>             # create rbd image with block size $rbd
>             # drop caches
>
>             fio --name=global \
>                 --ioengine=rbd \
>                 --clientname=admin \
>                 --pool=scbench \
>                 --rbdname=image01 \
>                 --bs=${bs} \
>                 --name=rbd_iodeph32 \
>                 --iodepth=32 \
>                 --rw=${operation} \
>                 --output-format=json
>
>             sleep 10
>             # delete rbd image
>         done
>     done
> done
>
> Any ideas why this could be happening? Are we missing some settings in the
> fio tool?
>
> Regards,
>
> --
>
> *Mario Rodríguez*
> SRE
> mariorodrig...@tuenti.com
>
> +34 914 294 039 — 645 756 437
> C/ Gran Vía, nº 28, 6ª planta — 28013 Madrid
> Tuenti Technologies, S.L.
> www.tuenti.com
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
*Mario Rodríguez*
SRE
mariorodrig...@tuenti.com

+34 914 294 039 — 645 756 437
C/ Gran Vía, nº 28, 6ª planta — 28013 Madrid
Tuenti Technologies, S.L.
www.tuenti.com
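One possible next step for the futex waits shown earlier in the thread: the kernel stack from /proc/<pid>/stack only shows the futex wait itself, so a userspace backtrace of every thread in the stuck fio process may reveal which fio/librbd threads are blocked and where. A small helper along these lines (assuming gdb is installed and attaching via ptrace is permitted; the PID would be the stuck fio process, e.g. 14413 in the trace above):

```shell
#!/bin/sh
# Dump a userspace backtrace of every thread in a stuck process.
# Usage: dump_threads <pid>   (e.g. dump_threads 14413)
dump_threads() {
    # batch mode: attach, print all thread backtraces, then detach
    gdb -p "$1" -batch -ex 'thread apply all bt'
}
```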