Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread GuangYang
If we are talking about requests being blocked 60+ seconds, those tunings might not help (they help a lot with average latency during recovering/backfilling). It would be interesting to see the logs for those blocked requests on the OSD side (they are logged at level 0); the pattern to search for might be "slow reque
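
For reference, a minimal way to pull those entries out of an OSD log (a sketch only; the log path and OSD id below are placeholders, adjust for your deployment):

    # count "slow request" entries per OSD log to find the worst offenders
    grep -c "slow request" /var/log/ceph/ceph-osd.*.log
    # look at the most recent ones for a single OSD
    grep "slow request" /var/log/ceph/ceph-osd.12.log | tail -n 20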

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Paweł Sadowski
On 09/10/2015 10:56 PM, Robert LeBlanc wrote: > Things I've tried: > > * Lowered nr_requests on the spindles from 1000 to 100. This reduced > the max latency, sometimes up to 3000 ms, down to a max of 500-700 ms. > It has also reduced the huge swings in latency, but has also reduced > throughput som
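
As an aside, nr_requests is a per-device block-layer queue depth; a sketch of how it is typically inspected and lowered (sdb is a placeholder device name, not taken from the thread):

    cat /sys/block/sdb/queue/nr_requests          # check the current value
    echo 100 > /sys/block/sdb/queue/nr_requests   # lower it as described above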

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Somnath Roy
That’s probably because the krbd version you are using doesn’t have the TCP_NODELAY patch. We have submitted it (and you can build it from the latest rbd source), but I am not sure when it will be in the Linux mainline. Thanks & Regards Somnath From: Rafael Lopez [mailto:rafael.lo...@monash.edu] Sent

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Rafael Lopez
OK, I ran the two tests again with direct=1, a smaller block size (4k) and smaller total io (100m), and disabled the cache on the client side in ceph.conf by adding: [client] rbd cache = false rbd cache max dirty = 0 rbd cache size = 0 rbd cache target dirty = 0 The result seems to have swapped around, now the

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Stefan Priebe
On 10.09.2015 at 16:26, Haomai Wang wrote: Actually we can reach 700us per 4k write IO for single io depth (2 copy, E52650, 10Gib, intel s3700). So I think 400 read iops shouldn't be an unbridgeable problem. How did you measure it? CPU is critical for ssd backend, so what's your cpu model?

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Somnath Roy
Only changing client side ceph.conf and rerunning the tests is sufficient. Thanks & Regards Somnath From: Rafael Lopez [mailto:rafael.lo...@monash.edu] Sent: Thursday, September 10, 2015 8:58 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] bad perf for librbd vs krbd us

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Rafael Lopez
Hi Christian, Will try direct=1 and block size, cheers. Re: librbd, I'm not using VM yet, only using FIO with the RBD ioengine (ioengine=rbd in fio job file). Actually, I was seeing some unexpected IO on a VM which is what prompted me to start doing tests on the hypervisor host. Raf On 11 Septe

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Rafael Lopez
Thanks for the quick reply Somnath, will give this a try. In order to set the rbd cache settings, is it a matter of updating the ceph.conf file on the client only prior to running the test, or do I need to inject args to all OSDs ? Raf On 11 September 2015 at 13:39, Somnath Roy wrote: > It ma

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Christian Balzer
Hello, On Fri, 11 Sep 2015 13:24:24 +1000 Rafael Lopez wrote: > Hi all, > > I am seeing a big discrepancy between librbd and kRBD/ext4 performance > using FIO with single RBD image. RBD images are coming from same RBD > pool, same size and settings for both. The librbd results are quite bad > b

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Somnath Roy
It may be due to the rbd cache effect. Try the following: run your test with direct = 1 in both cases and rbd_cache = false (disable all other rbd cache options as well). This should give you a similar result to krbd. In the direct = 1 case, we saw ~10-20% degradation if we made rbd_cache = true. But

[ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Rafael Lopez
Hi all, I am seeing a big discrepancy between librbd and kRBD/ext4 performance using FIO with a single RBD image. The RBD images come from the same RBD pool, with the same size and settings for both. The librbd results are quite bad by comparison, and in addition if I scale up the kRBD FIO job with more jobs/t
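
For context, a librbd test of this kind is typically driven with fio's rbd ioengine; a rough sketch of such an invocation (the pool, image and job values are placeholders, not taken from the thread):

    fio --name=librbd-test --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimage \
        --rw=randread --bs=4k --direct=1 --iodepth=1 --size=100m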

[ceph-users] Re: ceph shows health_ok but cluster completely jacked up

2015-09-10 Thread Duanweijun
you can use s3cmd http://www.cnblogs.com/zhyg6516/archive/2011/09/02/2163933.html or use s3curl.pl (search Baidu for examples) // get bucket index with --debug ./s3curl.pl --debug --id=personal -- http://node110/admin/bucket?index\&bucket=bkt-test s3curl: Found the url: host=node110; port=; u

[ceph-users] ceph shows health_ok but cluster completely jacked up

2015-09-10 Thread Xu (Simon) Chen
Hi all, I am using ceph 0.94.1. Recently, I ran into a somewhat serious issue. "ceph -s" reports everything ok, all PGs active+clean, no blocked requests, etc. However, everything on top (VM's rbd disks) is completely jacked up. VM dmesg reports blocked io requests, and a reboot would just get stuck.
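
Not from the thread, but a generic starting point when "ceph -s" looks clean while clients hang is to inspect per-OSD op state over the admin socket (a sketch; osd.0 is a placeholder and the daemon commands are run on that OSD's host):

    ceph health detail
    ceph daemon osd.0 dump_ops_in_flight
    ceph daemon osd.0 dump_historic_ops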

[ceph-users] How to use query string of s3 Restful API to use RADOSGW

2015-09-10 Thread Fulin Sun
Hi, ceph experts Newbie here. Just want to try ceph object gateway and use s3 restful api for some performance test. We had configured and started radosgw according to this : http://ceph.com/docs/master/radosgw/config/ And we had successfully ran the python test for s3 access. Question is

[ceph-users] Re: Re: Unable to create bucket using S3 or Swift API in Ceph RADOSGW

2015-09-10 Thread Guce
Use a browser to access the host; can you see the S3 ListAllMyBuckets XML response (xmlns "http://s3.amazonaws.com/doc/2006-03-01/", owner "anonymous")? Or you can try to modify /etc/ceph/ceph.conf to add the civetweb config, and restart radosgw: [client.radosgw.gateway] ... rgw frontends = "civetweb port=80" ... /e
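
The same check can be done from the command line; a small sketch (the hostname is a placeholder), where a working gateway answers the request for "/" with the ListAllMyBuckets XML mentioned above:

    curl -i http://radosgw-host/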

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Christian Balzer
Hello, On Thu, 10 Sep 2015 16:16:10 -0600 Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Do the recovery options kick in when there is only backfill going on? > Aside from having these set just in case as your cluster (and one of mine) is clearly at the limits of

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lionel Bouton
On 11/09/2015 01:24, Lincoln Bryant wrote: > On 9/10/2015 5:39 PM, Lionel Bouton wrote: >> For example deep-scrubs were a problem on our installation when at >> times there were several going on. We implemented a scheduler that >> enforces limits on simultaneous deep-scrubs and these problems ar

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lincoln Bryant
On 9/10/2015 5:39 PM, Lionel Bouton wrote: For example deep-scrubs were a problem on our installation when at times there were several going on. We implemented a scheduler that enforces limits on simultaneous deep-scrubs and these problems are gone. Hi Lionel, Out of curiosity, how many was "

Re: [ceph-users] Question on cephfs recovery tools

2015-09-10 Thread Shinobu Kinjo
>> c./ After recovering the cluster, I thought I was in a cephfs situation where >> I had >> c.1 files with holes (because of lost PGs and objects in the data pool) >> c.2 files without metadata (because of lost PGs and objects in the >> metadata pool) > > What does "files without metadata"

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lionel Bouton
On 11/09/2015 00:20, Robert LeBlanc wrote: > I don't think the script will help our situation as it is just setting > osd_max_backfill from 1 to 0. It looks like that change doesn't go > into effect until after it finishes the PG. That was what I was afraid of. Note that it should help a little

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Somnath Roy
I am not an expert on that, but these settings will probably help backfill go slower and thus cause less degradation of client IO. You may want to try. Thanks & Regards Somnath -Original Message- From: Robert LeBlanc [mailto:rob...@leblancnet.us] Sent: Thursday, September 10, 2015 3:16 PM

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I don't think the script will help our situation as it is just setting osd_max_backfill from 1 to 0. It looks like that change doesn't go into effect until after it finishes the PG. It would be nice if backfill/recovery would skip the journal, but th

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Do the recovery options kick in when there is only backfill going on? - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Sep 10, 2015 at 3:01 PM, Somnath Roy wrote: > Try all these.. > > os

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lionel Bouton
On 10/09/2015 22:56, Robert LeBlanc wrote: > We are trying to add some additional OSDs to our cluster, but the > impact of the backfilling has been very disruptive to client I/O and > we have been trying to figure out how to reduce the impact. We have > seen some client I/O blocked for more than

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Somnath Roy
Try all these.. osd recovery max active = 1 osd max backfills = 1 osd recovery threads = 1 osd recovery op priority = 1 Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert LeBlanc Sent: Thursday, September 10, 2015
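
For reference, the same values can usually be applied at runtime without restarting the OSDs; a sketch using injectargs (option names as commonly used on hammer):

    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'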

[ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 We are trying to add some additional OSDs to our cluster, but the impact of the backfilling has been very disruptive to client I/O and we have been trying to figure out how to reduce the impact. We have seen some client I/O blocked for more than 60 s

[ceph-users] 9 PGs stay incomplete

2015-09-10 Thread Wido den Hollander
Hi, I'm running into an issue with Ceph 0.94.2/3 where after doing a recovery test 9 PGs stay incomplete: osdmap e78770: 2294 osds: 2294 up, 2294 in pgmap v1972391: 51840 pgs, 7 pools, 220 TB data, 185 Mobjects 755 TB used, 14468 TB / 15224 TB avail 51831 active+clean
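
Not part of the original message, but the usual first step for stuck PGs is to list them and query one of them (the pg id below is a placeholder):

    ceph health detail | grep incomplete
    ceph pg dump_stuck inactive
    ceph pg 3.1a query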

Re: [ceph-users] CephFS and caching

2015-09-10 Thread Kyle Hutson
A 'rados -p cachepool ls' takes about 3 hours - not exactly useful. I'm intrigued that you say a single read may not promote it into the cache. My understanding is that if you have an EC-backed pool the clients can't talk to it directly, which means they would necessarily be promoted to the cach

[ceph-users] rados bench seq throttling

2015-09-10 Thread Deneau, Tom
Running 9.0.3 rados bench on a 9.0.3 cluster... In the following experiments this cluster is only 2 osd nodes, 6 osds each and a separate mon node (and a separate client running rados bench). I have two pools populated with 4M objects. The pools are replicated x2 with identical parameters. The o
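
For context, a seq read test of this kind is normally run against data left behind by an earlier write pass; a minimal sketch (the pool name, duration and concurrency are placeholders):

    rados bench -p testpool 60 write --no-cleanup   # populate objects and keep them
    rados bench -p testpool 60 seq -t 16            # sequential-read pass with 16 concurrent ops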

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Stefan Priebe
On 10.09.2015 at 16:26, Haomai Wang wrote: Actually we can reach 700us per 4k write IO for single io depth (2 copy, E52650, 10Gib, intel s3700). So I think 400 read iops shouldn't be an unbridgeable problem. CPU is critical for ssd backend, so what's your cpu model? Intel(R) Xeon(R) CPU E5-165

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Stefan Priebe
On 10.09.2015 at 17:20, Mark Nelson wrote: I'm not sure you will be able to get there with firefly. I've gotten close to 1ms after lots of tuning on hammer, but 0.5ms is probably not likely to happen without all of the new work that Sandisk/Fujitsu/Intel/Others have been doing to improve the d

Re: [ceph-users] Straw2 kernel version?

2015-09-10 Thread Josh Durgin
On 09/10/2015 11:53 AM, Robert LeBlanc wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 My notes show that it should have landed in 4.1, but I also have written down that it wasn't merged yet. Just trying to get a confirmation on the version that it did land in. Yes, it landed in 4.1. J

Re: [ceph-users] Straw2 kernel version?

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 My notes show that it should have landed in 4.1, but I also have written down that it wasn't merged yet. Just trying to get a confirmation on the version that it did land in. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904

Re: [ceph-users] Straw2 kernel version?

2015-09-10 Thread Lincoln Bryant
Hi Robert, I believe kernel versions 4.1 and beyond support straw2. —Lincoln > On Sep 10, 2015, at 1:43 PM, Robert LeBlanc wrote: > > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Has straw2 landed in the kernel and if so which version? > > Thanks, > - > Robert LeBla

[ceph-users] Straw2 kernel version?

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Has straw2 landed in the kernel and if so which version? Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 -BEGIN PGP SIGNATURE- Version: Mailvelope v1.0.2 Comment: https://www.mailv

Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-09-10 Thread Jelle de Jong
Hello Jan, I want to test your pincpus tool that I got from GitHub. I have a dual-CPU (X5550) system (8 cores / 16 threads) with four OSDs (4x WD1003FBYX) and an SSD (SHFS37A) journal. I have three nodes like that. I am not sure how to configure prz-pincpus.conf # prz-pincpus.conf https://paste.debian.net/pl
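
For illustration only, a generic cgroup-v1 cpuset sketch of the same idea (this is not Jan's pincpus tool or its prz-pincpus.conf format; the core numbers are arbitrary placeholders):

    mkdir -p /sys/fs/cgroup/cpuset/ceph-osd
    echo 0-3 > /sys/fs/cgroup/cpuset/ceph-osd/cpuset.cpus   # pin to cores 0-3 as an example
    echo 0 > /sys/fs/cgroup/cpuset/ceph-osd/cpuset.mems     # mems must be set before attaching tasks
    for pid in $(pidof ceph-osd); do echo "$pid" > /sys/fs/cgroup/cpuset/ceph-osd/tasks; done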

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Mark Nelson
I'm not sure you will be able to get there with firefly. I've gotten close to 1ms after lots of tuning on hammer, but 0.5ms is probably not likely to happen without all of the new work that Sandisk/Fujitsu/Intel/Others have been doing to improve the data path. Your best bet is probably going

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Jan Schermer
Yes, I read that and I investigated all those areas; on my Dumpling the gains are pretty low. If I remember correctly, Hammer didn't improve synchronous (journal) writes at all. Or at least it didn't when I read that...? So is it actually that much faster? Did something change in Hammer in recent

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Haomai Wang
On Thu, Sep 10, 2015 at 10:41 PM, Jan Schermer wrote: > What did you tune? Did you have to make a human sacrifice? :) Which > release? > The last proper benchmark numbers I saw were from hammer and the latencies > were basically still the same, about 2ms for write. > No sacrifice, actually I div

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Andrija Panic
We also get 2ms for writes, Intel S3500 journals (5 journals on 1 SSD) and 4TB OSDs... On 10 September 2015 at 16:41, Jan Schermer wrote: > What did you tune? Did you have to make a human sacrifice? :) Which > release? > The last proper benchmark numbers I saw were from hammer and the latencies

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Jan Schermer
What did you tune? Did you have to make a human sacrifice? :) Which release? The last proper benchmark numbers I saw were from hammer and the latencies were basically still the same, about 2ms for write. Jan > On 10 Sep 2015, at 16:38, Haomai Wang wrote: > > > > On Thu, Sep 10, 2015 at 10:3

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Haomai Wang
On Thu, Sep 10, 2015 at 10:36 PM, Jan Schermer wrote: > > On 10 Sep 2015, at 16:26, Haomai Wang wrote: > > Actually we can reach 700us per 4k write IO for single io depth (2 copy, > E52650, 10Gib, intel s3700). So I think 400 read iops shouldn't be an > unbridgeable problem. > > Flushed to disk?

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Jan Schermer
> On 10 Sep 2015, at 16:26, Haomai Wang wrote: > > Actually we can reach 700us per 4k write IO for single io depth (2 copy, > E52650, 10Gib, intel s3700). So I think 400 read iops shouldn't be an > unbridgeable problem. > Flushed to disk? > CPU is critical for ssd backend, so what's your cpu

Re: [ceph-users] RBD with iSCSI

2015-09-10 Thread Jake Young
On Wed, Sep 9, 2015 at 8:13 AM, Daleep Bais wrote: > Hi, > > I am following steps from URL > http://www.sebastien-han.fr/blog/2014/07/07/start-with-the-rbd-support-for-tgt/ > to create a RBD pool and share t
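
Roughly, the tgt/RBD setup from that blog post boils down to something like the following (a sketch assuming tgt was built with the rbd backing store; the target name and pool/image are placeholders):

    tgtadm --lld iscsi --mode target --op new --tid 1 --targetname iqn.2015-09.com.example:rbd
    tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 --bstype rbd --backing-store rbd/myimage
    tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address ALL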

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Haomai Wang
Actually we can reach 700us per 4k write IO for single io depth (2 copy, E52650, 10Gib, intel s3700). So I think 400 read iops shouldn't be an unbridgeable problem. CPU is critical for ssd backend, so what's your cpu model? On Thu, Sep 10, 2015 at 9:48 PM, Jan Schermer wrote: > It's certainly not

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Jan Schermer
It's certainly not a problem with DRBD (yeah, it's something completely different but it's used for all kinds of workloads including things like replicated tablespaces for databases). It won't be a problem with VSAN (again, a bit different, but most people just want something like that). It surel

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Andrija Panic
"enough 4k read iop/s for multithreaded apps (around 23 000) with qemu 2.2.1." That is very nice number if I'm allowed to comment - may I know what is your setup (in 2 lines, hardware, number of OSDs) ? Thanks On 10 September 2015 at 15:39, Jan Schermer wrote: > Get faster CPUs (sorry, nothing

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Jan Schermer
Get faster CPUs (sorry, nothing else comes to mind). What type of application is that and what exactly does it do? Basically you would have to cache it in rbd cache or pagecache in the VM but that only works if the reads repeat. Jan > On 10 Sep 2015, at 15:34, Stefan Priebe - Profihost AG >

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 2:34 PM, Stefan Priebe - Profihost AG wrote: > Hi, > > while we're happy running ceph firefly in production and also reach > enough 4k read iop/s for multithreaded apps (around 23 000) with qemu 2.2.1. > > We've now a customer having a single threaded application needing ar

[ceph-users] higher read iop/s for single thread

2015-09-10 Thread Stefan Priebe - Profihost AG
Hi, while we're happy running ceph firefly in production and also reach enough 4k read iop/s for multithreaded apps (around 23 000) with qemu 2.2.1, we now have a customer with a single-threaded application that needs around 2000 iop/s, but we don't get above 600 iop/s in this case. Any tuning hints
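
Back-of-the-envelope: at queue depth 1, 600 iop/s is roughly 1.7 ms per request, while 2000 iop/s needs about 0.5 ms end to end, which is what the rest of this thread circles around. A rough way to measure that per-IO latency from inside a guest (a sketch; the device and runtime are placeholders, and randread is non-destructive):

    fio --name=qd1-read --filename=/dev/vdb --ioengine=libaio --direct=1 \
        --rw=randread --bs=4k --iodepth=1 --runtime=60 --time_based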

Re: [ceph-users] Question on cephfs recovery tools

2015-09-10 Thread John Spray
On Thu, Sep 10, 2015 at 7:44 AM, Shinobu Kinjo wrote: >>> Finally the questions: >>> >>> a./ Under a situation as the one described above, how can we safely terminate >>> cephfs in the clients? I have had situations where umount simply hangs and >>> there is no real way to unblock the situation unl

Re: [ceph-users] Re: Unable to create bucket using S3 or Swift API in Ceph RADOSGW

2015-09-10 Thread Daleep Bais
Hi Guce, First of all, very sorry for the delay in replying back, I got stuck in something. I followed the steps as you told me. I am getting Error: 405 Method Not Allowed while trying to create a bucket using the S3 script. root@ceph-node1:~/cluster# python s3test.py Traceback (most recent call last): File

Re: [ceph-users] Question on cephfs recovery tools

2015-09-10 Thread Shinobu Kinjo
>> Finally the questions: >> >> a./ Under a situation as the one described above, how can we safely terminate >> cephfs in the clients? I have had situations where umount simply hangs and >> there is no real way to unblock the situation unless I reboot the client. If >> we have hundreds of clients,

Re: [ceph-users] CephFS/Fuse : detect package upgrade to remount

2015-09-10 Thread John Spray
On Tue, Sep 8, 2015 at 9:00 AM, Gregory Farnum wrote: > On Tue, Sep 8, 2015 at 2:33 PM, Florent B wrote: >> >> >> On 09/08/2015 03:26 PM, Gregory Farnum wrote: >>> On Fri, Sep 4, 2015 at 9:15 AM, Florent B wrote: Hi everyone, I would like to know if there is a way on Debian to det

Re: [ceph-users] Question on cephfs recovery tools

2015-09-10 Thread John Spray
On Wed, Sep 9, 2015 at 2:31 AM, Goncalo Borges wrote: > Dear Ceph / CephFS gurus... > > Bear with me a bit while I give you some context. Questions will appear > at the end. > > 1) I am currently running ceph 9.0.3 and I have installed it to test the > cephfs recovery tools. > > 2) I've created

Re: [ceph-users] Ceph.conf

2015-09-10 Thread Shinobu Kinjo
Thank you for letting me know your thought, Abhishek!! > The Ceph Object Gateway will query Keystone periodically > for a list of revoked tokens. These requests are encoded > and signed. Also, Keystone may be configured to provide > self-signed tokens, which are also encoded and

Re: [ceph-users] Ceph.conf

2015-09-10 Thread Abhishek L
On Thu, Sep 10, 2015 at 2:51 PM, Shinobu Kinjo wrote: > Thank you for your really really quick reply, Greg. > > > Yes. A bunch shouldn't ever be set by users. > > Anyhow, this is one of my biggest concern right now -; > > rgw_keystone_admin_password = > > > MU

Re: [ceph-users] Ceph.conf

2015-09-10 Thread Shinobu Kinjo
Thank you for your really, really quick reply, Greg. > Yes. A bunch shouldn't ever be set by users. Anyhow, this is one of my biggest concerns right now: rgw_keystone_admin_password = MUST not be there. Shinobu - Original Message - From: "Gregory

Re: [ceph-users] Ceph.conf

2015-09-10 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 9:44 AM, Shinobu Kinjo wrote: > Hello, > > I'm seeing 859 parameters in the output of: > > $ ./ceph --show-config | wc -l > *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** > 859 > > In: > > $ ./ceph --version > *** DEVELOPER MODE: se

[ceph-users] Ceph.conf

2015-09-10 Thread Shinobu Kinjo
Hello, I'm seeing 859 parameters in the output of: $ ./ceph --show-config | wc -l *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** 859 In: $ ./ceph --version *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** ceph version 9.0.2-1454-
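
Two related ways to narrow that output down, shown only as a sketch (the daemon name and option below are examples): grep the dump for a prefix, or ask a running daemon over its admin socket.

    ceph --show-config | grep ^rgw_keystone
    ceph daemon osd.0 config get osd_max_backfills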

Re: [ceph-users] RBD with iSCSI

2015-09-10 Thread Daleep Bais
Hello, Can anyone please suggest a way to check and resolve this issue? Thanks in advance. Daleep Singh Bais On Wed, Sep 9, 2015 at 5:43 PM, Daleep Bais wrote: > Hi, > > I am following steps from URL > http://www.sebastien-han.fr/blog/2014/07/07/start-with-the-rbd-support-for-tgt/ > <