Re: [ceph-users] latency when OSD falls out of cluster

2013-07-12 Thread Wido den Hollander
Hi Edwin, On 07/12/2013 08:03 AM, Edwin Peer wrote: Hi there, We've been noticing nasty multi-second cluster-wide latencies if an OSD drops out of an active cluster (due to power failure, or even being stopped cleanly). We've also seen this problem occur when an OSD is inserted back into the cl

Re: [ceph-users] Tuning options for 10GE ethernet and ceph

2013-07-12 Thread Mihály Árva-Tóth
2013/7/11 Mark Nelson: > On 07/11/2013 10:27 AM, Mihály Árva-Tóth wrote: >> 2013/7/11 Mark Nelson: >>> On 07/11/2013 10:04 AM, Mihály Árva-Tóth wrote: >>>> Hello, >>>> We are planning to use Intel 10 GbE Ethernet between nodes of

Re: [ceph-users] latency when OSD falls out of cluster

2013-07-12 Thread Edwin Peer
On 07/12/2013 09:21 AM, Wido den Hollander wrote: You will probably see that placement groups (PGs) go into a different state than active+clean. Indeed, the cluster goes into a health warning state and starts to resync the data for the affected OSDs. Nothing is missing, just degraded (redundan

Re: [ceph-users] Problems with tgt with ceph support

2013-07-12 Thread Toni F. [ackstorm]
Yes! It seems that I hadn't compiled in the rbd support. System: State: ready debug: off LLDs: iscsi: ready Backing stores: bsg sg null ssc aio rdwr (bsoflags sync:direct) Device types: disk cd/dvd osd controller changer tape passthroug

Re: [ceph-users] Problems with tgt with ceph support

2013-07-12 Thread Toni F. [ackstorm]
It works! Thanks, all. On 12/07/13 11:23, Toni F. [ackstorm] wrote: Yes! It seems that I hadn't compiled in the rbd support. System: State: ready debug: off LLDs: iscsi: ready Backing stores: bsg sg null ssc aio rdwr (bsoflags sync:direct) Device types: d
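For anyone following along, here is a rough sketch of exporting an RBD image through tgt once it has been built with rbd support. The target IQN, pool, and image names are placeholders, and the tgtadm invocations reflect stgt's rbd backing store as I understand it, so verify them against your tgt version before relying on this.

    # Sketch: export an RBD image via tgt's rbd backing store (bs_rbd).
    # TID, IQN, pool and image names below are placeholders.
    import subprocess

    TID = "1"
    IQN = "iqn.2013-07.com.example:rbd-test"   # hypothetical target name
    BACKING = "rbd/test-image"                 # pool/image, as bs_rbd expects

    def tgtadm(*args):
        subprocess.check_call(["tgtadm", "--lld", "iscsi"] + list(args))

    # Create the target, attach the RBD image as LUN 1, and allow initiators.
    tgtadm("--op", "new", "--mode", "target", "--tid", TID, "-T", IQN)
    tgtadm("--op", "new", "--mode", "logicalunit", "--tid", TID, "--lun", "1",
           "--bstype", "rbd", "--backing-store", BACKING)
    tgtadm("--op", "bind", "--mode", "target", "--tid", TID, "-I", "ALL")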

Re: [ceph-users] OCFS2 or GFS2 for cluster filesystem?

2013-07-12 Thread Tom Verdaat
Hi Darryl, Would love to do that too but only if we can configure nova to do this automatically. Any chance you could dig up and share how you guys accomplished this? From everything I've read so far Grizzly is not up for the task yet. If I can't set it in nova.conf then it probably won't work w

Re: [ceph-users] OCFS2 or GFS2 for cluster filesystem?

2013-07-12 Thread Alex Bligh
On 12 Jul 2013, at 13:21, Tom Verdaat wrote: > In the meantime I've done some more research and figured out that: > • There are a bunch of other cluster file systems, but GFS2 and OCFS2 are > the only open-source ones I could find, and I believe the only ones that are > integrated in the L

[ceph-users] ceph-deploy Intended Purpose

2013-07-12 Thread Edward Huyer
I'm working on deploying a multi-machine (possibly as many as 7) Ceph (0.61.4) cluster for experimentation. I'm trying to deploy using ceph-deploy on Ubuntu, but it seems... flaky. For instance, I tried to deploy additional monitors and ran into the bug(?) where the additional monitors don't work

Re: [ceph-users] OCFS2 or GFS2 for cluster filesystem?

2013-07-12 Thread Wolfgang Hennerbichler
FYI: I'm using OCFS2 as you plan to (/var/lib/nova/instances/); it is stable, but performance isn't blazing. -- Sent from my mobile device On 12.07.2013, at 14:21, "Tom Verdaat" <t...@server.biz> wrote: Hi Darryl, Would love to do that too but only if we can configure nova to do this a

[ceph-users] slow request problem

2013-07-12 Thread Stefan Priebe - Profihost AG
Hello list, anyone else here who always has problems bringing back an offline OSD? Since Cuttlefish I'm seeing slow requests for the first 2-5 minutes after bringing an OSD online again, but that's so long that the VMs crash as they think their disk is offline... Under Bobtail I never had any pro
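One knob that is often suggested for this situation is throttling recovery so client I/O is not starved while the OSD catches up. Below is a minimal sketch assuming the stock ceph CLI; the option names are real Ceph settings, but the values are illustrative and the tell/injectargs syntax can differ between releases, so treat it as a starting point rather than a fix.

    # Sketch: lower recovery/backfill concurrency on running OSDs so client
    # I/O takes priority while an OSD rejoins. Values are illustrative.
    import subprocess

    RECOVERY_SETTINGS = {
        "osd-max-backfills": "1",         # fewer concurrent backfills per OSD
        "osd-recovery-max-active": "1",   # fewer in-flight recovery ops
        "osd-recovery-op-priority": "1",  # deprioritise recovery vs client ops
    }

    def throttle_recovery(osd_ids):
        args = " ".join("--%s %s" % kv for kv in RECOVERY_SETTINGS.items())
        for osd in osd_ids:
            # injectargs changes the running daemon; a restart reverts to ceph.conf
            subprocess.check_call(["ceph", "tell", "osd.%d" % osd,
                                   "injectargs", args])

    throttle_recovery(range(36))  # adjust to the OSD ids in your cluster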

Re: [ceph-users] Num of PGs

2013-07-12 Thread Mark Nelson
On 07/12/2013 01:45 AM, Stefan Priebe - Profihost AG wrote: Hello, is this calculation for the number of PGs correct? 36 OSDs, replication factor 3: 36 * 100 / 3 => 1200 PGs. But I then read that it should be a power of 2, so it should be 2048? At large numbers of PGs it may not matter ver
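For reference, the rule of thumb being discussed is (OSDs × 100) / replicas, rounded up to the next power of two; a tiny illustration follows. This is the commonly cited guideline, not a hard requirement.

    # The (OSDs * 100) / replicas guideline, rounded up to a power of two.
    def suggested_pg_count(num_osds, replicas, pgs_per_osd=100):
        target = num_osds * pgs_per_osd // replicas
        power = 1
        while power < target:
            power <<= 1          # next power of two >= target
        return power

    print(suggested_pg_count(36, 3))  # 36 * 100 / 3 = 1200 -> 2048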

Re: [ceph-users] Num of PGs

2013-07-12 Thread Gandalf Corvotempesta
2013/7/12 Mark Nelson : > At large numbers of PGs it may not matter very much, but I don't think it > would hurt either! > > Basically this has to do with how ceph_stable_mod works. At > non-power-of-two values, the bucket counts aren't even, but that's only a > small part of the story and may ult

Re: [ceph-users] Num of PGs

2013-07-12 Thread Mark Nelson
On 07/12/2013 09:53 AM, Gandalf Corvotempesta wrote: 2013/7/12 Mark Nelson : At large numbers of PGs it may not matter very much, but I don't think it would hurt either! Basically this has to do with how ceph_stable_mod works. At non-power-of-two values, the bucket counts aren't even, but that
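To make the ceph_stable_mod point concrete, here is a rough Python port of the function (mirroring the C implementation in the Ceph tree) plus a toy experiment; the object hashing below is a stand-in, so the exact counts are only illustrative of the skew at non-power-of-two pg_num.

    # ceph_stable_mod maps a hash onto pg_num buckets in a way that stays
    # stable as pg_num grows; at non-power-of-two pg_num some buckets get
    # roughly twice the input mass.
    import zlib

    def calc_bits_of(n):
        bits = 0
        while n:
            n >>= 1
            bits += 1
        return bits

    def ceph_stable_mod(x, b, bmask):
        if (x & bmask) < b:
            return x & bmask
        return x & (bmask >> 1)

    def bucket_spread(pg_num, samples=100000):
        mask = (1 << calc_bits_of(pg_num - 1)) - 1   # how Ceph derives pg_num_mask
        counts = [0] * pg_num
        for i in range(samples):
            h = zlib.crc32(("obj-%d" % i).encode())  # stand-in object hash
            counts[ceph_stable_mod(h, pg_num, mask)] += 1
        return min(counts), max(counts)

    print(bucket_spread(2048))  # power of two: buckets are roughly even
    print(bucket_spread(1200))  # 1200: some buckets get about twice as much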

Re: [ceph-users] Cuttlefish VS Bobtail performance series

2013-07-12 Thread Mark Nelson
Part 4 has been released! Get your 4MB fio results while they are hot! http://ceph.com/performance-2/ceph-cuttlefish-vs-bobtail-part-4-4m-rbd-performance/ Mark On 07/11/2013 09:56 AM, Mark Nelson wrote: And we've now got part 3 out showing 128K FIO results: http://ceph.com/performance-2/ceph

Re: [ceph-users] Problems with tgt with ceph support

2013-07-12 Thread Toni F. [ackstorm]
It works, but the performance is very poor: 100 MB/s or less. What is your experience with performance? Regards On 12/07/13 13:56, Toni F. [ackstorm] wrote: It works! Thanks, all. On 12/07/13 11:23, Toni F. [ackstorm] wrote: Yes! It seems that I hadn't compiled in the rbd support. System: St

Re: [ceph-users] Problems with tgt with ceph support

2013-07-12 Thread Dan Mick
Ceph performance is a very very complicated subject. How does that compare to other access methods? Say, rbd import/export for an easy test? On Jul 12, 2013 8:22 AM, "Toni F. [ackstorm]" wrote: > It works, but the performance is very poor. 100MB/s or less > > Which are your performance experienc
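One way to get a baseline without iSCSI in the path is to time large sequential reads directly through the librbd Python bindings; a minimal sketch follows, with placeholder pool/image names (it exercises librbd rather than the rbd import/export test suggested above, but serves the same comparison purpose).

    # Sketch: measure sequential read throughput straight through librbd,
    # to compare against throughput seen over tgt/iSCSI. Pool and image
    # names are placeholders.
    import time
    import rados
    import rbd

    POOL, IMAGE, CHUNK = "rbd", "test-image", 4 * 1024 * 1024  # 4 MB reads

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(POOL)
        image = rbd.Image(ioctx, IMAGE)
        size = image.size()
        start = time.time()
        offset = 0
        while offset < size:
            image.read(offset, min(CHUNK, size - offset))
            offset += CHUNK
        elapsed = time.time() - start
        print("%.1f MB/s" % (size / 1024.0 / 1024.0 / elapsed))
        image.close()
        ioctx.close()
    finally:
        cluster.shutdown()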

Re: [ceph-users] Ceph-deploy

2013-07-12 Thread Scottix
Make sure you understand the ceph architecture http://ceph.com/docs/next/architecture/ and then go through the ceph-deploy docs here http://ceph.com/docs/master/rados/deployment/ceph-deploy-new/ On Thu, Jul 11, 2013 at 8:04 PM, SUNDAY A. OLUTAYO wrote: > I am on first exploration of ceph, I need

Re: [ceph-users] Num of PGs

2013-07-12 Thread Stefan Priebe - Profihost AG
Right now I have 4096. 36*100/3 => 1200. As recovery takes ages, I thought this might be the reason. Stefan This mail was sent with my iPhone. On 12.07.2013 at 17:03 Mark Nelson wrote: > On 07/12/2013 09:53 AM, Gandalf Corvotempesta wrote: >> 2013/7/12 Mark Nelson : >>> At large numbers of PG

Re: [ceph-users] Num of PGs

2013-07-12 Thread Mark Nelson
On 07/12/2013 02:19 PM, Stefan Priebe - Profihost AG wrote: Right now I have 4096. 36*100/3 => 1200. As recovery takes ages, I thought this might be the reason. Are you seeing any craziness on the mons? Stefan This mail was sent with my iPhone. On 12.07.2013 at 17:03 Mark Nelson wrote:

Re: [ceph-users] Num of PGs

2013-07-12 Thread Stefan Priebe - Profihost AG
On 12.07.2013 at 21:23 Mark Nelson wrote: > On 07/12/2013 02:19 PM, Stefan Priebe - Profihost AG wrote: >> Right now I have 4096. 36*100/3 => 1200. As recovery takes ages, I thought >> this might be the reason. > > Are you seeing any craziness on the mons? What could this be? Nothing noticed

Re: [ceph-users] Ceph-deploy

2013-07-12 Thread SUNDAY A. OLUTAYO
Thanks, I will go through the links. Sent from my LG Mobile. Scottix wrote: Make sure you understand the ceph architecture http://ceph.com/docs/next/architecture/ and then go through the ceph-deploy docs here http://ceph.com/docs/master/rados/deployment/ceph-deploy-new/ On Thu, Jul 11, 2013 at 8

Re: [ceph-users] ceph-deploy Intended Purpose

2013-07-12 Thread Neil Levine
It's the default tool for getting something up and running quickly for tests/PoC. If you don't want to make too many custom settings changes and are happy with SSH access to all boxes, then it's fine, but if you want more granular control we advise you to use something like Chef or Puppet. There are

Re: [ceph-users] Including pool_id in the crush hash ? FLAG_HASHPSPOOL ?

2013-07-12 Thread Gregory Farnum
On Thu, Jul 11, 2013 at 6:06 AM, Sylvain Munaut wrote: > Hi, > > I'd like the pool_id to be included in the hash used for the PG, to > try and improve the data distribution. (I have 10 pools.) > > I see that there is a flag named FLAG_HASHPSPOOL. Is it possible to > enable it on an existing pool?
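For context, a conceptual sketch of what the flag changes in the placement seed handed to CRUSH: without it the pool id is simply added to the PG's seed, with it the two are mixed through a hash. zlib.crc32 below is a stand-in for the hash Ceph actually uses, so only the structure should be taken from this, not the values.

    # Conceptual only: contrast the two ways a pool id can enter the
    # placement seed. crc32 stands in for Ceph's rjenkins hash.
    import struct
    import zlib

    def seed_without_hashpspool(ps, pool_id):
        # Old behaviour: pools of similar size get seeds that are just
        # shifted copies of each other, so their PGs tend to line up.
        return ps + pool_id

    def seed_with_hashpspool(ps, pool_id):
        # With FLAG_HASHPSPOOL: mixing through a hash decorrelates pools.
        return zlib.crc32(struct.pack("<II", ps, pool_id))

    for pool_id in range(3):
        print([seed_without_hashpspool(ps, pool_id) for ps in range(4)],
              [seed_with_hashpspool(ps, pool_id) for ps in range(4)])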

Re: [ceph-users] Possible bug with image.list_lockers()

2013-07-12 Thread Gregory Farnum
On Thu, Jul 11, 2013 at 4:38 PM, Mandell Degerness wrote: > I'm not certain what the correct behavior should be in this case, so > maybe it is not a bug, but here is what is happening: > > When an OSD becomes full, a process fails and we unmount the rbd and > attempt to remove the lock associated with
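For readers unfamiliar with the call in question, a hypothetical illustration of listing and breaking an image lock via the rbd Python bindings; pool and image names are placeholders, and the full-OSD failure mode described above is deliberately not handled here.

    # Sketch: inspect and break locks on an RBD image. Names are placeholders.
    import rados
    import rbd

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("rbd")          # placeholder pool
        image = rbd.Image(ioctx, "test-image")     # placeholder image
        try:
            lockers = image.list_lockers()
            # Expected shape: {'tag': ..., 'exclusive': ...,
            #                  'lockers': [(client, cookie, addr), ...]}
            for client, cookie, addr in lockers.get("lockers", []):
                print("breaking lock %s held by %s (%s)" % (cookie, client, addr))
                image.break_lock(client, cookie)
        finally:
            image.close()
        ioctx.close()
    finally:
        cluster.shutdown()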