Re: [ceph-users] Uneven OSD usage

2014-08-28 Thread Christian Balzer
Hello, On Fri, 29 Aug 2014 02:32:39 -0400 J David wrote: > On Thu, Aug 28, 2014 at 10:47 PM, Christian Balzer wrote: > >> There are 1328 PG's in the pool, so about 110 per OSD. > >> > > And just to be pedantic, the PGP_NUM is the same? > > Ah, "ceph status" reports 1328 pgs. But: > > $ sudo

Re: [ceph-users] 'incomplete' PGs: what does it mean?

2014-08-28 Thread John Morris
Greg, thanks for the tips in both this and the BTRFS_IOC_SNAP_CREATE thread. They were enough to get PGs 'incomplete' due to 'not enough OSDs hosting' resolved by rolling back to a btrfs snapshot. I promise to write a full post-mortem (embarrassing as it will be) after the cluster is fully health

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Somnath Roy
Thanks Haomai ! Here is some of the data from my setup. -- Set up: 32 core cpu with H

Re: [ceph-users] Uneven OSD usage

2014-08-28 Thread J David
On Thu, Aug 28, 2014 at 10:47 PM, Christian Balzer wrote: >> There are 1328 PG's in the pool, so about 110 per OSD. >> > And just to be pedantic, the PGP_NUM is the same? Ah, "ceph status" reports 1328 pgs. But: $ sudo ceph osd pool get rbd pg_num pg_num: 1200 $ sudo ceph osd pool get rbd pgp_n
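
For reference, a minimal sketch (standard ceph CLI, not taken from the thread) of checking the two values and bringing pgp_num up to match pg_num, which is what actually triggers rebalancing:

$ sudo ceph osd pool get rbd pg_num
$ sudo ceph osd pool get rbd pgp_num
# if pgp_num is lower than pg_num, raise it to the same value:
$ sudo ceph osd pool set rbd pgp_num <value of pg_num>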

[ceph-users] question about monitor and paxos relationship

2014-08-28 Thread pragya jain
I have some basic questions about the relationship between monitors and Paxos: as the documents say, a Ceph monitor contains the cluster map, and if there is any change in the state of the cluster, the change is updated in the cluster map. Monitors use the Paxos algorithm to reach consensus among monitors to establish

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Matt W. Benjamin
Hi, There's also an early-stage TCP transport implementation for Accelio, also EPOLL-based. (We haven't attempted to run Ceph protocols over it yet, to my knowledge, but it should be straightforward.) Regards, Matt - "Haomai Wang" wrote: > Hi Roy, > > > As for messenger level, I have

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Haomai Wang
Another thread about it(http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/19284) On Fri, Aug 29, 2014 at 11:01 AM, Haomai Wang wrote: > Hi Roy, > > I already scan your merged codes about "fdcache" and "optimizing for > lfn_find/lfn_open", could you give some performance improvement dat

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Haomai Wang
Hi Roy, I have already scanned your merged code for "fdcache" and "optimizing for lfn_find/lfn_open"; could you give some performance improvement data about it? I fully agree with your direction, do you have any update on it? As for the messenger level, I have some very early work on it (https://gith

Re: [ceph-users] Uneven OSD usage

2014-08-28 Thread Christian Balzer
Hello, On Thu, 28 Aug 2014 19:49:59 -0400 J David wrote: > On Thu, Aug 28, 2014 at 7:00 PM, Robert LeBlanc > wrote: > > How many PGs do you have in your pool? This should be about 100/OSD. > > There are 1328 PG's in the pool, so about 110 per OSD. > And just to be pedantic, the PGP_NUM is the

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Mark Kirkwood
On 29/08/14 14:06, Mark Kirkwood wrote: ... mounting (xfs) with nobarrier seems to get much better results. The run below is for a single osd on an xfs partition from an Intel 520. I'm using another 520 as a journal: ...and adding filestore_queue_max_ops = 2 improved IOPS a bit more:
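
As a hedged illustration only (the exact values above are cut off by the archive; the mount point and the 500 figure are assumptions, not Mark's numbers), the two knobs being discussed look like:

# remount an existing xfs OSD filesystem without write barriers -- only sensible
# with non-volatile caches, since it risks data loss on power failure:
$ sudo mount -o remount,nobarrier /var/lib/ceph/osd/ceph-0
# ceph.conf, [osd] section:
#   filestore queue max ops = 500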

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Mark Kirkwood
On 29/08/14 04:11, Sebastien Han wrote: Hey all, See my fio template: [global] #logging #write_iops_log=write_iops_log #write_bw_log=write_bw_log #write_lat_log=write_lat_lo time_based runtime=60 ioengine=rbd clientname=admin pool=test rbdname=fio invalidate=0 # mandatory #rw=randwrite r
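
The quoted template is cut off by the archive; a self-contained fio job of the same shape (the block size and queue depth are illustrative, not necessarily the thread's values) would be:

$ cat > rbd-bench.fio <<'EOF'
[global]
ioengine=rbd
clientname=admin
pool=test
rbdname=fio
# invalidate=0 is mandatory for the rbd engine
invalidate=0
time_based
runtime=60

[randwrite-4k]
rw=randwrite
bs=4k
iodepth=32
EOF
$ fio rbd-bench.fio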

Re: [ceph-users] Fwd: Ceph Filesystem - Production?

2014-08-28 Thread Yan, Zheng
On Fri, Aug 29, 2014 at 8:36 AM, James Devine wrote: > > On Thu, Aug 28, 2014 at 1:30 PM, Gregory Farnum wrote: >> >> On Thu, Aug 28, 2014 at 10:36 AM, Brian C. Huffman >> wrote: >> > Is Ceph Filesystem ready for production servers? >> > >> > The documentation says it's not, but I don't see that

[ceph-users] Fwd: Ceph Filesystem - Production?

2014-08-28 Thread James Devine
On Thu, Aug 28, 2014 at 1:30 PM, Gregory Farnum wrote: > On Thu, Aug 28, 2014 at 10:36 AM, Brian C. Huffman > wrote: > > Is Ceph Filesystem ready for production servers? > > > > The documentation says it's not, but I don't see that mentioned anywhere > > else. > > http://ceph.com/docs/master/cep

Re: [ceph-users] Uneven OSD usage

2014-08-28 Thread J David
On Thu, Aug 28, 2014 at 7:00 PM, Robert LeBlanc wrote: > How many PGs do you have in your pool? This should be about 100/OSD. There are 1328 PG's in the pool, so about 110 per OSD. Thanks!

Re: [ceph-users] Uneven OSD usage

2014-08-28 Thread Robert LeBlanc
How many PGs do you have in your pool? This should be about 100/OSD. If it is too low, you could get an imbalance. I don't know the consequence of changing it on such a full cluster. The default values are only good for small test environments. Robert LeBlanc Sent from a mobile device please excu
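
A rough sketch of the sizing arithmetic behind the "about 100/OSD" rule, assuming 12 OSDs and 3 replicas (neither is stated in Robert's message):

# Ceph docs rule of thumb: total PGs per pool ~= (OSDs * 100) / replica count,
# rounded up to the next power of two, e.g. 12 * 100 / 3 = 400 -> 512
$ sudo ceph osd pool get rbd size
$ sudo ceph osd pool get rbd pg_num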

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Somnath Roy
Yes, what I saw is that the messenger-level bottleneck is still huge! Hopefully the RDMA messenger will resolve that and the performance gain will be significant for reads (on SSDs). For writes we need to uncover the OSD bottlenecks first to take advantage of the improved upstream. What I experienced is that til

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Mike Dawson
On 8/28/2014 4:17 PM, Craig Lewis wrote: My initial experience was similar to Mike's, causing a similar level of paranoia. :-) I'm dealing with RadosGW though, so I can tolerate higher latencies. I was running my cluster with noout and nodown set for weeks at a time. I'm sure Craig will agr

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Craig Lewis
My initial experience was similar to Mike's, causing a similar level of paranoia. :-) I'm dealing with RadosGW though, so I can tolerate higher latencies. I was running my cluster with noout and nodown set for weeks at a time. Recovery of a single OSD might cause other OSDs to crash. In the pr
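
The flags Craig describes are ordinary cluster flags; a minimal sketch of toggling them (generic ceph CLI, not quoted from his message):

# noout: OSDs that go down are not marked out, so no re-replication starts
$ sudo ceph osd set noout
# nodown: flapping OSDs are not marked down, avoiding repeated peering
$ sudo ceph osd set nodown
# once the maintenance / recovery window is over:
$ sudo ceph osd unset nodown
$ sudo ceph osd unset noout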

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Andrey Korolyov
On Thu, Aug 28, 2014 at 10:48 PM, Somnath Roy wrote: > Nope, this will not be back ported to Firefly I guess. > > Thanks & Regards > Somnath > Thanks for sharing this; the first thing that came to mind when I looked at this thread was your patches :) If Giant incorporates them, both the RDMA suppo

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Somnath Roy
Nope, this will not be back ported to Firefly I guess. Thanks & Regards Somnath -Original Message- From: David Moreau Simard [mailto:dmsim...@iweb.com] Sent: Thursday, August 28, 2014 11:32 AM To: Somnath Roy; Mark Nelson; ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD pe

Re: [ceph-users] MSWin CephFS

2014-08-28 Thread Gregory Farnum
On Thu, Aug 28, 2014 at 10:41 AM, LaBarre, James (CTR) A6IT wrote: > Just out of curiosity, is there a way to mount a Ceph filesystem directly on > a MSWindows system (2008 R2 server)? Just wanted to try something out from > a VM. Nope, sorry. -Greg

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread David Moreau Simard
That's definitely interesting. Are these changes meant to be released in a Firefly dot release, or will they land in Giant? -- David Moreau Simard On 2014-08-28, 1:49 PM, "Somnath Roy" wrote: >Yes, Mark, all of my changes are in ceph main now and we are getting >significant RR performance impro

Re: [ceph-users] Ceph Filesystem - Production?

2014-08-28 Thread Gregory Farnum
On Thu, Aug 28, 2014 at 10:36 AM, Brian C. Huffman wrote: > Is Ceph Filesystem ready for production servers? > > The documentation says it's not, but I don't see that mentioned anywhere > else. > http://ceph.com/docs/master/cephfs/ Everybody has their own standards, but Red Hat isn't supporting i

Re: [ceph-users] RAID underlying a Ceph config

2014-08-28 Thread Gregory Farnum
There aren't too many people running RAID under Ceph, as it's a second layer of redundancy that in normal circumstances is a bit pointless. But there are scenarios where it might be useful. You might check the list archives for the "anti-cephalopod question" thread. -Greg Software Engineer #42 @ ht

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Somnath Roy
Yes, Mark, all of my changes are in ceph main now and we are getting significant RR performance improvement with that. Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: Thursday, August 28, 2014 10:43 A

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Mark Nelson
On 08/28/2014 12:39 PM, Somnath Roy wrote: Hi Sebastian, If you are trying with the latest Ceph master, there are some changes we made that will increase your read performance from SSD by a factor of ~5X if the IOs are hitting the disks. If the data is served from memory, the improvement is

[ceph-users] MSWin CephFS

2014-08-28 Thread LaBarre, James (CTR) A6IT
Just out of curiosity, is there a way to mount a Ceph filesystem directly on a MSWindows system (2008 R2 server)? Just wanted to try something out from a VM.

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Somnath Roy
Hi Sebastian, If you are trying with the latest Ceph master, there are some changes we made that will increase your read performance from SSD by a factor of ~5X if the IOs are hitting the disks. If the data is served from memory, the improvement is even bigger. The single OSD will be cpu bound w

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Mike Dawson
On 8/28/2014 11:17 AM, Loic Dachary wrote: On 28/08/2014 16:29, Mike Dawson wrote: On 8/28/2014 12:23 AM, Christian Balzer wrote: On Wed, 27 Aug 2014 13:04:48 +0200 Loic Dachary wrote: On 27/08/2014 04:34, Christian Balzer wrote: Hello, On Tue, 26 Aug 2014 20:21:39 +0200 Loic Dachary w

[ceph-users] Ceph Filesystem - Production?

2014-08-28 Thread Brian C. Huffman
Is Ceph Filesystem ready for production servers? The documentation says it's not, but I don't see that mentioned anywhere else. http://ceph.com/docs/master/cephfs/ Thanks, Brian

[ceph-users] RAID underlying a Ceph config

2014-08-28 Thread LaBarre, James (CTR) A6IT
Having heard some suggestions on RAID configuration under Gluster (we have someone else doing that evaluation, I'm doing the Ceph piece), I'm wondering what (if any) RAID configurations would be recommended for Ceph. I have the impression that striping data could counteract/undermine data repl

[ceph-users] Uneven OSD usage

2014-08-28 Thread J David
Hello, Is there any way to provoke a ceph cluster to level out its OSD usage? Currently, a cluster of 3 servers with 4 identical OSDs each is showing disparity of about 20% between the most-used OSD and the least-used OSD. This wouldn't be too big of a problem, but the most-used OSD is now at 86
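
A sketch of the usual ways to inspect and nudge per-OSD utilization (generic commands, not advice given later in this thread):

# per-OSD weights and usage statistics:
$ sudo ceph osd tree
$ sudo ceph pg dump osds
# manually lower the override weight (0.0-1.0) of an over-full OSD:
$ sudo ceph osd reweight <osd-id> 0.9
# or let Ceph reweight everything above 110% of average utilization:
$ sudo ceph osd reweight-by-utilization 110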

[ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread Sebastien Han
Hey all, It has been a while since the last performance-related thread on the ML :p I’ve been running some experiments to see how much I can get from an SSD on a Ceph cluster. To achieve that I did something pretty simple: * Debian wheezy 7.6 * kernel from debian 3.14-0.bpo.2-amd64 * 1 cluster, 3

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Loic Dachary
Hi Blair, On 28/08/2014 16:38, Blair Bethwaite wrote: > Hi Loic, > > Thanks for the reply and interesting discussion. I'm learning a lot :-) > On 26 August 2014 23:25, Loic Dachary wrote: >> Each time an OSD is lost, there is a 0.001*0.001 = 0.0001% chance that two >> other disks are lost b

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Christian Balzer
On Thu, 28 Aug 2014 10:29:20 -0400 Mike Dawson wrote: > On 8/28/2014 12:23 AM, Christian Balzer wrote: > > On Wed, 27 Aug 2014 13:04:48 +0200 Loic Dachary wrote: > > > >> > >> > >> On 27/08/2014 04:34, Christian Balzer wrote: > >>> > >>> Hello, > >>> > >>> On Tue, 26 Aug 2014 20:21:39 +0200 Loic D

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Loic Dachary
On 28/08/2014 16:29, Mike Dawson wrote: > On 8/28/2014 12:23 AM, Christian Balzer wrote: >> On Wed, 27 Aug 2014 13:04:48 +0200 Loic Dachary wrote: >> >>> >>> >>> On 27/08/2014 04:34, Christian Balzer wrote: Hello, On Tue, 26 Aug 2014 20:21:39 +0200 Loic Dachary wrote: >>>

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Blair Bethwaite
Hi Loic, Thanks for the reply and interesting discussion. On 26 August 2014 23:25, Loic Dachary wrote: > Each time an OSD is lost, there is a 0.001*0.001 = 0.0001% chance that two > other disks are lost before recovery. Since the disk that failed initially > participates in 100 PG, that is 0.
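
A worked version of the quoted arithmetic, taking 0.001 as the assumed probability that a given disk fails during the recovery window:

# chance that two specific other disks both fail before recovery completes:
$ python -c 'p = 0.001; print("%.6f%%" % (p * p * 100))'
0.000100%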

Re: [ceph-users] what does monitor data directory include?

2014-08-28 Thread Joao Eduardo Luis
On 08/28/2014 02:21 PM, yuelongguang wrote: hi, joao, mark nelson, both of you. Where is the monmap stored? How do I dump the monitor's data in /var/lib/ceph/mon/ceph-cephosd1-mona/store.db/? thanks The monmap is stored in the monitor's db (default leveldb @ store.db). In all likelihood you won't have just

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Mike Dawson
On 8/28/2014 12:23 AM, Christian Balzer wrote: On Wed, 27 Aug 2014 13:04:48 +0200 Loic Dachary wrote: On 27/08/2014 04:34, Christian Balzer wrote: Hello, On Tue, 26 Aug 2014 20:21:39 +0200 Loic Dachary wrote: Hi Craig, I assume the reason for the 48 hours recovery time is to keep the co

[ceph-users] Unable to create swift type sub user in Rados Gateway :: Ceph Firefly 0.85

2014-08-28 Thread Karan Singh
Hello Cephers, I have two problems, both related to Rados Gateway swift user creation on FIREFLY. Ceph version 0.80.5, Centos 6.5, Kernel 2.6.32-338. Problem 1: I am unable to create a sub-user for swift in the Rados Gateway. Below is the output (Before upgrading from Emperor to Firefly i w
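
For context, the standard Firefly-era sequence for creating a Swift subuser looks like the sketch below (generic radosgw-admin usage with a hypothetical uid, not Karan's exact commands, and not necessarily a fix for the failure he goes on to describe):

# parent user (skip if it already exists):
$ sudo radosgw-admin user create --uid=testuser --display-name="Test User"
# swift subuser with full access to the parent user's buckets:
$ sudo radosgw-admin subuser create --uid=testuser --subuser=testuser:swift --access=full
# swift secret key for that subuser:
$ sudo radosgw-admin key create --subuser=testuser:swift --key-type=swift --gen-secret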

Re: [ceph-users] what does monitor data directory include?

2014-08-28 Thread yuelongguang
hi, joao, mark nelson, both of you. Where is the monmap stored? How do I dump the monitor's data in /var/lib/ceph/mon/ceph-cephosd1-mona/store.db/? thanks At 2014-08-28 09:00:41, "Mark Nelson" wrote: >On 08/28/2014 07:48 AM, yuelongguang wrote: >> hi,all >> what is in directory, /var/lib/ceph/mo

Re: [ceph-users] what does monitor data directory include?

2014-08-28 Thread Mark Nelson
On 08/28/2014 07:48 AM, yuelongguang wrote: hi,all what is in directory, /var/lib/ceph/mon/ceph-cephosd1-mona/store.db/ how to dump? where monmap is stored? That directory is typically a leveldb store, though it could potentially be rocksdb or maybe something else after firefly. You can use the
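
A hedged sketch of common ways to get at the monmap (not necessarily the tool Mark's truncated message goes on to name; the mon id is inferred from the path above):

# from a running cluster:
$ sudo ceph mon getmap -o /tmp/monmap
$ monmaptool --print /tmp/monmap
# or, with the monitor stopped, from its own store:
$ sudo ceph-mon -i cephosd1-mona --extract-monmap /tmp/monmap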

[ceph-users] what does monitor data directory include?

2014-08-28 Thread yuelongguang
hi, all. What is in the directory /var/lib/ceph/mon/ceph-cephosd1-mona/store.db/, and how do I dump it? Where is the monmap stored? thanks

Re: [ceph-users] ceph can not repair itself after accidental power down, half of pgs are peering

2014-08-28 Thread yuelongguang
The next day it returned to normal. I have no idea. At 2014-08-27 00:38:29, "Michael" wrote: How far out are your clocks? It's showing a clock skew; if they're too far out it can cause issues with cephx. Otherwise you're probably going to need to check your cephx auth keys. -Michael On
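
A quick sketch of the checks usually run for the clock-skew warning mentioned here (generic commands, not from the thread):

# the monitors report which mon is skewed and by how much:
$ sudo ceph health detail | grep -i skew
# compare NTP peer state and the current time on each monitor host:
$ ntpq -p
$ date -u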

Re: [ceph-users] Best practice K/M-parameters EC pool

2014-08-28 Thread Loic Dachary
On 28/08/2014 06:23, Christian Balzer wrote: > On Wed, 27 Aug 2014 13:04:48 +0200 Loic Dachary wrote: > >> >> >> On 27/08/2014 04:34, Christian Balzer wrote: >>> >>> Hello, >>> >>> On Tue, 26 Aug 2014 20:21:39 +0200 Loic Dachary wrote: >>> Hi Craig, I assume the reason for the 48

Re: [ceph-users] Prioritize Heartbeat packets

2014-08-28 Thread Daniel Swarbrick
On 28/08/14 02:56, Sage Weil wrote: > I seem to remember someone telling me there were hooks/hints you could > call that would tag either a socket or possibly data on that socket with a > label for use by iptables and such.. but I forget what it was. > Something like setsockopt() SO_MARK?
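
To make the SO_MARK idea concrete: if a daemon tagged its heartbeat sockets with a firewall mark, tc's fw classifier could match it. A hypothetical sketch only (Ceph has no such hook as of this thread; eth1 and mark 1 are assumptions):

# three-band priority qdisc on the cluster-network interface:
$ sudo tc qdisc add dev eth1 root handle 1: prio
# packets carrying fwmark 1 (as set via setsockopt(SO_MARK)) go to the top band:
$ sudo tc filter add dev eth1 parent 1: protocol ip handle 1 fw flowid 1:1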