Re: [ceph-users] resizing the OSD

2014-09-05 Thread Alexandre DERUMIER
>>Is there a way to resize the OSD without bringing the cluster down? What is the HEALTH state of your cluster? If it's OK, simply replace the OSD disk with a bigger one? - Original Message - From: "JIten Shah" To: ceph-us...@ceph.com Sent: Saturday 6 September 2014 00:31:01 Subject: [c
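
A rough sketch of the disk swap being suggested, assuming a sysvinit Firefly-era cluster and using osd.12, node3 and /dev/sdd purely as illustrative names (not from the thread):

    ceph health                     # only proceed from HEALTH_OK
    ceph osd out 12                 # let the cluster drain data off the OSD
    # wait until "ceph -w" reports everything active+clean again, then:
    service ceph stop osd.12        # on the OSD's host
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    # swap the physical disk for the bigger one, then recreate the OSD,
    # e.g. with ceph-deploy:
    ceph-deploy osd create node3:/dev/sdd

Done one OSD at a time this keeps the cluster up, at the cost of one rebalance per disk.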

Re: [ceph-users] Ceph Filesystem - Production?

2014-09-05 Thread JIten Shah
We ran into the same issue where we could not mount the filesystem on the clients because they were on kernel 3.9. Once we upgraded the kernel on the client node, we were able to mount it fine. FWIW, you need kernel 3.14 and above. --jiten On Sep 5, 2014, at 6:55 AM, James Devine wrote: > No messages in d
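
For reference, a minimal kernel-client mount along the lines discussed above; mon1, /mnt/cephfs and the secret file path are placeholders, not from the thread:

    uname -r                        # should report 3.14 or newer per the advice above
    mkdir -p /mnt/cephfs
    mount -t ceph mon1:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret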

[ceph-users] resizing the OSD

2014-09-05 Thread JIten Shah
Hello Cephers, We created a ceph cluster with 100 OSDs, 5 MONs and 1 MDS and most of the stuff seems to be working fine, but we are seeing some degradation on the OSDs due to lack of space on them. Is there a way to resize the OSDs without bringing the cluster down? --jiten
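
A quick way to see how tight the OSDs actually are before deciding anything (standard commands, nothing cluster-specific assumed):

    ceph health detail              # near-full / full OSDs are flagged here
    ceph df                         # global and per-pool usage
    ceph osd tree                   # check that CRUSH weights match disk sizes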

Re: [ceph-users] region creation is failing

2014-09-05 Thread John Wilkins
librados indicates communication between radosgw and the Ceph Storage Cluster. So the authentication error is likely due to the key you have set up using this procedure: http://ceph.com/docs/master/radosgw/federated-config/#create-a-keyring Check to see if you have the keys you generated imported
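
For context, the keyring steps from that page look roughly like this, reusing the client.radosgw.in-east-1 name from the original post; the paths are the usual defaults and may differ in your setup:

    ceph-authtool --create-keyring /etc/ceph/ceph.client.radosgw.keyring
    chmod +r /etc/ceph/ceph.client.radosgw.keyring
    ceph-authtool /etc/ceph/ceph.client.radosgw.keyring \
        -n client.radosgw.in-east-1 --gen-key
    ceph-authtool -n client.radosgw.in-east-1 \
        --cap osd 'allow rwx' --cap mon 'allow rwx' \
        /etc/ceph/ceph.client.radosgw.keyring
    # import the key into the cluster so the gateway can authenticate
    ceph auth add client.radosgw.in-east-1 \
        -i /etc/ceph/ceph.client.radosgw.keyring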

[ceph-users] ceph add flag hashspool

2014-09-05 Thread Frantisek Drabecky
Hi guys, I have Ceph storage Firefly v0.80.5 on a Debian server with kernel 3.2.0-4-amd64. We use the ceph-fs client on a Debian server with the 3.2.0-4-amd64 kernel too. I had to remove the HASHPSPOOL flag from all pools so that it would be possible to mount Ceph on my Ceph client. Everything worked well. Toda
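
Toggling the flag per pool is a one-liner; this is a generic sketch with "rbd" as an example pool name, not taken from the post:

    ceph osd dump | grep hashpspool         # see which pools still carry the flag
    ceph osd pool set rbd hashpspool false  # clear it so the old 3.2 kernel client can map the pool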

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-05 Thread Wang, Warren
+1 to what Cedric said. Anything more than a few minutes of heavy sustained writes tended to get our solid state devices into a state where garbage collection could not keep up. Originally we used small SSDs and did not overprovision the journals by much. Manufacturers publish their SSD stats,
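
One common way to over-provision is simply to leave part of the SSD unpartitioned so the controller always has spare area for garbage collection; a sketch with made-up device and sizes:

    parted -s /dev/sdb mklabel gpt
    parted -s /dev/sdb mkpart journal-osd0 1MiB 11GiB
    parted -s /dev/sdb mkpart journal-osd1 11GiB 21GiB
    # the rest of the device is deliberately left unallocated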

[ceph-users] Good way to monitor detailed latency/throughput

2014-09-05 Thread Josef Johansson
Hi, How do you guys monitor the cluster to find disks that behave badly, or VMs that impact the Ceph cluster? I'm looking for something where I could get a good bird's-eye view of latency/throughput, that uses something easy like SNMP. Regards, Josef Johansson
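
One low-effort starting point is the per-daemon admin socket, which already exposes latency counters as JSON that any collector (or an SNMP agent wrapper) can scrape; osd.3 below is just an example daemon:

    ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok perf dump
    # look at counters such as op_r_latency / op_w_latency under "osd" and
    # journal_latency under "filestore" (exact names vary a bit by release)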

Re: [ceph-users] Fwd: Ceph Filesystem - Production?

2014-09-05 Thread James Devine
No messages in dmesg, I've updated the two clients to 3.16, we'll see if that fixes this issue. On Fri, Sep 5, 2014 at 12:28 AM, Yan, Zheng wrote: > On Fri, Sep 5, 2014 at 8:42 AM, James Devine wrote: > > I'm using 3.13.0-35-generic on Ubuntu 14.04.1 > > > > Was there any kernel message when t

Re: [ceph-users] Need help : MDS cluster completely dead !

2014-09-05 Thread Florent Bautista
On 09/05/2014 02:16 PM, Yan, Zheng wrote: > On Fri, Sep 5, 2014 at 4:05 PM, Florent Bautista wrote: >> Firefly :) last release. >> >> After few days, second MDS is still "stopping" and consuming CPU >> sometimes... :) > Try restarting the stopping MDS and run "ceph mds stop 1" again. "service ce
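
The sequence being suggested, spelled out (sysvinit-style service names as in Firefly; adjust to your init setup):

    service ceph restart mds        # on the host running the stuck MDS
    ceph mds stop 1                 # ask rank 1 to stop again
    ceph mds stat                   # watch whether the rank actually leaves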

Re: [ceph-users] Need help : MDS cluster completely dead !

2014-09-05 Thread Yan, Zheng
On Fri, Sep 5, 2014 at 4:05 PM, Florent Bautista wrote: > Firefly :) last release. > > After few days, second MDS is still "stopping" and consuming CPU > sometimes... :) Try restarting the stopping MDS and run "ceph mds stop 1" again. > > On 09/04/2014 09:13 AM, Yan, Zheng wrote: >> which versio

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
> On 05 Sep 2014, at 11:04, Christian Balzer wrote: > > > Hello Dan, > > On Fri, 5 Sep 2014 07:46:12 + Dan Van Der Ster wrote: > >> Hi Christian, >> >>> On 05 Sep 2014, at 03:09, Christian Balzer wrote: >>> >>> >>> Hello, >>> >>> On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote: >

Re: [ceph-users] Huge issues with slow requests

2014-09-05 Thread Luis Periquito
Only time I saw such behaviour was when I was deleting a big chunk of data from the cluster: all the client activity was reduced, the op/s were almost non-existent and there were unexplained delays all over the cluster. But all the disks were somewhat busy in atop/iostat. On 5 September 2014 09:5
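
When this happens it can help to see what the blocked ops are actually waiting on; osd.7 below is a placeholder for whichever OSD health detail points at:

    ceph health detail | grep -i -e slow -e blocked   # which OSDs own the slow requests
    ceph --admin-daemon /var/run/ceph/ceph-osd.7.asok dump_ops_in_flight
    ceph --admin-daemon /var/run/ceph/ceph-osd.7.asok dump_historic_ops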

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
> On 05 Sep 2014, at 10:30, Nigel Williams wrote: > > On Fri, Sep 5, 2014 at 5:46 PM, Dan Van Der Ster > wrote: >>> On 05 Sep 2014, at 03:09, Christian Balzer wrote: >>> You might want to look into cache pools (and dedicated SSD servers with >>> fast controllers and CPUs) in your test cluster

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Christian Balzer
Hello Dan, On Fri, 5 Sep 2014 07:46:12 + Dan Van Der Ster wrote: > Hi Christian, > > > On 05 Sep 2014, at 03:09, Christian Balzer wrote: > > > > > > Hello, > > > > On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote: > > > >> On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster > >> wrote

Re: [ceph-users] Huge issues with slow requests

2014-09-05 Thread David
Hi, Indeed strange. That output was when we had issues; it seems that most operations were blocked / slow requests. A "baseline" output is more like today: 2014-09-05 10:44:29.123681 mon.0 [INF] pgmap v12582759: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 92

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Nigel Williams
On Fri, Sep 5, 2014 at 5:46 PM, Dan Van Der Ster wrote: >> On 05 Sep 2014, at 03:09, Christian Balzer wrote: >> You might want to look into cache pools (and dedicated SSD servers with >> fast controllers and CPUs) in your test cluster and for the future. >> Right now my impression is that there i

[ceph-users] region creation is failing

2014-09-05 Thread Santhosh Fernandes
Hi All, I am trying to configure Ceph with 2 OSD nodes, one MON, one admin, and one object gateway node. radosgw-admin region set --infile in.json --name client.radosgw.in-east-1 2014-09-05 13:48:45.133983 7f7dda4c57c0 0 librados: client.radosgw.in-east-1 authentication error (1) Operation not permitted co
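
A few sanity checks for that authentication error; the keyring path below is the conventional one and may differ on the gateway host:

    ceph auth get client.radosgw.in-east-1                    # does the cluster know this key at all?
    ceph-authtool -l /etc/ceph/ceph.client.radosgw.keyring    # does the local keyring hold it?
    grep -A3 'client.radosgw.in-east-1' /etc/ceph/ceph.conf   # is the gateway pointed at that keyring?

The key in the local keyring and the one registered in the cluster have to match, and the caps need to allow mon/osd access.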

Re: [ceph-users] Need help : MDS cluster completely dead !

2014-09-05 Thread Florent Bautista
Firefly :) last release. After few days, second MDS is still "stopping" and consuming CPU sometimes... :) On 09/04/2014 09:13 AM, Yan, Zheng wrote: > which version of MDS are you using? > > On Wed, Sep 3, 2014 at 10:48 PM, Florent Bautista wrote: >> Hi John and thank you for your answer. >> >> I

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
Hi Christian, > On 05 Sep 2014, at 03:09, Christian Balzer wrote: > > > Hello, > > On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote: > >> On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster >> wrote: >> >>> >>> >>> 1) How often are DC S3700's failing in your deployments? >>> >> >> None
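
For tracking wear on journal SSDs, SMART is usually enough; attribute names are vendor-specific (Intel DC drives typically expose a media wearout indicator), so treat this as a sketch:

    smartctl -A /dev/sdb | egrep -i 'wearout|wear_leveling|reallocated'
    # trend these values over time rather than reading them once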

Re: [ceph-users] [Ceph-community] Ceph Day Paris Schedule Posted

2014-09-05 Thread Loic Dachary
I think it will be English, unless the audience is 100% French-speaking ;-) On 05/09/2014 08:12, Alexandre DERUMIER wrote: > I was waiting for the schedule, topics seem to be interesting. > > I'm going to register now :) > > BTW, are the speeches in French or English? (As I see loic, sebastian an

[ceph-users] ceph osd unexpected error

2014-09-05 Thread 廖建锋
Dear Ceph, urgent question: I hit a "FAILED assert(0 == "unexpected error")" yesterday, and now I have no way to start these OSDs. I have attached my logs in the attachment, and some ceph configuration values are below: osd_pool_default_pgp_num = 300 osd_pool_default_size = 2 osd_pool_default_mi
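
It usually helps to rerun the failing OSD with verbose logging so the assert has some context; osd.12 and the paths below are illustrative, not from the post:

    # in ceph.conf on that host, under [osd]:
    #   debug osd = 20
    #   debug filestore = 20
    #   debug journal = 20
    service ceph start osd.12
    less /var/log/ceph/ceph-osd.12.log   # the lines just before the assert are the interesting part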

Re: [ceph-users] Huge issues with slow requests

2014-09-05 Thread Christian Balzer
Hello, On Fri, 5 Sep 2014 08:26:47 +0200 David wrote: > Hi, > > Sorry for the lack of information yesterday, this was "solved" after > some 30 minutes, once all OSD daemons had been reloaded/restarted. > Unfortunately we couldn't pinpoint it to a single OSD or drive; all > drives seemed OK, som
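
A quick per-OSD latency comparison can help with the pinpointing next time; nothing assumed beyond a Firefly-or-later cluster:

    ceph osd perf        # fs_commit_latency / fs_apply_latency per OSD, in ms
    # an OSD whose latencies sit far above its peers is the usual suspect;
    # cross-check with iostat -x / atop on that host before restarting daemons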