[ceph-users] Ceph metadata

2019-01-29 Thread F B
Hi, I'm looking for some details about the limits of the metadata used by Ceph. I found some restrictions from XFS: - Max total keys/values size: 64 kB - Max key size: 255 bytes. Does Ceph have limits for this metadata? Thanks in advance! Fabien BELLEGO

Re: [ceph-users] Ceph on Public IP

2018-01-09 Thread nithish B
Hello John, Thank you for the clarification. I am using Google Cloud Platform for this setup and I don't think I can assign a public IP directly to an interface there. Hence the question. Thanks On Jan 8, 2018 1:51 PM, "John Petrini" wrote: > ceph will always bind to the local IP. It can't bind

[ceph-users] Ceph on Public IP

2018-01-07 Thread nithish B
://docs.ceph.com/docs/master/rados/configuration/network-config-ref/ but I still face the issue. Any directions in this regard will be helpful. Thanks & Regards, Nitish B.

[ceph-users] Failed to read JournalPointer - MDS error (mds rank 0 is damaged)

2017-04-29 Thread Martin B Nielsen
Hi, We're using ceph 10.2.5 and cephfs. We had a weird monitor (mon0r0) which had some sort of meltdown as current active mds node. The monitor node called elections on/off over ~1 hour, sometimes with 5-10min between. On every occasion mds was also doing a replay, reconnect, rejoin => active (

Re: [ceph-users] unauthorized to list radosgw swift container objects

2016-09-12 Thread B, Naga Venkata
If somebody hits this issue, it can be resolved by creating a subuser: radosgw-admin subuser create --uid=s3User --subuser="s3User:swiftUser" --access=full Thanks & Regards, Naga Venkata From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of B, Naga Venkat
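A minimal sketch of the follow-up, assuming the subuser still needs a swift secret and that the standard python-swiftclient is installed; the gateway host and the secret below are placeholders:
$ radosgw-admin key create --subuser=s3User:swiftUser --key-type=swift --gen-secret
$ swift -A http://<rgw-host>/auth/1.0 -U s3User:swiftUser -K <swift-secret> list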

Re: [ceph-users] swiftclient call radosgw, it always response 401 Unauthorized

2016-09-12 Thread B, Naga Venkata
Are the parameters you configured for keystone in ceph.conf correct? Can you provide your radosgw configuration from ceph.conf? And include radosgw.log after a radosgw service restart and during a swift list. Thanks & Regards, Naga Venkata From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On

[ceph-users] unauthorized to list radosgw swift container objects

2016-09-09 Thread B, Naga Venkata
Hi all, After upgrading from firefly (0.80.7) to hammer (0.94.7), I am unable to list objects in containers for a radosgw swift user, but I am able to list containers for the same user. I have created the user using radosgw-admin user create --subuser=s3User:swiftUser --display-name="First User" --ke

Re: [ceph-users] 403-Forbidden error using radosgw

2015-07-21 Thread B, Naga Venkata
Hi Lakshmi, Is your issue solved? Can you please let me know if you solved it, because I am also having the same issue. Thanks & Regards, Naga Venkata

Re: [ceph-users] 403-Forbidden error using radosgw

2015-06-30 Thread B, Naga Venkata
I am also having the same issue, can somebody help me out? But for me it is "HTTP/1.1 404 Not Found".

Re: [ceph-users] radosgw socket is not created

2015-06-23 Thread B, Naga Venkata
Then use http://docs.ceph.com/docs/v0.80.5/radosgw/config/, it will work for hammer also. Thanks & Regards, venkat From: Makkelie, R (ITCDCC) - KLM [mailto:ramon.makke...@klm.com] Sent: Tuesday, June 23, 2015 1:47 PM To: B, Naga Venkata Cc: ceph-users@lists.ceph.com Subject: Re: radosgw so

Re: [ceph-users] radosgw socket is not created

2015-06-22 Thread B, Naga Venkata
Follow this doc http://docs.ceph.com/docs/v0.80.5/radosgw/config/ if you are using firefly. Thanks & Regards, venkat From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Makkelie, R (ITCDCC) - KLM Sent: Monday, June 22, 2015 8:22 PM To: ceph-users@lists.ceph.com Subject: [ce

Re: [ceph-users] Hammer 0.94.2: Error when running commands on CEPH admin node

2015-06-18 Thread B, Naga Venkata
Do you have admin keyring in /etc/ceph directory? From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Teclus Dsouza -X (teclus - TECH MAHINDRA LIM at Cisco) Sent: Thursday, June 18, 2015 10:35 PM To: ceph-users@lists.ceph.com Subject: [ceph-users] Hammer 0.94.2: Error when ru

Re: [ceph-users] 403-Forbidden error using radosgw

2015-06-18 Thread B, Naga Venkata
I am also having the same issue, can somebody help me out? But for me it is "HTTP/1.1 404 Not Found".

[ceph-users] query on ceph-deploy command

2015-06-11 Thread Vivek B
Hi, I am trying to deploy ceph-hammer on 4 nodes (admin, monitor and 2 OSDs). My servers are behind a proxy server, so when I need to run an apt-get update I need to export our proxy settings. When I run the command "ceph-deploy install osd1 osd2 mon1", since all three nodes are behind the proxy th

Re: [ceph-users] Find out the location of OSD Journal

2015-05-07 Thread Martin B Nielsen
Hi, Inside your mounted osd there is a symlink - journal - pointing to a file or disk/partition used with it. Cheers, Martin On Thu, May 7, 2015 at 11:06 AM, Patrik Plank wrote: > Hi, > > > i cant remember on which drive I install which OSD journal :-|| > Is there any command to show this? >
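A minimal sketch of checking it, assuming the default data path and osd.0 (adjust the id):
$ readlink -f /var/lib/ceph/osd/ceph-0/journal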

Re: [ceph-users] CephFS: delayed objects deletion ?

2015-03-16 Thread Florent B
On 03/16/2015 05:14 PM, John Spray wrote: > With CephFS we have a special definition of "old" that is anything > that doesn't have the very latest bug fixes ;-) > > There have definitely been fixes to stray file handling[1] between > giant and hammer. Since with giant you're using a version that i

Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

2015-02-28 Thread Martin B Nielsen
the osds with Samsung journal drive compared with the Intel drive on the > same server. Something like 2-3ms for Intel vs 40-50ms for Samsungs. > > At some point we had enough with Samsungs and scrapped them. > > Andrei > > -- > > *From: *"

Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

2015-02-28 Thread Martin B Nielsen
Hi, I cannot recognize that picture; we've been using Samsung 840 Pro in production for almost 2 years now - and have had 1 fail. We run an 8-node mixed ssd/platter cluster with 4x Samsung 840 Pro (500GB) in each, so that is 32x ssd. They've written ~25TB data in avg each. Using the dd you had ins

[ceph-users] Radosgw keeps writing to specific OSDs wile there other free OSDs

2015-02-21 Thread B L
Hi Ceph community, I'm trying to upload a file of 5GB size through radosgw. I have 9 OSDs deployed on 3 machines, and my cluster is healthy. The problem is: the 5GB file is being uploaded to osd.0 and osd.1, which are near full, while the other OSDs have more space that could hold this file

[ceph-users] My PG is UP and Acting, yet it is unclean

2015-02-17 Thread B L
Hi All, I have a group of PGs that are up and acting, yet they are not clean, causing the cluster to be in a warning mode, i.e. not healthy. This is my cluster status: $ ceph -s cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d health HEALTH_WARN 203 pgs stuck unclean; recovery 6/132 obje

Re: [ceph-users] Having problem to start Radosgw

2015-02-16 Thread B L
so I had to suffer a little, since it was my first experience to install RGW and add it to the cluster. Now we can run it like: sudo service radosgw start — or — sudo /etc/init.d/radosgw start And everything should work .. Thanks Yehuda for your support .. Beanos! > On Feb 15, 2015,

Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
. > > Yehuda > > - Original Message - >> From: "B L" >> To: "Yehuda Sadeh-Weinraub" >> Cc: ceph-users@lists.ceph.com >> Sent: Saturday, February 14, 2015 2:56:54 PM >> Subject: Re: [ceph-users] Having problem to start Radosgw

Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
without using RGW) Best! > On Feb 15, 2015, at 12:39 AM, B L wrote: > > That’s what I usually do to check if rgw is running with no problems: sudo > radosgw -c ceph.conf -d > > I already pumped up the log level, but I can’t see any change or verbosity > level increase o

Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
w if I can do something more .. Now I have 2 questions: 1- What RADOS user do you refer to? 2- How would I know that I am using the wrong cephx keys unless I see an authentication error or a relevant warning? Thanks! Beanos > On Feb 14, 2015, at 11:29 PM, Yehuda Sadeh-Weinraub wrote: > > > > F

Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
Hello Yehuda, The strace command you referred me to shows this: https://gist.github.com/anonymous/8e9f1ced485996a263bb Additionally, I traced this log file: /var/log/radosgw/ceph-client.radosgw.gateway and it has the following: 2015-02-12

Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
Shall I run it like this: sudo radosgw -c ceph.conf -d strace -F -T -tt -o/tmp/strace.out radosgw -f > On Feb 14, 2015, at 6:55 PM, Yehuda Sadeh-Weinraub wrote: > > strace -F -T -tt -o/tmp/strace.out radosgw -f

Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
<https://gist.github.com/anonymous/90b77c168ed0606db03d> Please let me know if you need something else? Best! > On Feb 14, 2015, at 6:22 PM, Yehuda Sadeh-Weinraub wrote: > > > > - Original Message - >> From: "B L" >> To: ceph-users@lists.ceph.c

[ceph-users] Having problem to start Radosgw

2015-02-13 Thread B L
Hi all, I’m having a problem starting radosgw; it gives me an error that I can’t diagnose: $ radosgw -c ceph.conf -d 2015-02-14 07:46:58.435802 7f9d739557c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 27609 2015-02-14 07:46:58.437284 7f9d739557c0 -1 asok(0

[ceph-users] Can't add RadosGW keyring to the cluster

2015-02-12 Thread B L
Hi all, Trying to do this: ceph -k ceph.client.admin.keyring auth add client.radosgw.gateway -i ceph.client.radosgw.keyring Getting this error: Error EINVAL: entity client.radosgw.gateway exists but key does not match What can this be?? Thanks! Beanos
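One common way around this error, assuming the keyring file holds the key you actually want to use (note that deleting the entity drops its existing key and caps), is to remove the old entity and re-add it:
$ ceph auth del client.radosgw.gateway
$ ceph auth add client.radosgw.gateway -i ceph.client.radosgw.keyring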

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-11 Thread B L
). > > > Best wishes, > Vickie > > 2015-02-10 22:25 GMT+08:00 B L <mailto:super.itera...@gmail.com>>: > Thanks for everyone!! > > After applying the re-weighting command (ceph osd crush reweight osd.0 > 0.0095), my cluster is getting healthy now :)) > &

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
> Regards, > Vikhyat > > On 02/10/2015 07:31 PM, B L wrote: >> Thanks Vikhyat, >> >> As suggested .. >> >> ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0 >> >> Invalid command: osd.0 doesn't represent a float >> osd

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Thanks Vikhyat, As suggested .. ceph@ceph-node1:/home/ubuntu$ ceph osd crush reweight 0.0095 osd.0 Invalid command: osd.0 doesn't represent a float osd crush reweight <name> <weight>: change <name>'s weight to <weight> in crush map Error EINVAL: invalid command What do you think > On Feb 10, 2015, at 3:18 PM, Vikhy
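For reference, the CLI expects the OSD name first and the weight (a float) second; this is the form applied later in the thread:
$ ceph osd crush reweight osd.0 0.0095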

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
D. This means your OSD must be 10GB > or greater! > > Udo > > On 10.02.2015 12:22, B L wrote: >> Hi Vickie, >> >> My OSD tree looks like this: >> >> ceph@ceph-node3:/home/ubuntu$ ceph osd tree >> # id weight type name up/down reweight >>

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
ose changes mean 2- How changing replication size can cause the cluster to be unhealthy Thanks Vickie! Beanos > On Feb 10, 2015, at 1:28 PM, B L wrote: > > I changed the size and min_size as you suggested while watching ceph -w in > a different window, and I got this:

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
11:23:40.769794 mon.0 [INF] pgmap v94: 256 pgs: 256 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail 2015-02-10 11:23:45.530713 mon.0 [INF] pgmap v95: 256 pgs: 256 active+degraded; 0 bytes data, 200 MB used, 18165 MB / 18365 MB avail > On Feb 10, 2015, at 1:24 PM, B L

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
I will try to change the replication size now as you suggested .. but how is that related to the non-healthy cluster? > On Feb 10, 2015, at 1:22 PM, B L wrote: > > Hi Vickie, > > My OSD tree looks like this: > > ceph@ceph-node3:/home/ubuntu$ ceph osd tree > # id we

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
..@gmail.com>>: > Hi Beanos: > So you have 3 OSD servers and each of them has 2 disks. > I have a question: what is the result of "ceph osd tree"? It looks like the osd status > is "down". > > > Best wishes, > Vickie > > 2015-02-10 19:00 GMT+08:00

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
08a-8022-6397c78032be osd.5 up in weight 1 up_from 22 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019 172.31.3.56:6807/7019 172.31.3.56:6808/7019 exists,up da67b604-b32a-44a0-9920-df0774ad2ef3 > On Feb 10, 2015, at 12:55 PM, B L wrote: > >

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
> On Feb 10, 2015, at 12:37 PM, B L wrote: > > Hi Vickie, > > Thanks for your reply! > > You can find the dump in this link: > > https://gist.github.com/anonymous/706d4a1ec81c93fd1eca > >

[ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread B L
Having problem with my fresh non-healthy cluster, my cluster status summary shows this: ceph@ceph-node1:~$ ceph -s cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs stuck unclean; pool data pg_num 128 > pgp_num 64 m

Re: [ceph-users] error adding OSD to crushmap

2015-01-14 Thread Martin B Nielsen
Hi Luis, I might remember wrong, but don't you need to actually create the osd first (ceph osd create)? Then you can assign it a position using the CLI crush rules. Like Jason said, can you send the ceph osd tree output? Cheers, Martin On Mon, Jan 12, 2015 at 1:45 PM, Luis Periquito wrote: >
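A minimal sketch of that order of operations; the returned id, the weight and the host bucket below are placeholders:
$ ceph osd create                    # prints the new osd id, e.g. 12
$ ceph osd crush add osd.12 1.0 host=node1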

Re: [ceph-users] Deleting buckets and objects fails to reduce reported cluster usage

2014-11-27 Thread b
On 2014-11-27 11:36, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 3:49 PM, b wrote: On 2014-11-27 10:21, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 3:09 PM, b wrote: On 2014-11-27 09:38, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 2:32 PM, b wrote: I've been deleting a bucket

[ceph-users] S3CMD and Ceph

2014-11-26 Thread b
I'm having some issues with a user in ceph using S3 Browser and s3cmd. It was previously working. I can no longer use s3cmd to list the contents of a bucket; I am getting 403 and 405 errors. When using S3 Browser, I can see the contents of the bucket and I can upload files, but I cannot create addit

Re: [ceph-users] Deleting buckets and objects fails to reduce reported cluster usage

2014-11-26 Thread b
On 2014-11-27 11:36, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 3:49 PM, b wrote: On 2014-11-27 10:21, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 3:09 PM, b wrote: On 2014-11-27 09:38, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 2:32 PM, b wrote: I've been deleting a bucket

Re: [ceph-users] Deleting buckets and objects fails to reduce reported cluster usage

2014-11-26 Thread b
On 2014-11-27 10:21, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 3:09 PM, b wrote: On 2014-11-27 09:38, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 2:32 PM, b wrote: I've been deleting a bucket which originally had 60TB of data in it, with our cluster doing only 1 replication, the

Re: [ceph-users] Deleting buckets and objects fails to reduce reported cluster usage

2014-11-26 Thread b
On 2014-11-27 09:38, Yehuda Sadeh wrote: On Wed, Nov 26, 2014 at 2:32 PM, b wrote: I've been deleting a bucket which originally had 60TB of data in it, with our cluster doing only 1 replication, the total usage was 120TB. I've been deleting the objects slowly using S3 browser, and

[ceph-users] Deleting buckets and objects fails to reduce reported cluster usage

2014-11-26 Thread b
I've been deleting a bucket which originally had 60TB of data in it, with our cluster doing only 1 replication, the total usage was 120TB. I've been deleting the objects slowly using S3 browser, and I can see the bucket usage is now down to around 2.5TB or 5TB with duplication, but the usage i

Re: [ceph-users] SSD MTBF

2014-10-07 Thread Martin B Nielsen
A bit late getting back on this one. On Wed, Oct 1, 2014 at 5:05 PM, Christian Balzer wrote: > > smartctl states something like > > Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I > > think that is ~30TB/day if I'm doing the calc right. > > > Something very much does not ad

Re: [ceph-users] SSD MTBF

2014-10-01 Thread Martin B Nielsen
Hi, We settled on Samsung 840 Pro 240GB drives 1½ years ago and we've been happy so far. We've over-provisioned them a lot (left 120GB unpartitioned). We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far. smartctl states something like Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB
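A minimal sketch of pulling those figures, assuming smartmontools is installed; the device name is a placeholder and the exact attribute names vary by vendor and firmware:
$ sudo smartctl -A /dev/sda | grep -i -e wear -e written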

Re: [ceph-users] resizing the OSD

2014-09-09 Thread Martin B Nielsen
Hi, Or did you mean some OSD are near full while others are under-utilized? On Sat, Sep 6, 2014 at 5:04 PM, Christian Balzer wrote: > > Hello, > > On Fri, 05 Sep 2014 15:31:01 -0700 JIten Shah wrote: > > > Hello Cephers, > > > > We created a ceph cluster with 100 OSD, 5 MON and 1 MSD and most o

Re: [ceph-users] Huge issues with slow requests

2014-09-04 Thread Martin B Nielsen
Just echoing what Christian said. Also, iirc the "currently waiting for subops on [" could also mean a problem on those as it waits for ack from them (I might remember wrong). If that is the case you might want to check in on osd 13 & 37 as well. With the cluster load and size you should not hav

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Martin B Nielsen
On Thu, Sep 4, 2014 at 10:23 PM, Dan van der Ster wrote: > Hi Martin, > > September 4 2014 10:07 PM, "Martin B Nielsen" wrote: > > Hi Dan, > > > > We took a different approach (and our cluster is tiny compared to many > others) - we have two pools; > &

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Martin B Nielsen
Hi Dan, We took a different approach (and our cluster is tiny compared to many others) - we have two pools; normal and ssd. We use 14 disks in each osd-server; 8 platter and 4 ssd for ceph, and 2 ssd for OS/journals. We partitioned the two OS ssd as raid1 using about half the space for the OS and

Re: [ceph-users] One stuck PG

2014-09-04 Thread Martin B Nielsen
Hi Erwin, Did you try restarting the primary osd for that pg (24)? Sometimes it needs a little nudge that way. Otherwise, what does ceph pg dump say about that pg? Cheers, Martin On Thu, Sep 4, 2014 at 9:00 AM, Erwin Lubbers wrote: > Hi, > > My cluster is giving one stuck pg which seems t
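A minimal sketch of those checks, assuming a sysvinit-style ceph init script; <pgid> and the osd id are placeholders:
$ ceph pg dump_stuck unclean             # stuck PGs with their acting sets
$ ceph pg <pgid> query                   # detailed state of the stuck PG
$ sudo service ceph restart osd.<id>     # restart the PG's primary OSD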

[ceph-users] Sometimes Monitors failed to join the cluster

2014-08-13 Thread Liucheng (B)
I deployed Ceph with chef, but sometimes Monitors failed to join the cluster. The setup steps: First, I deployed monitors in two hosts (lc001 and lc003) and I succeeded. Then, I added two Monitors (lc002 and lc004) to the cluster about 30 minutes later. I used the same ceph-cookbook, but three diffe

Re: [ceph-users] Ceph Not getting into a clean state

2014-05-09 Thread Martin B Nielsen
Hi, I experienced exactly the same with 14.04 and the 0.79 release. It was a fresh clean install with default crushmap and ceph-deploy install as per the quick-start guide. Oddly enough, changing replica size (incl. min_size) from 3 -> 2 (and 2 -> 1) and back again made it work. I didn't have time to l

Re: [ceph-users] Red Hat to acquire Inktank

2014-05-01 Thread Martin B Nielsen
First off, congrats to inktank! I'm sure having Redhat backing the project it will see even quicker development. My only worry is support for future non-RHEL platforms; like many others we've built our ceph stack around ubuntu and I'm just hoping it won't deteriorate into something like how it is

[ceph-users] radosgw return http 500 internal error

2014-04-08 Thread Liucheng (B)
Hello! I am new to ceph, and I set up the ceph object gateway following the guide (http://ceph.com/docs/master/radosgw/config/). But when I started the radosgw daemon, two warnings came out: WARNING: libcurl doesn't support curl_multi_wait() WARNING: cross zone / region transfer performance may be a

Re: [ceph-users] Debian 7 : fuse unable to resolve monitor hostname

2014-04-04 Thread Florent B
ed to FUSE. Patch included to my mail. Is it something that can be included in Ceph official sources ? On 04/04/2014 12:33 PM, Florent B wrote: > Hi all, > > My machines are all running Debian Wheezy. > > After a few days using kernel driver to mount my Ceph pools (with > bac

Re: [ceph-users] Live database files on Ceph

2014-04-04 Thread Martin B Nielsen
Hi, We're running mysql in multi-master cluster (galera), mysql standalones, postgresql, mssql and oracle db's on ceph RBD via QEMU/KVM. As someone else pointed out it is usually faster with ceph, but sometimes you'll get some odd slow reads. Latency is our biggest enemy. Oracle comes with an aw

Re: [ceph-users] MDS crash when client goes to sleep

2014-04-03 Thread Florent B
tp://ceph.com > > > On Wed, Apr 2, 2014 at 6:34 AM, Florent B wrote: >> Can someone confirm that this issue is also in Emperor release (0.72) ? >> >> I think I have the same problem than hjcho616 : Debian Wheezy with 3.13 >> backports, and MDS dying when a client shutd

Re: [ceph-users] Setting root directory in fstab with Fuse

2014-04-03 Thread Florent B
tion here instead. If you search the list > archives it'll probably turn up; this is basically the only reason we > ever discuss "-r". ;) > Software Engineer #42 @ http://inktank.com | http://ceph.com > > > On Wed, Apr 2, 2014 at 5:10 AM, Florent B wrote: >> Hi

Re: [ceph-users] MDS debugging

2014-03-31 Thread Martin B Nielsen
14-03-31 13:36:38.049286 mon.0 [INF] pgmap v265872: 1300 pgs: 1300 > active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 4069 > B/s rd, 363 kB/s wr, 24 op/s > 2014-03-31 13:36:39.057680 mon.0 [INF] pgmap v265873: 1300 pgs: 1300 > active+clean; 19872 GB data, 59953 GB used

Re: [ceph-users] help, add mon failed lead to cluster failure

2014-03-26 Thread Martin B Nielsen
Hi, I experienced this from time to time with older releases of ceph, but haven't stumbled upon it for some time. Often I had to revert to the older state by using: http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors-from-an-unhealthy-cluster and dump the monlist, find

Re: [ceph-users] OSD Restarts cause excessively high load average and "requests are blocked > 32 sec"

2014-03-23 Thread Martin B Nielsen
Hi, I can see ~17% hardware interrupts which I find a little high - can you make sure all load is spread over all your cores (/proc/interrupts)? What about disk util once you restart them? Are they all 100% utilized or is it 'only' mostly cpu-bound? Also you're running a monitor on this node - h

Re: [ceph-users] Fluctuating I/O speed degrading over time

2014-03-08 Thread Martin B Nielsen
I'll try accessing the ticket Monday to get all the details if it is still there. Cheers, Martin > > Looking forward to your reply, thank you. > > Cheers. > > > On Fri, Mar 7, 2014 at 6:10 PM, Martin B Nielsen wrote: > >> Hi, >> >> I'd probably start b

Re: [ceph-users] Fluctuating I/O speed degrading over time

2014-03-07 Thread Martin B Nielsen
Hi, I'd probably start by looking at your nodes and check if the SSDs are saturated or if they have high write access times. If any of that is true, does that account for all SSD or just some of them? Maybe some of the disks needs a trim. Maybe test them individually directly on the cluster. If y

Re: [ceph-users] Trying to rescue a lost quorum

2014-03-01 Thread Martin B Nielsen
se versions... or > am I reading this wrong? > > KR, > Marc > > On 28/02/2014 01:32, Gregory Farnum wrote: > > On Thu, Feb 27, 2014 at 4:25 PM, Marc wrote: > >> Hi, > >> > >> I was handed a Ceph cluster that had just lost quorum due to 2/3 mons > >

Re: [ceph-users] questions about monitor data and ceph recovery

2014-02-25 Thread Martin B Nielsen
Hi Pavel, Will try and answer some of your questions: My first question will be about monitor data directory. How much space I > need to reserve for it? Can monitor-fs be corrupted if monitor goes out of > storage space? > We have about 20GB partitions for monitors - they really don't use much s

Re: [ceph-users] pages stuck unclean (but remapped)

2014-02-23 Thread Martin B Nielsen
p v1904820: 1500 pgs, 1 pools, 10531 GB data, 2670 kobjects > 18708 GB used, 26758 GB / 45467 GB avail > 42959/5511127 objects degraded (0.779%) > 1481 active+clean > 19 active+remapped+backfilling > client io 1457 B/s wr, 0 o

Re: [ceph-users] can one slow hardisk slow whole cluster down?

2014-01-29 Thread Martin B Nielsen
Hi, At least it used to be like that - I'm not sure if that has changed. I believe this is also part of why it is advised to go with the same kind of hw and setup if possible. Since at least rbd images are spread in objects throughout the cluster you'll prob. have to wait for a slow disk when readin

Re: [ceph-users] many blocked requests when recovery

2013-12-09 Thread Martin B Nielsen
Hi, You didn't state what version of ceph or kvm/qemu you're using. I think it wasn't until qemu 1.5.0 (1.4.2+?) that an async patch from inktank was accepted into mainstream which significantly helps in situations like this. If not using that on top of not limiting recovery threads you'll prob.
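A minimal sketch of throttling recovery at runtime, assuming these options exist in the release in use (the values are illustrative):
$ ceph osd tell \* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'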

Re: [ceph-users] Big or small node ?

2013-11-20 Thread Martin B Nielsen
Hi, I'd almost always go with more, less beefy nodes rather than bigger ones. You're much more vulnerable if the big one(s) die, and replication will not impact your cluster as much. I also find it easier to extend a cluster with smaller nodes. At least it feels like you can grow at a smoother rate

Re: [ceph-users] HDD bad sector, pg inconsistent, no object remapping

2013-11-13 Thread Martin B Nielsen
Probably common sense, but I was bitten by this once in a similar situation. If you run 3x replica and distribute them over 3x hosts (is that the default now?), make sure that the disks on the host with the failed disk have space for it - the remaining two disks will have to hold the content of the fa

Re: [ceph-users] SSD question

2013-10-21 Thread Martin B Nielsen
Hi, Plus reads will still come from your non-SSD disks unless you're using something like flashcache in front and as Greg said, having much more IOPS available for your db often makes a difference (depending on load, usage etc ofc). We're using Samsung Pro 840 256GB pretty much like Martin descri

Re: [ceph-users] Ceph with high disk densities?

2013-10-07 Thread Martin B Nielsen
Hi Scott, Just some observations from here. We run 8 nodes, 2U units with 12x OSD each (4x 500GB ssd, 8x 4TB platter) attached to 2x LSI 2308 cards. Each node uses an intel E5-2620 with 32G mem. Granted, we only have like 25 VM (some fairly io-hungry, both iops and throughput-wise though) on tha

Re: [ceph-users] Hardware recommendations

2013-08-26 Thread Martin B Nielsen
Hi Shain, Those R515 seem to mimic our servers (2U supermicro w. 12x 3.5" bays and 2x 2.5" in the rear for OS). Since we need a mix of SSD & platter we have 8x 4TB drives and 4x 500GB SSD + 2x 250GB SSD for OS in each node (2x 8-port LSI 2308 in IT-mode) We've partitioned 10GB from each 4x 500GB

Re: [ceph-users] performance questions

2013-08-20 Thread Martin B Nielsen
Hi Jeff, I would be surprised as well - we initially tested on a 2-replica cluster with 8 nodes having 12 osd each - and went to 3-replica as we re-built the cluster. The performance seems to be where I'd expect it (doing consistent writes in a rbd VM @ ~400MB/sec on 10GbE which I'd expect is eit

Re: [ceph-users] Ceph instead of RAID

2013-08-13 Thread Martin B Nielsen
Hi, I'd just like to echo what Wolfgang said about ceph being a complex system. I initially started out testing ceph with a setup much like yours. And while it overall performed ok, it was not as good as sw raid on the same machine. Also, as Mark said you'll have at very best half write speeds b

Re: [ceph-users] Rebuild the monitor infrastructure

2013-04-23 Thread Martin B Nielsen
Hi Bryan, I asked the same question a few months ago: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-February/000221.html But basically, that is pretty bad; you'll be stuck on your own and would need to get in contact with Inktank - they might be able to help rebuild a monitor for you.

Re: [ceph-users] Format 2 Image support in the RBD driver

2013-04-18 Thread Olivier B.
If I understand the roadmap correctly ( http://tracker.ceph.com/projects/ceph/roadmap ), it's planned for Ceph v0.62B : On Thursday, April 18, 2013 at 09:28 -0400, Whelan, Ryan wrote: > I've not been following the list for long, so forgive me if this has been > covered, but is there a plan for image 2 su

Re: [ceph-users] Ceph error: active+clean+scrubbing+deep

2013-04-16 Thread Martin B Nielsen
Hi Kakito, You def. _want_ scrubbing to happen! http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing If you feel it kills your system you can tweak some of the values; like: osd scrub load threshold osd scrub max interval osd deep scrub interval I have no experience in chan

Re: [ceph-users] How to calculate the capacity of a ceph cluster

2013-03-13 Thread Martin B Nielsen
Hi Ashish, Yep, that would be the correct way to do it. If you already have a cluster running, a ceph -s will also show usage, ie like: >ceph -s pgmap v1842777: 8064 pgs: 8064 active+clean; 1069 GB data, 2144 GB used, 7930 GB / 10074 GB avail; 3569B/s wr, 0op/s This is a small test-cluster with

Re: [ceph-users] debug_osd on/off on an active ceph cluster

2013-03-07 Thread Martin B Nielsen
Hi Charles, http://ceph.com/docs/master/rados/configuration/ceph-conf/#ceph-runtime-config has a great example. For all daemons of a type use * ( ceph osd tell \* injectargs '--debug-osd 20 --debug-ms 1' ) More about loglevels here: http://ceph.com/docs/master/rados/configuration/ceph-conf/#logs
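A minimal sketch of targeting a single daemon instead of all of them, e.g. to turn logging back down afterwards (osd.3 is a placeholder; older releases may prefer the ceph osd tell 3 injectargs form):
$ ceph tell osd.3 injectargs '--debug-osd 0 --debug-ms 0'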

Re: [ceph-users] Using different storage types on same osd hosts?

2013-03-06 Thread Martin B Nielsen
Hi, We did the opposite here; adding some SSD in free slots after having a normal cluster running with SATA. We just created a new pool for them and separated the two types. I used this as a template: http://ceph.com/docs/master/rados/operations/crush-map/?highlight=ssd#placing-different-pools-on

Re: [ceph-users] pgs inconsistent, scrub errors

2013-02-25 Thread Martin B Nielsen
## ceph -w >health HEALTH_ERR 1 pgs inconsistent; 1 pgs stuck unclean; recovery > 1/4240325 degraded (0.000%); 1 scrub errors >monmap e1: 3 mons at > {a=172.16.0.25:6789/0,b=172.16.0.24:6789/0,c=172.16.0.27:6789/0}, > election epoch 38, quorum 0,1,2 a,b,c >osdmap e102