[ceph-users] adding osd node best practice

2015-03-07 Thread tombo
Hi guys, I have a few questions regarding adding another OSD node to the cluster. I already have a production cluster with 7 mons and 72 OSDs; we use mainly librados to interact with the objects stored in Ceph. Our OSDs are 3TB WD disks and they reside on two servers (36 OSDs per server), so long story
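A common practice when adding an OSD node is to bring the new OSDs in at a low CRUSH weight and raise it in steps, while throttling backfill, so client traffic is not swamped by recovery. The sketch below assumes this approach; the OSD id, weights, and throttle values are examples, not taken from the thread.

```shell
# Throttle recovery before the new OSDs join (conservative example values):
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

# Start the new OSD well below its final weight (~2.7 for a 3TB disk):
ceph osd crush reweight osd.72 0.2

# Watch recovery settle, then step the weight up gradually:
ceph -s
ceph osd crush reweight osd.72 0.4
```

Repeating the reweight step until the target weight is reached spreads the data movement over time instead of triggering it all at once.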

Re: [ceph-users] adding osd node best practice

2015-03-08 Thread tombo
Hi, On 08.03.2015 04:32, Anthony D'Atri wrote: > 1) That's an awful lot of mons. Are they VMs or something? My sense is that mons >5 have diminishing returns at best. We have an application cluster with Ceph as the storage solution; the cluster consists of six servers, so we've installed a monitor on e

[ceph-users] increase pg num

2015-03-10 Thread tombo
Hello, I'm running Debian 8 with Ceph 0.80.6-1 (firefly) in production and I need to double the number of PGs. I've read that this was an experimental feature; is it safe now? Is the --allow-experimental-feature switch still required for the ceph osd pool set {pool-name} pg_num {pg_num} command? Thanks
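For sizing the new pg_num, the usual rule of thumb from the Ceph docs is roughly 100 PGs per OSD divided by the replica count, rounded up to the next power of two. A quick calculation for this cluster (72 OSDs; size=3 is an assumption, the thread does not state the replica count):

```shell
osds=72
replicas=3
target=$(( osds * 100 / replicas ))   # 2400 PGs as a rough target
# round up to the next power of two:
pg_num=1
while [ "$pg_num" -lt "$target" ]; do pg_num=$(( pg_num * 2 )); done
echo "$pg_num"   # 4096
```

Remember that pgp_num must be raised to match pg_num afterwards, or the new PGs will not actually be rebalanced.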

Re: [ceph-users] increase pg num

2015-03-10 Thread tombo
Thanks for the reply. On 10.03.2015 19:52, Weeks, Jacob (RIS-BCT) wrote: > I am not sure about v0.80.6-1, but in v0.80.7 the --allow-experimental-feature option is not required. I have increased pg_num and pgp_num in v0.80.7 without any issues. On how big a cluster, and how long did it take to recove

[ceph-users] osd replication

2015-03-12 Thread tombo
Hello, I need to understand how replication is accomplished, or rather who takes care of replication: the OSD itself? Because we use librados to read/write to the cluster. If librados does not do parallel writes according to the desired number of object copies, it could happen that objects are in journal
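For context on the question above: in Ceph the client (librados) sends a write only to the PG's primary OSD; the primary forwards it to the replica OSDs and acknowledges the client once the copies are safe, so librados does not fan out writes itself. The replica counts in effect can be checked per pool; the pool name below is an example:

```shell
# Number of copies kept for each object in the pool:
ceph osd pool get chunks size

# Minimum number of copies that must be available for I/O to be accepted:
ceph osd pool get chunks min_size
```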

[ceph-users] rados object latency

2015-04-07 Thread tombo
Hi guys, I'm investigating rados object latency (with /usr/bin/time -f"%e" rados -p chunks get $chk /dev/shm/test/test.file). Objects are around 7MB +-1MB in size. The results show that 0.50% of objects are fetched from the cluster in 1-4 seconds and the rest are good, below 1 sec (test is
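The measurement described above can be scripted to quantify the slow tail across a batch of objects. This is a sketch under the same assumptions as the original command (pool named chunks, rados CLI available); the sample size and file paths are examples:

```shell
# Time a batch of gets; /usr/bin/time writes the elapsed seconds to stderr.
# (Any rados error output lands in the same file, so keep an eye on it.)
for chk in $(rados -p chunks ls | head -200); do
  /usr/bin/time -f "%e" rados -p chunks get "$chk" /dev/null 2>> /tmp/lat.txt
done

# Report what fraction of fetches took 1 second or more:
awk '{n++; if ($1 >= 1) slow++} END {printf "%.2f%% >= 1s\n", 100*slow/n}' /tmp/lat.txt
```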

[ceph-users] cephx error - renew key

2015-06-09 Thread tombo
Hello guys, today we had one storage server (19 OSDs) down for 4 hours and now we are observing various problems; when I tried to restart one OSD, I got a cephx-related error: 2015-06-09 21:09:49.983522 7fded00c7700 0 auth: could not find secret_id=6238 2015-06-09 21:09:49.983585 7fded00c7700
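As a first diagnostic for "could not find secret_id" errors: they often indicate the OSD is presenting a stale rotating cephx key, which in turn is commonly caused by clock skew between the OSD host and the monitors. A hedged checklist, assuming NTP is the time source on the hosts:

```shell
# Monitors flag clock skew in the health report:
ceph health detail | grep -i clock

# Verify NTP synchronisation on the affected OSD host:
ntpq -p
```

If the clocks are in sync and the error persists, restarting the OSD after the monitors have settled usually lets it fetch fresh rotating keys.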