[ceph-users] Fwd: how to fix the mds damaged issue

2016-07-03 Thread Lihang
root@BoreNode2:~# ceph -v ceph version 10.2.0 From: lihang 12398 (RD) Sent: 3 July 2016 14:47 To: ceph-users@lists.ceph.com Cc: Ceph Development; 'uker...@gmail.com'; zhengbin 08747 (RD); xusangdi 11976 (RD) Subject: how to fix the mds damaged issue Hi, my ceph cluster mds is damaged and the cluster is
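The preview cuts off before any fix is described; purely as a sketch, these are the kinds of commands typically used on Jewel to inspect a damaged MDS rank (the file system name "cephfs" and rank 0 are assumptions, not values taken from the thread):

    # File system and MDS map state, including which rank is marked damaged
    ceph mds stat
    ceph fs get cephfs

    # Inspect the metadata journal of the damaged rank (rank 0 by default)
    cephfs-journal-tool journal inspect

    # After the underlying damage has been dealt with, clear the flag so a
    # standby MDS can take over the rank again
    ceph mds repaired 0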

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Willi Fehler
Hello David, so in a 3 node Cluster how should I set min_size if I want that 2 nodes could fail? Regards - Willi Am 28.06.16 um 13:07 schrieb David: Hi, This is probably the min_size on your cephfs data and/or metadata pool. I believe the default is 2, if you have less than 2 replicas ava

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Sean Redmond
It would need to be set to 1 On 3 Jul 2016 8:17 a.m., "Willi Fehler" wrote: > Hello David, > > so in a 3 node Cluster how should I set min_size if I want that 2 nodes > could fail? > > Regards - Willi > > Am 28.06.16 um 13:07 schrieb David: > > Hi, > > This is probably the min_size on your cephfs
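For reference, min_size is a per-pool setting. A minimal sketch of checking and lowering it, assuming the cephfs_data pool name that appears later in this thread plus a matching cephfs_metadata pool:

    # Check the current replication settings
    ceph osd pool get cephfs_data size
    ceph osd pool get cephfs_data min_size

    # Allow I/O to continue with a single surviving replica
    ceph osd pool set cephfs_data min_size 1
    ceph osd pool set cephfs_metadata min_size 1

Note that with size 3 and min_size 1 the cluster keeps accepting writes while only one copy exists, which trades safety for availability.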

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Willi Fehler
Hello Sean, I've powered down 2 nodes. So 6 of 9 OSD are down. But my client can't write and read anymore from my Ceph mount. Also 'ceph -s' hangs. pool 1 'cephfs_data' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 300 pgp_num 300 last_change 447 flags hashpspool c

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Tu Holmes
Where are your mon nodes? Were you mixing mon and OSD together? Are 2 of the mon nodes down as well? On Jul 3, 2016 12:53 AM, "Willi Fehler" wrote: > Hello Sean, > > I've powered down 2 nodes. So 6 of 9 OSD are down. But my client can't > write and read anymore from my Ceph mount. Also 'ceph -s

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Willi Fehler
Hello Tu, yes that's correct. The mon nodes run as well on the OSD nodes. So I have 3 nodes in total. OSD, MDS and Mon on each Node. Regards - Willi Am 03.07.16 um 09:56 schrieb Tu Holmes: Where are your mon nodes? Were you mixing mon and OSD together? Are 2 of the mon nodes down as well?

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Sean Redmond
Hi, You will need 2 mons to be online. Thanks On 3 Jul 2016 8:58 a.m., "Willi Fehler" wrote: > Hello Tu, > > yes that's correct. The mon nodes run as well on the OSD nodes. So I have > > 3 nodes in total. OSD, MDS and Mon on each Node. > > Regards - Willi > > Am 03.07.16 um 09:56 schrieb Tu Hol
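A quick way to confirm how many monitors are up and whether they form a quorum; using the short hostname as the mon ID is an assumption:

    # A majority of monitors (2 of 3 here) must be up to form a quorum
    ceph mon stat
    ceph quorum_status --format json-pretty

    # If quorum is lost, the cluster commands above will hang; query a
    # surviving monitor directly over its admin socket instead
    ceph daemon mon.$(hostname -s) mon_status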

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Willi Fehler
Hello Sean, great. Thank you for your feedback. Have a nice sunday. Regards - Willi Am 03.07.16 um 10:00 schrieb Sean Redmond: Hi, You will need 2 mons to be online. Thanks On 3 Jul 2016 8:58 a.m., "Willi Fehler" > wrote: Hello Tu, yes that's cor

Re: [ceph-users] How many nodes/OSD can fail

2016-07-03 Thread Tu Holmes
I am kind of a newbie but I thought you needed 2 mons working at a minimum. You should split those away onto some really budget hardware. //Tu Hello Tu, yes that's correct. The mon nodes run as well on the OSD nodes. So I have 3 nodes in total. OSD, MDS and Mon on each Node. Regards - Willi A

[ceph-users] Ceph Rebalance Issue

2016-07-03 Thread Roozbeh Shafiee
Hi list, A few days ago one of my OSDs failed and I removed it from the cluster, but I have been in HEALTH_WARN ever since. After turning off the OSD, the self-healing system started to rebalance data across the other OSDs. My question is: at the end of rebalancing, the process doesn’t complete and I get this

Re: [ceph-users] Ceph Rebalance Issue

2016-07-03 Thread Wido den Hollander
> On 3 July 2016 at 10:34, Roozbeh Shafiee wrote: > > > Hi list, > > A few days ago one of my OSDs failed and I removed it from the cluster, but I > have been in > HEALTH_WARN ever since. After turning off the OSD, the self-healing system > started > to rebalance data across the other OSDs. > > My questi

Re: [ceph-users] Ceph Rebalance Issue

2016-07-03 Thread Roozbeh Shafiee
Thanks for the quick response, Wido. The "ceph -s" output is pasted here: http://pastie.org/10897747 and this is the output of "ceph health detail": http://pastebin.com/vMeURWC9 Thank you > On Jul 3, 2016, at 1:10 PM, Wido den Hollander wrote: > > >> On 3 July 2016 at 10:34, Roozbeh Shafiee wrote:

Re: [ceph-users] Ceph Rebalance Issue

2016-07-03 Thread Wido den Hollander
> On 3 July 2016 at 10:50, Roozbeh Shafiee wrote: > > > Thanks for the quick response, Wido > > the "ceph -s" output is pasted here: > http://pastie.org/10897747 > > and this is the output of "ceph health detail": > http://pastebin.com/vMeURWC9 > It seems the cluster is still backfilling PGs and
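A minimal sketch of watching backfill/recovery progress without touching pg_num or other pool settings while the cluster is still rebalancing:

    # Overall progress and cluster events
    ceph -s
    ceph -w                  # streams status updates; Ctrl-C to stop

    # PGs that are stuck in recovery or backfill states
    ceph pg dump_stuck unclean
    ceph pg dump pgs_brief | grep -E 'backfill|recover'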

Re: [ceph-users] Ceph Rebalance Issue

2016-07-03 Thread Roozbeh Shafiee
Yes, you’re right, but recovery was at 0 objects/s last night. When I changed pg/pgp_num from 1400 to 2048, rebalancing sped up but the rebalancing percentage dropped back to 53%. I have run into this situation again and again since I removed the failed OSD: whenever I increase pg/pgp_num, rebalancing stopp

Re: [ceph-users] Ceph Rebalance Issue

2016-07-03 Thread Wido den Hollander
> On 3 July 2016 at 11:02, Roozbeh Shafiee wrote: > > > Yes, you’re right, but recovery was at 0 objects/s last night. When I changed > pg/pgp_num from 1400 > to 2048, rebalancing sped up but the rebalancing percentage dropped back to > 53%. > Why did you change that? I would not change that val

Re: [ceph-users] Ceph Rebalance Issue

2016-07-03 Thread Roozbeh Shafiee
Actually I tried everything I could find in the Ceph docs and on the mailing lists, but none of it had any effect. As a last resort I changed pg/pgp_num. Anyway… what is the best way to solve this problem? Thanks > On Jul 3, 2016, at 1:43 PM, Wido den Hollander wrote: > > >> On 3 July 2016

[ceph-users] RADOSGW buckets via NFS?

2016-07-03 Thread Sean Redmond
Hi, I noticed in the jewel release notes: "You can now access radosgw buckets via NFS (experimental)." Are there any docs that explain the configuration of NFS to access RADOSGW buckets? Thanks

[ceph-users] cluster failing to recover

2016-07-03 Thread Matyas Koszik
Hi, I recently upgraded to jewel (10.2.2) and now I'm confronted with rather strange behavior: recovery does not progress the way it should. If I restart the OSDs on a host, it'll get a bit better (or worse), like this: 50 pgs undersized recovery 43775/7057285 objects degraded (0.620%) recov

Re: [ceph-users] cluster failing to recover

2016-07-03 Thread Oliver Dzombic
Hi, please provide: ceph health detail ceph osd tree -- Mit freundlichen Gruessen / Best regards Oliver Dzombic IP-Interactive mailto:i...@ip-interactive.de Anschrift: IP Interactive UG ( haftungsbeschraenkt ) Zum Sonnenberg 1-3 63571 Gelnhausen HRB 93402 beim Amtsgericht Hanau Geschäftsf

Re: [ceph-users] cluster failing to recover

2016-07-03 Thread Oliver Dzombic
Hi, did you already do something (replace drives or change anything)? You have 11 scrub errors and ~11 inconsistent PGs. The inconsistent PGs, for example: pg 4.3a7 is stuck unclean for 629.766502, current state active+recovery_wait+degraded+inconsistent, last acting [10,21] are no
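A rough sketch of how such scrub errors are usually narrowed down and repaired; the PG ID 4.3a7 comes from the message above, the pool name is a placeholder, and a repair should only be issued once it is clear which replica is bad:

    # List inconsistent PGs and the objects behind the scrub errors
    ceph health detail | grep inconsistent
    rados list-inconsistent-pg <pool-name>
    rados list-inconsistent-obj 4.3a7 --format=json-pretty

    # Ask the primary OSD to repair the PG
    ceph pg repair 4.3a7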

Re: [ceph-users] Fwd: Ceph installation and integration with Openstack

2016-07-03 Thread Gaurav Goyal
Dear All, I need your kind help please. I am new and want to understand the Ceph installation concepts for my lab setup. Regards Gaurav Goyal On 02-Jul-2016 7:27 pm, "Gaurav Goyal" wrote: > Dear Ceph Users, > > I am very new to the Ceph product and want to gain some knowledge for my lab > setup.

[ceph-users] Fwd: Ceph installation and integration with Openstack

2016-07-03 Thread Gaurav Goyal
Dear Ceph Users, I am very new to the Ceph product and want to gain some knowledge for my lab setup. The situation is --> I have installed an OpenStack (Liberty) setup for my lab. Host 1 --> Controller + Compute1 Host 2 --> Compute 2 DELL SAN storage is attached to both hosts as [root@OSKVM1 ~]# iscsiad
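The usual pattern for wiring Ceph RBD into Glance and Cinder is sketched below; the pool names, user names and PG counts follow the upstream examples and are assumptions, not values from this setup:

    # Pools for Glance images and Cinder volumes
    ceph osd pool create images 128
    ceph osd pool create volumes 128

    # CephX users for the OpenStack services
    ceph auth get-or-create client.glance mon 'allow r' \
      osd 'allow class-read object_prefix rbd_children, allow rwx pool=images'
    ceph auth get-or-create client.cinder mon 'allow r' \
      osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rx pool=images'

    # cinder.conf then points its backend at the volumes pool, e.g.
    #   volume_driver = cinder.volume.drivers.rbd.RBDDriver
    #   rbd_pool = volumes
    #   rbd_user = cinder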

Re: [ceph-users] RADOSGW buckets via NFS?

2016-07-03 Thread Brad Hubbard
On Sun, Jul 3, 2016 at 9:07 PM, Sean Redmond wrote: > Hi, > > I noticed in the jewel release notes: > > "You can now access radosgw buckets via NFS (experimental)." > > Are there any docs that explain the configuration of NFS to access RADOSGW > buckets? Here's what I found. http://tracker.ceph.
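For anyone searching later: the Jewel feature is implemented through nfs-ganesha with an RGW FSAL. A heavily simplified export sketch follows; the export path, user and keys are placeholders, and the full option set should be taken from the nfs-ganesha and Ceph documentation:

    # Write a minimal /etc/ganesha/ganesha.conf (placeholders, not a tested config)
    cat > /etc/ganesha/ganesha.conf <<'EOF'
    EXPORT {
        Export_ID = 1;
        Path = "/";
        Pseudo = "/rgw";
        Access_Type = RW;
        FSAL {
            Name = RGW;
            User_Id = "nfsuser";
            Access_Key_Id = "ACCESS_KEY";
            Secret_Access_Key = "SECRET_KEY";
        }
    }
    RGW {
        ceph_conf = "/etc/ceph/ceph.conf";
    }
    EOF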

[ceph-users] Active MON aborts on Jewel 10.2.2 with FAILED assert(info.state == MDSMap::STATE_STANDBY

2016-07-03 Thread Bill Sharer
I was working on a rolling upgrade on Gentoo to Jewel 10.2.2 from 10.2.0. However, now I can't get a monitor quorum going again because as soon as I get one, the mon which wins the election blows out with an assertion failure. Here's my status at the moment: kroll1 10.2.2 ceph mon.0 and

[ceph-users] Radosgw performance degradation

2016-07-03 Thread Andrey Komarov
Hi guys. I am currently encountering a strange problem with radosgw. My setup is: 3 mons, 40 OSDs (4 TB HDD + 8 GB SSD journal each) across 4 servers, and 2 RGWs. I have rgw_override_bucket_index_max_shards = 2048 in the config and 100 buckets with ~3M objects each. Objects are relatively small: from 1k to 100k. S
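For context, a sketch of where that option lives and how a bucket's index can be inspected; the section name and bucket name below are placeholders:

    # ceph.conf on the RGW hosts -- the shard count only applies to buckets
    # created after the setting is in place
    #   [client.rgw.gateway1]
    #   rgw_override_bucket_index_max_shards = 2048

    # Object count and index information for an existing bucket
    radosgw-admin bucket stats --bucket=bucket-01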