Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-12 Thread Matyas Koszik
On Sat, 11 Mar 2017, Udo Lembke wrote: > On 11.03.2017 12:21, cephmailingl...@mosibi.nl wrote: > > ... > > > > > > e) find /var/lib/ceph/ ! -uid 64045 -print0|xargs -0 chown ceph:ceph > > ... the 'find' in step e found so many files that xargs (the shell) > > could not handle it (too many a
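
A sketch of one way to sidestep argument-list limits during the Hammer to Jewel ownership change; the -exec ... + form batches the chown calls itself, much like xargs -0, and uid 64045 is the Debian/Ubuntu ceph user as in the quoted step:

    # Batch the chown inside find so no single command line exceeds ARG_MAX
    find /var/lib/ceph/ ! -uid 64045 -exec chown ceph:ceph {} +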

Re: [ceph-users] pgs stuck unclean

2017-02-17 Thread Matyas Koszik
The PG exists here on this replica. > > I'd guess you only have 3 hosts and are trying to place all your > replicas on independent boxes. Bobtail tunables have trouble with that > and you're going to need to pay the cost of moving to more modern > ones. > -Greg > > On

Re: [ceph-users] pgs stuck unclean

2017-02-17 Thread Matyas Koszik
tions/crush-map/#editing-a-crush-map > > - https://github.com/ceph/ceph/blob/master/doc/man/8/crushtool.rst > > On Sat, Feb 18, 2017 at 5:25 AM, Matyas Koszik wrote: > > > > I have size=2 and 3 independent nodes. I'm happy to try firefly tunables, > > but
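
A hedged sketch of the kind of commands the linked documentation covers for inspecting the CRUSH map and moving to newer tunables (the firefly profile is the one mentioned above; switching profiles triggers data movement):

    # Dump and decompile the current CRUSH map for inspection
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # Switch tunables to a newer profile (expect rebalancing)
    ceph osd crush tunables firefly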

Re: [ceph-users] pgs stuck unclean

2017-02-17 Thread Matyas Koszik
trouble with that > and you're going to need to pay the cost of moving to more modern > ones. > -Greg > > On Fri, Feb 17, 2017 at 5:30 AM, Matyas Koszik wrote: > > > > > > I'm not sure which variable I should be looking at exactly, but after > > readi

Re: [ceph-users] pgs stuck unclean

2017-02-17 Thread Matyas Koszik
I'd guess you only have 3 hosts and are trying to place all your > > replicas on independent boxes. Bobtail tunables have trouble with that > > and you're going to need to pay the cost of moving to more modern > > ones. > > -Greg > > > > On Fri, Feb 17

Re: [ceph-users] pgs stuck unclean

2017-02-17 Thread Matyas Koszik
/atw.hu/~koszik/ceph/ceph-osd.26.log.6245 Matyas On Fri, 17 Feb 2017, Tomasz Kuzemko wrote: > If the PG cannot be queried I would bet on OSD message throttler. Check with > "ceph --admin-daemon PATH_TO_ADMIN_SOCK perf dump" on each OSD which is > holding this PG if message th
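
An illustration of the suggested throttler check; the admin socket path and OSD id below are assumptions, not taken from the thread:

    # Inspect the throttle counters of an OSD holding the stuck PG; a throttle
    # whose val sits at max (or whose get_or_fail_fail keeps growing) suggests
    # the message throttler is saturated.
    ceph --admin-daemon /var/run/ceph/ceph-osd.26.asok perf dump \
        | python -m json.tool | grep -A 12 '"throttle-'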

[ceph-users] pgs stuck unclean

2017-02-16 Thread Matyas Koszik
Hi, It seems that my ceph cluster is in an erroneous state which I cannot currently see how to get out of. The status is the following: health HEALTH_WARN 25 pgs degraded 1 pgs stale 26 pgs stuck unclean 25 pgs undersized recovery 23578/9450442 objects degr
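
A few standard commands for narrowing down stuck PGs, added here as a generic sketch (the PG id is a placeholder):

    ceph health detail            # lists the individual degraded/stuck PGs
    ceph pg dump_stuck unclean    # shows which OSDs each stuck PG maps to
    ceph pg 3.1f query            # peering state of one PG (placeholder id)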

Re: [ceph-users] Monitor question

2016-07-07 Thread Matyas Koszik
Hi, That error message is normal; it just says your monitor is down (which it is). If you have added the second monitor to your ceph.conf, it will try contacting that one, and if it's up and reachable, this will succeed, so after that scary error message you should see the normal reply as well. T
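
A hedged example of verifying that the second monitor is reachable on its own; the address is a placeholder:

    # Ask a specific monitor directly; if this answers, clients will fail over
    # to it once the first listed monitor is down.
    ceph -m 192.0.2.11:6789 -s

    # Confirm both monitors are listed on the client side
    grep -E 'mon_host|mon host' /etc/ceph/ceph.conf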

Re: [ceph-users] layer3 network

2016-07-07 Thread Matyas Koszik
environment. > > On 7 July 2016 at 11:36, Nick Fisk wrote: > > > > -Original Message- > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf > > Of Matyas Koszik > > > Sent: 07 July 2016 11:26 > > > To: ceph-users@l

[ceph-users] layer3 network

2016-07-07 Thread Matyas Koszik
Hi, My setup uses a layer3 network where each node has two connections (/31s) and a loopback address; redundancy is provided via OSPF. In this setup it is important to use the loopback address as the source for outgoing connections, since the interface addresses are not protected from
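
Not from the thread, but one common way to pin the source address at the OS level in an OSPF/loopback setup like this is a per-route src hint; all addresses below are hypothetical:

    # Loopback address used as the stable node identity
    ip addr add 192.0.2.5/32 dev lo

    # Prefer the loopback as source for traffic toward the cluster prefix
    # (with OSPF the route normally comes from the routing daemon, which has
    # its own knob for this)
    ip route replace 198.51.100.0/24 via 203.0.113.1 src 192.0.2.5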

Re: [ceph-users] cluster failing to recover

2016-07-05 Thread Matyas Koszik
e. It was like this: http://pastebin.com/UjSjVsJ0 Matyas On Tue, 5 Jul 2016, Sean Redmond wrote: > Hi, > > What happened to the missing 2 OSDs? > > 53 osds: 51 up, 51 in > > Thanks > > On Tue, Jul 5, 2016 at 4:04 PM, Matyas Koszik wrote: > > > >
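
For context, a generic way to see which OSDs the "53 osds: 51 up, 51 in" count refers to (not from the thread):

    ceph osd stat                    # the up/in counts quoted above
    ceph osd tree | grep -w down     # which OSDs are down, and on which host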

Re: [ceph-users] cluster failing to recover

2016-07-05 Thread Matyas Koszik
On 03.07.2016 at 23:59, Matyas Koszik wrote: > > > > Hi, > > > > I've continued restart

[ceph-users] cluster failing to recover

2016-07-03 Thread Matyas Koszik
Hi, I recently upgraded to jewel (10.2.2) and now I'm confronted with a rather strange behavior: recovery does not progress in the way it should. If I restart the osds on a host, it'll get a bit better (or worse), like this: 50 pgs undersized recovery 43775/7057285 objects degraded (0.620%) recov
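
A hedged set of checks for whether recovery is actually making progress after such an upgrade (nothing below comes from the original post):

    # Is the degraded object count shrinking over time?
    watch -n 10 "ceph -s | grep -E 'degraded|unclean|recover'"

    # Rule out flags that stall recovery (norecover, nobackfill, noout, ...)
    ceph osd dump | grep flags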