Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )

2018-01-10 Thread Brent Kennedy
-Original Message- From: Gregory Farnum [mailto:gfar...@redhat.com] Sent: Wednesday, January 10, 2018 3:15 PM To: Brent Kennedy Cc: Janne Johansson ; Ceph Users Subject: Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly ) On Wed, Jan 10, 2018 at 11:14 AM

Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )

2018-01-10 Thread Gregory Farnum
r was a hot mess at that point, it's possible it was > borked and therefore the pg is borked. I am trying to avoid deleting the > data as there is data in the OSDs that are online. > > > > -Brent > > > > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.

Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )

2018-01-10 Thread Brent Kennedy
nnedy Sent: Wednesday, January 10, 2018 12:20 PM To: 'Janne Johansson' Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly ) I changed “mon max pg per osd” to 5000 because when I changed it to zero, which was supposed to disa

Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )

2018-01-10 Thread Brent Kennedy
:00 AM To: Brent Kennedy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly ) 2018-01-10 8:51 GMT+01:00 Brent Kennedy <mailto:bkenn...@cfl.rr.com>: As per a previous thread, my pgs are set too high. I tried adj

Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )

2018-01-10 Thread Janne Johansson
2018-01-10 8:51 GMT+01:00 Brent Kennedy : > As per a previous thread, my pgs are set too high. I tried adjusting the > “mon max pg per osd” up higher and higher, which did clear the > error (restarted monitors and managers each time), but it seems that data > simply won't move around the cluster.

Re: [ceph-users] Incomplete pgs on ceph which is partly on Bluestore

2017-11-14 Thread Ольга Ухина
Sorry, I forgot to mention: the ceph version is Luminous 12.2.1. Best regards, Ухина Ольга. Mobile: 8(905)-566-46-62 2017-11-14 15:30 GMT+03:00 Ольга Ухина : > Hi! I've a ceph installation where one host with OSDs is on BlueStore and > three others are on FileStore, it worked till deleting this f

Re: [ceph-users] Incomplete PGs, how do I get them back without data loss?

2016-05-12 Thread george.vasilakakos
9:26 To: Vasilakakos, George (STFC,RAL,SC) Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Incomplete PGs, how do I get them back without data loss? On Wed, May 11, 2016 at 6:53 PM, wrote: > Hey Dan, > > This is on Hammer 0.94.5. osd.52 was always on a problematic machine and when

Re: [ceph-users] Incomplete PGs, how do I get them back without data loss?

2016-05-12 Thread Dan van der Ster
:28 > To: Vasilakakos, George (STFC,RAL,SC) > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Incomplete PGs, how do I get them back without data > loss? > > Hi George, > > Which version of Ceph is this? > I've never had incomplete pgs stuck like this b

Re: [ceph-users] Incomplete PGs, how do I get them back without data loss?

2016-05-11 Thread george.vasilakakos
016 17:28 To: Vasilakakos, George (STFC,RAL,SC) Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Incomplete PGs, how do I get them back without data loss? Hi George, Which version of Ceph is this? I've never had incomplete pgs stuck like this before. AFAIK it means that osd.52 would need t

Re: [ceph-users] Incomplete PGs, how do I get them back without data loss?

2016-05-11 Thread Dan van der Ster
Hi George, Which version of Ceph is this? I've never had incomplete pgs stuck like this before. AFAIK it means that osd.52 would need to be brought up before you can restore those PGs. Perhaps you'll need ceph-objectstore-tool to help dump osd.52 and bring up its data elsewhere. A quick check on t

Re: [ceph-users] Incomplete PGs

2014-12-04 Thread Aaron Bassett
I have a small update to this: After an even closer reading of an offending pg's query I noticed the following: "peer": "4", "pgid": "19.6e", "last_update": "51072'48910307", "last_complete": "51072'48910307", "log_tail": "50495'48906592", The log tail seems to have lagged behind the last_updat

Re: [ceph-users] 'incomplete' PGs: what does it mean?

2014-09-07 Thread John Morris
Thanks Greg, that helped get the last stuck PGs back online, and everything looks normal again. Here's the promised post-mortem. It might contain only a little of value to developers, but certainly a bunch of face-palming for readers (and a big red face for me). This mess started during a r

Re: [ceph-users] 'incomplete' PGs: what does it mean?

2014-08-29 Thread Gregory Farnum
Hmm, so you've got PGs which are out-of-date on disk (by virtue of being an older snapshot?) but still have records of them being newer in the OSD journal? That's a new failure mode for me and I don't think we have any tools designed for solving it. If you can *back up the disk* before doing this, I t

Re: [ceph-users] 'incomplete' PGs: what does it mean?

2014-08-28 Thread John Morris
Greg, thanks for the tips in both this and the BTRFS_IOC_SNAP_CREATE thread. They were enough to get PGs 'incomplete' due to 'not enough OSDs hosting' resolved by rolling back to a btrfs snapshot. I promise to write a full post-mortem (embarrassing as it will be) after the cluster is fully health

Re: [ceph-users] 'incomplete' PGs: what does it mean?

2014-08-27 Thread Gregory Farnum
On Tue, Aug 26, 2014 at 10:46 PM, John Morris wrote: > In the docs [1], 'incomplete' is defined thusly: > > Ceph detects that a placement group is missing a necessary period of > history from its log. If you see this state, report a bug, and try > to start any failed OSDs that may contain th