-----Original Message-----
From: Gregory Farnum [mailto:gfar...@redhat.com]
Sent: Wednesday, January 10, 2018 3:15 PM
To: Brent Kennedy
Cc: Janne Johansson; Ceph Users
Subject: Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )

On Wed, Jan 10, 2018 at 11:14 AM, Brent Kennedy wrote:
> …r was a hot mess at that point; it's possible it was
> borked and therefore the pg is borked. I am trying to avoid deleting the
> data, as there is data in the OSDs that are online.
> -Brent
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Brent Kennedy
Sent: Wednesday, January 10, 2018 12:20 PM
To: 'Janne Johansson'
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )
I changed "mon max pg per osd" to 5000 because when I changed it to zero, which
was supposed to disable the limit, …
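(For reference, this option is typically set in ceph.conf and picked up once the mons and mgrs are restarted; the snippet below is only an illustrative sketch of that, reusing the 5000 value mentioned above.)

    # /etc/ceph/ceph.conf -- illustrative snippet only
    [global]
        mon max pg per osd = 5000    # 0 was expected to disable the check, per the report above

    # then restart monitors and managers on each mon/mgr host, e.g.:
    #   systemctl restart ceph-mon.target
    #   systemctl restart ceph-mgr.target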
Sent: …:00 AM
To: Brent Kennedy
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )
2018-01-10 8:51 GMT+01:00 Brent Kennedy <bkenn...@cfl.rr.com>:
As per a previous thread, my pgs are set too high. I tried adjusting the
"mon max pg per osd" up higher and higher, which did clear the
error (restarted monitors and managers each time), but it seems that data
simply won't move around the cluster. If I stop the primary OSD of an
incomplete pg, the…
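As a generic starting point (not taken from the original thread), the usual way to see which pgs are stuck and whether data is actually moving is along these lines:

    ceph health detail              # names the incomplete / stuck pgs
    ceph pg dump_stuck inactive     # stuck pgs with their up/acting OSD sets
    ceph pg dump_stuck unclean
    ceph osd df tree                # per-OSD pg counts, to check whether anything is moving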
Sorry, I forgot to mention: the Ceph version is Luminous 12.2.1.
Best regards,
Olga Ukhina
Mobile: 8(905)-566-46-62
Hi! I have a Ceph installation where one host's OSDs are on BlueStore and
three others are on FileStore. It worked until this first host with all the
BlueStore OSDs was deleted, and then these OSDs came back completely clean.
Ceph remapped, and I ended up with 19 pgs inactive and 19 incomplete. Primary
OSDs f…
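When a pg sits in 'incomplete' like this, its query output normally lists the down OSDs it still wants to probe; a rough sketch (the pg id is a placeholder):

    ceph pg <pgid> query > pg.json
    # in the output, look under "recovery_state" for entries such as
    # "down_osds_we_would_probe" and "peering_blocked_by" -- they name the
    # OSDs the pg is waiting for before it can go active.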
Sent: … 9:26
To: Vasilakakos, George (STFC,RAL,SC)
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Incomplete PGs, how do I get them back without data loss?
On Wed, May 11, 2016 at 6:53 PM, George Vasilakakos wrote:
> Hey Dan,
>
> This is on Hammer 0.94.5. osd.52 was always on a problematic machine and when…
Sent: … 2016 17:28
To: Vasilakakos, George (STFC,RAL,SC)
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Incomplete PGs, how do I get them back without data loss?
Hi George,
Which version of Ceph is this?
I've never had incomplete pgs stuck like this before. AFAIK it means
that osd.52 would need to be brought up before you can restore those
PGs.
Perhaps you'll need ceph-objectstore-tool to help dump osd.52 and
bring up its data elsewhere. A quick check on t…
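For context, dumping a pg off a dead OSD and bringing its data up elsewhere with ceph-objectstore-tool usually looks roughly like this on a FileStore OSD of that era (paths and the target osd.7 are placeholders, pg 19.6e is borrowed from elsewhere in this thread, and both OSDs must be stopped while the tool runs):

    # on the failed host, with osd.52 stopped: export one pg
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-52 \
        --journal-path /var/lib/ceph/osd/ceph-52/journal \
        --op export --pgid 19.6e --file /tmp/pg.19.6e.export

    # on a healthy host, with the target OSD stopped: import it
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
        --journal-path /var/lib/ceph/osd/ceph-7/journal \
        --op import --file /tmp/pg.19.6e.export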
Hi folks,
I've got a serious issue with a Ceph cluster that's used for RBD.
There are 4 PGs stuck in an incomplete state and I'm trying to repair this
problem to no avail.
Here's ceph status:
    health HEALTH_WARN
           4 pgs incomplete
           4 pgs stuck inactive
           4 pgs stuck …
I have a small update to this:
After an even closer reading of an offending pg's query I noticed the following:
    "peer": "4",
    "pgid": "19.6e",
    "last_update": "51072'48910307",
    "last_complete": "51072'48910307",
    "log_tail": "50495'48906592",
The log tail seems to have lagged behind the last_update…
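To compare these fields between the primary and its peers, something like the following works against the pg query JSON (treat the jq filters as a sketch; the exact field layout can vary slightly between releases):

    ceph pg 19.6e query > pg.19.6e.json
    # the primary's own info block
    jq '.info | {last_update, last_complete, log_tail}' pg.19.6e.json
    # the same fields for each peer
    jq '.peer_info[] | {peer, last_update, last_complete, log_tail}' pg.19.6e.json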
Hi all, I have a problem with some incomplete pgs. Here's the backstory: I had
a pool that I had accidentally left with a size of 2. On one of the osd nodes,
the system hdd started to fail and I attempted to rescue it by sacrificing one
of my osd nodes. That went ok and I was able to bring the node…
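As an aside, the size=2 pool is usually the first thing to put right once the cluster is stable again; a hedged sketch, with the pool name 'rbd' standing in for the real one:

    ceph osd pool get rbd size          # confirm the current replica count
    ceph osd pool set rbd size 3        # backfills a third copy
    ceph osd pool set rbd min_size 2    # block I/O once only one copy remains up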
Thanks Greg, that helped get the last stuck PGs back online, and
everything looks normal again.
Here's the promised post-mortem. It might contain only a little of
value to developers, but certainly a bunch of face-palming for readers
(and a big red face for me).
This mess started during a r…
Hmm, so you've got PGs which are out-of-date on disk (by virtue of being an
older snapshot?) but still have records of them being newer in the OSD
journal?
That's a new failure mode for me and I don't think we have any tools
designed for solving it. If you can *back up the disk* before doing this, I
t…
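The "back up the disk" step is normally a raw image taken with the OSD stopped; a rough sketch with placeholder device and destination paths:

    # plain dd, tolerating read errors
    dd if=/dev/sdX of=/backup/osd-data.img bs=4M conv=sync,noerror status=progress

    # or ddrescue, which keeps a map of bad sectors and can retry them
    ddrescue /dev/sdX /backup/osd-data.img /backup/osd-data.map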
Greg, thanks for the tips in both this and the BTRFS_IOC_SNAP_CREATE
thread. They were enough to get PGs 'incomplete' due to 'not enough
OSDs hosting' resolved by rolling back to a btrfs snapshot. I promise
to write a full post-mortem (embarrassing as it will be) after the
cluster is fully healthy.
On Tue, Aug 26, 2014 at 10:46 PM, John Morris wrote:
In the docs [1], 'incomplete' is defined thusly:

    Ceph detects that a placement group is missing a necessary period of
    history from its log. If you see this state, report a bug, and try
    to start any failed OSDs that may contain the needed information.

However, during an extensive review of l…
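In practice, the "try to start any failed OSDs" advice from that definition comes down to something like the following (osd.52 is borrowed from earlier in this thread purely as an example; pre-systemd clusters of that era would use 'service ceph start osd.N' instead of systemctl):

    ceph osd tree                  # spot OSDs marked down
    ceph pg <pgid> query           # see which OSDs the incomplete pg wants to probe
    systemctl start ceph-osd@52    # bring the failed OSD back up
    ceph -w                        # watch peering progress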