Hi folks,
I've got a serious issue with a Ceph cluster that's used for RBD.
There are 4 PGs stuck in an incomplete state and I've been trying to repair
them, to no avail.
Here's ceph status:
health HEALTH_WARN
4 pgs incomplete
4 pgs stuck inactive
4 pgs stuc
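For anyone following along, these are roughly the commands I've been using to
look at the stuck PGs (the PG ID is a placeholder, not one of the actual four):
$ ceph health detail | grep incomplete
$ ceph pg dump_stuck inactive
$ ceph pg <pgid> query > /tmp/pg_query.json   # the recovery_state section shows why peering can't finish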
Hey Dan,
This is on Hammer 0.94.5. osd.52 was always on a problematic machine and, when
this happened, had less data on its local disk than the other OSDs. I've tried
adapting that blog post's solution to this situation, to no avail.
I've tried things like looking at all probing OSDs in the query
What exactly do you mean by log? As in a journal of the actions taken, or
logging done by a daemon?
I'm making the same guess but I'm not sure what else I can try at this point.
The PG I've been working on actively reports it needs to probe 4 OSDs (the new
set and the old primary) which are all u
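For reference, I've been pulling the probe-related fields out of the query
output with something like this (PG ID is a placeholder):
$ ceph pg <pgid> query > /tmp/q.json
$ grep -A8 '"probing_osds"' /tmp/q.json
$ grep -A8 '"down_osds_we_would_probe"' /tmp/q.json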
Hey Ceph folks,
I was wondering what the current status/roadmap/intentions etc. are on the
possibility of providing a way of transitioning a cluster from IPv4 to IPv6 in
the future.
My current understanding is that this is not possible at the moment and that one
should deploy initially with the v
I don't think you can do that; it would require running a mixed cluster which,
going by the docs, doesn't seem to be supported.
From: Jake Young [jak3...@gmail.com]
Sent: 27 June 2017 22:42
To: Wido den Hollander; ceph-users@lists.ceph.com; Vasilakakos, George
(ST
Hey Wido,
Thanks for your suggestion. It sounds like the process might be feasible, but
I'd be looking for an "official" procedure for a production cluster: something
that's documented at ceph.com/docs, tested and "endorsed", if you will, by the
Ceph team.
We could try this on a pre-prod environmen
> I don't think either. I don't think there is another way than just 'hacky'
> changing the MONMaps. There have been talks of being able to make Ceph
> dual-stack, but I don't think there is any code in the source right now.
Yeah, that's what I'd like to know. What do the Ceph team think of prov
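For anyone curious, my understanding of the 'hacky' monmap route is roughly the
following, adapted from the 'messy way' of changing monitor addresses in the
docs (untested by me; the monitor name and address are placeholders):
$ ceph mon getmap -o /tmp/monmap
$ monmaptool --print /tmp/monmap
$ monmaptool --rm mon-a /tmp/monmap
$ monmaptool --add mon-a [2001:db8::1]:6789 /tmp/monmap
# stop the monitor(s), inject the edited map into each, then start them again:
$ ceph-mon -i mon-a --inject-monmap /tmp/monmap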
Good to know. Frankly, the RGW isn’t my major concern at the moment; it seems
to be able to handle things well enough.
It’s the RBD/CephFS side of things for one cluster where we will eventually
need to support IPv6 clients but will not necessarily be able to switch
everyone to IPv6 in one go.
Hey folks,
We have a cluster that's currently backfilling after an increase in PG counts.
We have tuned recovery and backfill way down as a "precaution" and would now
like to tune them back up to a good balance between recovery and client I/O.
At the moment we're in the process of bumping up PG nu
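The knobs in question are the usual ones, adjusted at runtime with something
like the following (values are examples, not what we're running):
$ ceph tell osd.* injectargs '--osd-max-backfills 2 --osd-recovery-max-active 3'
$ ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery_max_active'   # verify on one OSD, run on its host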
Thanks for your response David.
What you've described matches what I've been thinking too. We currently have
1401 OSDs in the cluster, and this output is from the tail end of the backfill
for a +64 PG increase on the biggest pool.
The problem is we see this cluster do at most 20 backfills a
@Christian: I think the slow tail end is just the fact that there is contention
for the same OSDs.
@David: Yes that's what I did, used shell/awk/python to grab and compare the
set of OSDs locked for backfilling versus the ones waiting.
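Roughly, the comparison looked like this (a sketch, assuming the plain
pgs_brief layout where the acting set is the fifth column):
$ ceph pg dump pgs_brief 2>/dev/null | awk '$2 ~ /backfilling/ {print $5}' | tr -d '[]' | tr ',' '\n' | sort -u > /tmp/busy
$ ceph pg dump pgs_brief 2>/dev/null | awk '$2 ~ /backfill_wait/ {print $5}' | tr -d '[]' | tr ',' '\n' | sort -u > /tmp/waiting
$ comm -12 /tmp/busy /tmp/waiting | wc -l   # OSDs the waiting PGs need that are already locked by active backfills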
From: Christian Balz
Hey folks,
I'm staring at a problem that I have found no solution for and which is causing
major issues.
We've had a PG go down with the first 3 OSDs all crashing and coming back only
to crash again with the following error in their logs:
-1> 2017-08-22 17:27:50.961633 7f4af4057700 -1 osd.
No, nothing like that.
The cluster is in the process of having more OSDs added and, while that was
ongoing, one was removed because the underlying disk was throwing up a bunch of
read errors.
Shortly after, the first three OSDs in this PG started crashing with error
messages about corrupted EC
Hi ceph-users,
We have a Ceph cluster (running Kraken) that is exhibiting some odd behaviour.
A couple of weeks ago, the LevelDBs on some of our OSDs started growing large
(now at around 20G in size).
The one thing they have in common is that the 11 disks with inflating LevelDBs
are all in the set for one PG
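For the sizes quoted above, we're simply measuring the omap directory on each
FileStore OSD, something like:
$ du -sh /var/lib/ceph/osd/ceph-*/current/omap | sort -h | tail -5   # per host, largest LevelDBs last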
From: Gregory Farnum [gfar...@redhat.com]
Sent: 06 December 2017 22:50
To: David Turner
Cc: Vasilakakos, George (STFC,RAL,SC); ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Sudden omap growth on some OSDs
On Wed, Dec 6, 2017 at 2:35 PM David Turner
mailto:d
From: Gregory Farnum [gfar...@redhat.com]
Sent: 07 December 2017 21:57
To: Vasilakakos, George (STFC,RAL,SC)
Cc: drakonst...@gmail.com; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Sudden omap growth on some OSDs
On Thu, Dec 7, 2017 at 4:41 AM
mailto:geo
On 11 Dec 2017, at 18:24, Gregory Farnum <gfar...@redhat.com> wrote:
Hmm, this does all sound odd. Have you tried just restarting the primary OSD
yet? That frequently resolves transient oddities like this.
If not, I'll go poke at the kraken source and one of the developers more
familiar
From: Gregory Farnum
Date: Tuesday, 12 December 2017 at 19:24
To: "Vasilakakos, George (STFC,RAL,SC)"
Cc: "ceph-users@lists.ceph.com"
Subject: Re: [ceph-users] Sudden omap growth on some OSDs
On Tue, Dec 12, 2017 at 3:16 AM <george.vasilaka...@stfc.ac.uk> wrote:
On 11 Dec 2017, at 18:2
Hi Greg,
I have re-introduced the OSD that was taken out (the one that used to be a
primary). I have kept debug 20 logs from both the re-introduced primary and the
outgoing primary. I have used ceph-post-file to upload these, tag:
5b305f94-83e2-469c-a301-7299d2279d94
Hope this helps, let me kn
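For completeness, the log gathering was along these lines (OSD ID and debug
levels are examples):
$ ceph tell osd.123 injectargs '--debug-osd 20 --debug-ms 1'
# ...let it capture the behaviour, drop the levels back down, then upload:
$ ceph-post-file /var/log/ceph/ceph-osd.123.log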
Hi Ceph folks,
I’ve just posted a bug report http://tracker.ceph.com/issues/18508
I have a cluster (Jewel 10.2.3, SL7) that has trouble creating PGs in EC pools.
Essentially, I’ll get a lot of CRUSH_ITEM_NONE (2147483647) in there and PGs
will stay in peering states. This sometimes affects oth
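For what it's worth, one sanity check here is whether the CRUSH rule can
actually find enough OSDs for the EC width; 2147483647 is ITEM_NONE, i.e. CRUSH
couldn't fill that slot. A sketch (rule ID and width are placeholders):
$ ceph osd getcrushmap -o /tmp/crushmap
$ crushtool -i /tmp/crushmap --test --rule 1 --num-rep 11 --show-bad-mappings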
Hi Ceph folks,
I have a cluster running Jewel 10.2.5 using a mix of EC and replicated pools.
After rebooting a host last night, one PG refuses to complete peering:
pg 1.323 is stuck inactive for 73352.498493, current state peering, last acting
[595,1391,240,127,937,362,267,320,7,634,716]
Restartin
Hi Corentin,
I've tried that; the primary hangs when trying to injectargs, so I set the
option in the config file and restarted all OSDs in the PG. It came up with:
pg 1.323 is remapped+peering, acting
[595,1391,2147483647,127,937,362,267,320,7,634,716]
Still can't query the PG, no error messag
Hi Greg,
> Yes, "bad crc" indicates that the checksums on an incoming message did
> not match what was provided — ie, the message got corrupted. You
> shouldn't try and fix that by playing around with the peering settings
> as it's not a peering bug.
> Unless there's a bug in the messaging layer c
Hey Greg,
Thanks for your quick responses. I have to leave the office now but I'll look
deeper into it tomorrow to try and understand what's the cause of this. I'll
try to find other peerings between these two hosts and check those OSDs' logs
for potential anomalies. I'll also have a look at an
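The checks I have in mind are along these lines (interface name and host are
placeholders):
$ grep -c 'bad crc' /var/log/ceph/ceph-osd.595.log   # and the same on osd.7's host
$ ethtool -S eth0 | grep -iE 'err|drop|crc'          # look for climbing counters on both hosts
$ ping -M do -s 8972 other-host                      # MTU sanity check if jumbo frames are in use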
OK, I've had a look.
Haven't been able to take a proper look at the network yet but here's what I've
gathered on other fronts so far:
* Marking either osd.595 or osd.7 out results in this:
$ ceph health detail | grep -v stuck | grep 1.323
pg 1.323 is remapped+peering, acting
[2147483647,1391,2
Hi Brad,
I'll be doing so later in the day.
Thanks,
George
From: Brad Hubbard [bhubb...@redhat.com]
Sent: 13 February 2017 22:03
To: Vasilakakos, George (STFC,RAL,SC); Ceph Users
Subject: Re: [ceph-users] PG stuck peering after host reboot
I'd suggest cr
Hi folks,
I have just made a tracker for this issue: http://tracker.ceph.com/issues/18960
I used ceph-post-file to upload some logs from the primary OSD for the troubled
PG.
Any help would be appreciated.
If we can't get it to peer, we'd like to at least get it unstuck, even if it
means data l
Hi Wido,
In an effort to get the cluster to complete peering that PG (as we need to be
able to use our pool) we have removed osd.595 from the CRUSH map to allow a new
mapping to occur.
When I left the office yesterday osd.307 had replaced osd.595 in the up set but
the acting set had CRUSH_ITEM
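For the record, the removal itself was just the standard CRUSH removal, e.g.:
$ ceph osd crush remove osd.595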
On 17/02/2017, 12:00, "Wido den Hollander" wrote:
>
>> Op 17 februari 2017 om 11:09 schreef george.vasilaka...@stfc.ac.uk:
>>
>>
>> Hi Wido,
>>
>> In an effort to get the cluster to complete peering that PG (as we need to
>> be able to use our pool) we have removed osd.595 from the CRUSH ma
Hi Wido,
Just to make sure I have everything straight,
> If the PG still doesn't recover do the same on osd.307 as I think that 'ceph
> pg X query' still hangs?
> The info from ceph-objectstore-tool might shed some more light on this PG.
You mean run the objectstore command on 307, not remove
> Can you for the sake of redundancy post your sequence of commands you
> executed and their output?
[root@ceph-sn852 ~]# systemctl stop ceph-osd@307
[root@ceph-sn852 ~]# ceph-objectstore-tool --data-path
/var/lib/ceph/osd/ceph-307 --op info --pgid 1.323
PG '1.323' not found
[root@ceph-sn852 ~]#
I have noticed something odd with the ceph-objectstore-tool command:
It always reports PG X not found, even on healthy OSDs/PGs. The 'list' op works
on both healthy and unhealthy PGs.
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of
george.vasil
Brad Hubbard pointed out on the bug tracker
(http://tracker.ceph.com/issues/18960) that, for EC, we need to add the shard
suffix to the PGID parameter in the command, e.g. --pgid 1.323s0
The command now works and produces the same output as PG query.
To avoid spamming the list I've put the outpu
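In other words, the working invocation on osd.307 was along the lines of (the
shard suffix depends on that OSD's position in the set):
[root@ceph-sn852 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 --op info --pgid 1.323s0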
So what I see there is this for osd.307:
"empty": 1,
"dne": 0,
"incomplete": 0,
"last_epoch_started": 0,
"hit_set_history": {
"current_last_update": "0'0",
"history": []
}
}
last_epoch_started is 0 and empty is 1. The other OSDs are reporting
last_epoch_st
Since we need this pool to work again, we decided to take the data loss and try
to move on.
So far, no luck. We tried a force create but, as expected, with a PG that is
not peering this did absolutely nothing.
We also tried rm-past-intervals and remove from ceph-objectstore-tool and
manually de
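For reference, the sequence we attempted was along these lines (run per OSD in
the set with its own shard suffix; shown here for osd.307, and definitely not a
recommendation):
$ ceph pg force_create_pg 1.323
$ systemctl stop ceph-osd@307
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 --op rm-past-intervals --pgid 1.323s0
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 --op remove --pgid 1.323s0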
Hello Ceph folk,
We have a Ceph cluster (info at bottom) with some odd omap directory sizes in
our OSDs.
We're looking at 1439 OSDs where the most common omap sizes are 15-40MB.
However, a quick sampling reveals some outliers: looking at around the 100
largest omaps, one can see sizes go to a few
> Your RGW buckets, how many objects in them, and do they have the index
> sharded?
> I know we have some very large & old buckets (10M+ RGW objects in a
> single bucket), with correspondingly large OMAPs wherever that bucket
> index is living (sufficiently large that trying to list the entire thin
Hi Wido,
I see your point. I would expect OMAPs to grow with the number of objects but
multiple OSDs getting to multiple tens of GBs for their omaps seems excessive.
I find it difficult to believe that not sharding the index for a bucket of 500k
objects in RGW causes the 10 largest OSD omaps to
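For context, checking a bucket's object count and index sharding looks
something like this (bucket name and instance ID are placeholders):
$ radosgw-admin bucket stats --bucket=big-bucket                                         # num_objects under "usage"
$ radosgw-admin metadata get bucket.instance:big-bucket:<instance-id> | grep num_shards  # 0 means an unsharded index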
Hi Greg,
> This does sound weird, but I also notice that in your earlier email you
> seemed to have only ~5k PGs across ~1400 OSDs, which is a pretty
> low number. You may just have a truly horrible PG balance; can you share
> more details (eg ceph osd df)?
Our distribution is pretty bad, we're
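For a quick look at the spread (assuming the PGS count is the last column of
ceph osd df):
$ ceph osd df | awk '$1 ~ /^[0-9]+$/ {print $NF}' | sort -n | sed -n '1p;$p'   # min and max PGs per OSD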
For future reference,
You can reset your keyring's permissions using the keyring located on the
monitors at /var/lib/ceph/mon/your-mon/keyring. Pass the -k option to the ceph
command with the full path to that keyring, and you can correct this without
having to restart the cluster a couple of times.
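A sketch of what that looks like (the mon directory and caps are examples; in
my experience you also need to pass -n mon. so the key in that file is actually
used):
$ ceph -n mon. -k /var/lib/ceph/mon/your-mon/keyring auth caps client.admin mon 'allow *' osd 'allow *' mds 'allow *'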