On 2024/11/17 15:20, Gregory Orange wrote:
On 17/11/24 19:44, Roland Giesler wrote:
I cannot see any option that allows me to disable mclock...
It's not so much disabling mclock as changing the op queue scheduler to
use wpq instead of it.
https://docs.ceph.com/en/reef/rados/configuration/osd-c
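(For anyone following along, a minimal sketch of that switch, assuming a Reef
cluster: the option is osd_op_queue and the OSDs need a restart for it to
take effect.)
   # switch the op queue scheduler from mclock_scheduler to wpq for all OSDs
   ceph config set osd osd_op_queue wpq
   # restart the OSDs, e.g. systemctl restart ceph-osd.target, one host at a time
   # then verify what a given OSD is actually running with
   ceph config show osd.0 osd_op_queue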
Glad you’re sorted out. I had a feeling it was a function of not being able to
satisfy pool / rule constraints.
> On Nov 18, 2024, at 1:58 AM, Roland Giesler wrote:
>
> On 2024/11/17 18:12, Anthony D'Atri wrote:
>> I see 5 OSDs with 0 CRUSH weight, is that intentional?
>
> Yes, I set the weight to 0 to ensure all the pg's are removed from them,
> since I'm removing them (worn out ssd's).
On 2024/11/17 18:12, Anthony D'Atri wrote:
I see 5 OSDs with 0 CRUSH weight, is that intentional?
Yes, I set the weight to 0 to ensure all the pg's are removed from them,
since I'm removing them (worn out ssd's).
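(A sketch of the drain-and-verify step being described, using osd.39 since it
comes up later in the thread:)
   # push the CRUSH weight to 0 so placement moves off the OSD
   ceph osd crush reweight osd.39 0
   # verify: the PGS column should reach 0 once backfill finishes
   ceph osd df tree | grep -w osd.39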
I think I found the problem. I had created a CRUSH rule called old_ssd
(a
I see 5 OSDs with 0 CRUSH weight, is that intentional?
Notably:
> All the problem pg's are on osd.39.
osd.39 has 0 CRUSH weight, so CRUSH shouldn’t be placing any PGs there. Yet
there appear to be PGs mapped to the 4x 0 weight OSDs that are up. I had hoped
that the health detail would show
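(One way to see what is still mapped to a 0-weight OSD, again using osd.39:)
   # PGs whose up or acting set includes osd.39
   ceph pg ls-by-osd osd.39
   # up/acting sets for one specific PG (7.1a is a hypothetical PG id)
   ceph pg map 7.1a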
On 17/11/24 19:44, Roland Giesler wrote:
> On 2024/11/16 18:38, Anthony D'Atri wrote:
>> Disabling mclock as described here
>> https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/ might
>> help
>
> I cannot see any option that allows me to disable mclock...
It's not so much disabling mclock as changing the op queue scheduler to
use wpq instead of it.
On 2024/11/16 18:38, Anthony D'Atri wrote:
Disabling mclock as described here
https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/ might
help
I cannot see any option that allows me to disable mclock...
Also, you have a small cluster with a bunch of small OSDs. Please
send
All the problem pg's are on osd.39. When I stop osd.39, it shows that 86
pg's would be offline. However, no recovery happens; it just stays
there with 86 PGs stuck undersized+remapped+peered.
I managed to pin down all the pg groups that are in this state by using:
ceph pg dump | grep active+clea
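(There are also filters that avoid grepping the whole dump; a sketch:)
   # PGs stuck in a non-clean state, with their up/acting OSDs
   ceph pg dump_stuck unclean
   # or filter the PG listing by state
   ceph pg ls undersized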
On 2024/11/15 13:00, Gregory Orange wrote:
On 15/11/24 17:11, Roland Giesler wrote:
How do I determine the primary osd?
ceph pg map $pg
ceph pg $pg query | jq .info.stats.acting_primary
You can jq and less to take a look at other values which might be
informative too.
Ah, of course :-) Sor
On 15/11/24 17:11, Roland Giesler wrote:
> How do I determine the primary osd?
ceph pg map $pg
ceph pg $pg query | jq .info.stats.acting_primary
You can jq and less to take a look at other values which might be
informative too.
Greg.
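(For example, with a hypothetical PG id 7.1a:)
   ceph pg map 7.1a
   # prints something like: osdmap eNNNN pg 7.1a (7.1a) -> up [12,7,3] acting [12,7,3]
   ceph pg 7.1a query | jq .info.stats.acting_primary
   # acting_primary gives the OSD id directly; for replicated pools it is
   # normally the first entry of the acting set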
How do I determine the primary osd?
On 2024/11/14 16:12, Anthony D'Atri wrote:
You might also first try
ceph osd down 1701
This marks the OSD down in the map, it doesn’t restart anything, but it does
serve in some cases to goose progress. The OSD will quickly mark itself back up.
You might also first try
ceph osd down 1701
This marks the OSD down in the map, it doesn’t restart anything, but it does
serve in some cases to goose progress. The OSD will quickly mark itself back
up.
Where 1701 is the ID of said primary.
ceph health detail
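(Putting those together, a sketch assuming the acting primary of the stuck
PGs turns out to be osd.39:)
   ceph health detail | grep 'pg '   # which PGs are unhappy, and why
   ceph osd down 39                  # it will mark itself back up and the PGs re-peer
   ceph -s                           # check whether recovery starts moving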
I redid what I did before. I changed the osd class and had a look at
the storage and found a VM image! I deleted it (it was a copy) and now
the storage is empty. I guess the image (assigned to a non-existent
storage after reverting) left pg's that could not be moved. I'm
reverting now again
Hi Roland,
Yes, you can. See mclock documentation here [1].
One thing I can think of is that these 113 PGs may have a common misbehaving
OSD (primary or not) with a ridiculous osd_mclock_max_capacity_iops_ssd value
set.
Restarting the primary and/or adjusting the osd_mclock_max_capacity_iops_ssd
value
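(A sketch of inspecting and clearing a suspicious value so the OSD re-measures
it, with osd.39 standing in for whichever OSD is misbehaving:)
   # show any recorded per-OSD capacity overrides
   ceph config dump | grep osd_mclock_max_capacity_iops
   # drop the bogus value; per the mclock docs above, the OSD benchmarks and
   # stores a fresh value at its next startup
   ceph config rm osd.39 osd_mclock_max_capacity_iops_ssd
   # then restart that OSD (ceph orch daemon restart osd.39 on cephadm,
   # or systemctl on the host)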
On 2024/11/14 11:44, Joachim Kraftmayer wrote:
I know of similar behaviour when mclock is active.
For osd.0 I see:
osd.0 basic osd_mclock_max_capacity_iops_ssd 14305.161403
I'm unfamiliar with mclock. Can one tune that to improve the situation?
Roland
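(As a sketch of the tuning options beyond the per-OSD capacity value, the
coarse knob is the mclock profile:)
   # show the profile an OSD is currently using
   ceph config show osd.0 osd_mclock_profile
   # temporarily favour recovery/backfill over client I/O
   ceph config set osd osd_mclock_profile high_recovery_ops
   # switch back (balanced is the Reef default) once the cluster is clean
   ceph config set osd osd_mclock_profile balanced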
I know of similar behaviour when mclock is active.
Joachim
joachim.kraftma...@clyso.com
www.clyso.com
Hohenzollernstr. 27, 80801 Munich
Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306
Roland Giesler wrote on Thu., 14 Nov. 2024, 05:40:
> On 2024/11/13 21:05, Anthony D'
It's not clear to me if you wanted to add some more details after "I
see this:" (twice).
So you do see backfilling traffic if you out the OSD? Then maybe the
remapped PGs are not even on that OSD? Have you checked 'ceph pg ls
remapped'?
To drain an OSD, you can either set it "out" as you alre
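(A sketch of that check and the drain step, with osd.39 standing in for the
OSD in question:)
   # remapped PGs with their UP and ACTING sets; look for osd.39 in either column
   ceph pg ls remapped
   # drain it by marking it out; its CRUSH weight can stay as it is
   ceph osd out 39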
I had attached images, but these are not shown...
On 2024/11/14 10:12, Roland Giesler wrote:
On 2024/11/14 09:37, Eugen Block wrote:
Remapped PGs is exactly what to expect after removing (or adding) a
device class. Did you revert the change entirely? It sounds like you
maybe forgot to add the
On 2024/11/14 09:37, Eugen Block wrote:
Remapped PGs is exactly what to expect after removing (or adding) a
device class. Did you revert the change entirely? It sounds like you
maybe forgot to add the original device class back to the OSD where
you changed it? Maybe share 'ceph osd tree'? Do yo
Remapped PGs is exactly what to expect after removing (or adding) a
device class. Did you revert the change entirely? It sounds like you
maybe forgot to add the original device class back to the OSD where
you changed it? Maybe share 'ceph osd tree'? Do you have recovery IO
(ceph -s)? Does t
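(A sketch of fully reverting a device-class change, with osd.40 as a purely
hypothetical stand-in for the OSD whose class was changed:)
   # a device class has to be removed before a different one can be set
   ceph osd crush rm-device-class osd.40
   ceph osd crush set-device-class ssd osd.40
   # confirm the class and placement, and whether recovery I/O is flowing
   ceph osd tree | grep -w osd.40
   ceph -s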
On 2024/11/13 21:05, Anthony D'Atri wrote:
I would think that there was some initial data movement and that it all went
back when you reverted. I would not expect a mess.
  data:
    volumes: 1/1 healthy
    pools:   7 pools, 1586 pgs
    objects: 5.79M objects, 12 TiB
    usage:   24 TiB use
I would think that there was some initial data movement and that it all went
back when you reverted. I would not expect a mess.
> On Nov 13, 2024, at 12:48 PM, Roland Giesler wrote:
>
> I created a new osd class and changed the class of an osd to the new one
> without taking the osd out and s