> Thank you for your explanations and references. I will check them all. In the
> meantime it turned out that the disks for Ceph will come from SAN
Be prepared for the additional latency and amplified network traffic.
> Probably in this case the per OSD CPU cores can be lowered to 2 CPU/OSD. B
Hello,
a cluster belonging to a friend of mine has a strange issue with CephFS permissions.
Ceph Version is 17.2.7 in cephadm.
The Cluster was much older and was upgraded over years from 12 and moved to
new hosts.
When I create a CephFS user for my private cluster with "ceph fs authorize", I
get permissions like
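(For reference, on a recent release the command and the caps it generates usually look roughly like the following; the fs name "cephfs" and the client name "foo" are just placeholders:)

  ceph fs authorize cephfs client.foo / rw
  ceph auth get client.foo
  # typically shows caps along the lines of:
  #   caps mds = "allow rw fsname=cephfs"
  #   caps mon = "allow r fsname=cephfs"
  #   caps osd = "allow rw tag cephfs data=cephfs"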
We’re running Quincy 17.2.7 here,
and we see the IOPS benchmark performed on OSD start:
2024-09-20T09:57:26.265+1000 7facbdc64540 1 osd.196 2879010 maybe_override_max_osd_capacity_for_qos osd bench result - bandwidth (MiB/sec): 4.369 iops: 1118.445 elapsed_sec: 2.682
2024-09-20T09:57:26.265+100
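(If you want to see what value mclock actually recorded from that bench, something like the following should show it; osd.196 is just the OSD from the log above, and the option name assumes rotational OSDs:)

  ceph config get osd.196 osd_mclock_max_capacity_iops_hdd
  ceph config dump | grep osd_mclock_max_capacity_iops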
Ah, yes, that's a good point - if there's backfill going on then
buildup like this can happen.
On Thu, Sep 19, 2024 at 10:08 AM Konstantin Shalygin wrote:
>
> Hi,
>
> On 19 Sep 2024, at 18:26, Joshua Baergen wrote:
>
> Whenever we've seen osdmaps not being trimmed, we've made sure that
> any dow
Hi,
> On 19 Sep 2024, at 18:26, Joshua Baergen wrote:
>
> Whenever we've seen osdmaps not being trimmed, we've made sure that
> any down OSDs are out+destroyed, and then have rolled a restart
> through the mons. As of recent Pacific at least this seems to have
> reliably gotten us out of this si
Whenever we've seen osdmaps not being trimmed, we've made sure that
any down OSDs are out+destroyed, and then have rolled a restart
through the mons. As of recent Pacific at least this seems to have
reliably gotten us out of this situation.
Josh
On Thu, Sep 19, 2024 at 9:14 AM Igor Fedotov wrote
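(A rough sketch of the steps Josh describes above, with OSD id 123 as a placeholder and the restart command depending on how your mons are deployed:)

  ceph osd out 123
  ceph osd destroy 123 --yes-i-really-mean-it
  # then restart the mons one at a time, e.g.
  ceph orch daemon restart mon.<hostname>     # cephadm-managed
  systemctl restart ceph-mon@<hostname>       # package-based install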
Here it goes beyond my expertise.
I've seen unbounded osdmap epoch growth in two completely different cases,
and I'm unable to say what's causing it this time.
But IMO you shouldn't do any osdmap trimming yourself - that could
likely result in unpredictable behavior. So I'd encourage you to fi
Hi,
I didn't notice any changes in the counts after running check --fix |
check --check-objects --fix. Also, the bucket isn't versioned.
I will take a look at the index vs the radoslist. Which side would cause
the 'invalid_multipart_entries'?
Thanks
On Thu, Sep 19, 2024 at 5:50 AM Frédéric N
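(One way to do that comparison, with the bucket name as a placeholder: dump both sides and grep/diff the multipart-looking entries. "bi list" shows what the bucket index thinks exists, while "radoslist" shows which RADOS objects the index/manifests point at.)

  radosgw-admin bi list --bucket=mybucket > index.json
  radosgw-admin bucket radoslist --bucket=mybucket > radoslist.txt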
Igor, thanks, very helpful.
Our current osdmap weighs 1.4 MB, and that changes all the calculations.
Looks like it can be our case.
I think we have this situation due to the long backfilling that is in progress
now and has been going on for the last 3 weeks.
Can we drop some amount of osdmaps before the rebalance completes?
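(For scale, a rough back-of-the-envelope from the numbers in this thread: 2637838 - 2408326 = 229,512 epochs retained, and at ~1.4 MB per full map that is about 229,512 x 1.4 MB ≈ 320 GB of full osdmaps per OSD, before counting incrementals. The ~459k meta objects reported later in the thread, roughly twice the epoch span, fit a full map plus an incremental per epoch.)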
please see my comments inline.
On 9/19/2024 1:53 PM, Александр Руденко wrote:
Igor, thanks!
> What are the numbers today?
Today we have the same "oldest_map": 2408326 and "newest_map":
2637838, *+2191*.
ceph-objectstore-tool --op meta-list --data-path
/var/lib/ceph/osd/ceph-70 | grep osdm
Hi,
we had a problem with a Ceph OSD stuck in snaptrim. We set the Ceph OSD
down temporarily to potentially solve the issue. This ended up kernel
panicking two of our Kubernetes workers that use this Ceph cluster for RBD
and CephFS directly and via CSI driver.
The more notable lines were these af
Hi,
> On 19 Sep 2024, at 12:33, Igor Fedotov wrote:
>
> osd_target_transaction_size should control that.
>
> I've heard of it being raised to 150 with no obvious issues. Going beyond is
> at your own risk. So I'd suggest applying incremental increases if needed.
Thanks! Now much better
k
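(For the record, the knob Igor mentions is a runtime option, so the bump is presumably just something like:)

  ceph config set osd osd_target_transaction_size 150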
I think the advice is not to use floating tags (e.g. "latest") but to use
specific tags if possible.
I believe you can achieve what you want with either:
"ceph orch upgrade --image "
I'm not sure if this allows you to downgrade, but it certainly lets you upgrade
and change the image; see Upgrading Ceph — Cep
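(i.e. something along these lines; the image/version shown is just an example:)

  ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.4
  # or pin by version rather than image:
  ceph orch upgrade start --ceph-version 18.2.4
  ceph orch upgrade status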
Hi,
the problem comes from older ceph releases. In our case, hdd iops were
benchmarked in the range of 250 to 4000, which clearly makes no sense.
At osd startup, the benchmark is skipped if that value is already in
ceph config, so these initial benchmark values were never changed. To
reset th
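(Presumably the reset is along these lines; osd.12 is a placeholder, and removing the key means the bench runs again on the next restart unless you pin an explicit value instead:)

  ceph config rm osd.12 osd_mclock_max_capacity_iops_hdd
  # or pin a sane value:
  ceph config set osd.12 osd_mclock_max_capacity_iops_hdd 250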
Hi Reid,
I see. It seems weird that the --fix command output shows no differences
between existing_header and calculated_header after it cleaned up some index
entries (removing manifest part from index).
Have you tried running the stats command again to see if any figures were
updated? Based
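(i.e. presumably something like the following, with the bucket name as a placeholder:)

  radosgw-admin bucket stats --bucket=mybucket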
Hello
Recently I deployed a ceph cluster (version: reef) in my lab and after
that, I deployed RGW using this manifest:
service_type: rgw
service_id: lab-object-storage
placement:
  label: rgw
  count_per_host: 1
spec:
  rgw_frontend_port: 8080
Now I have an RGW container. The Docker image is: quay
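(For reference, a spec like that is usually applied and checked with something like the following; the filename is a placeholder:)

  ceph orch apply -i rgw-spec.yaml
  ceph orch ls rgw
  ceph orch ps --daemon-type rgw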
Igor, thanks!
> What are the numbers today?
Today we have the same "oldest_map": 2408326 and "newest_map": 2637838,
*+2191*.
ceph-objectstore-tool --op meta-list --data-path /var/lib/ceph/osd/ceph-70
| grep osdmap | wc -l
458994
Can you clarify this, please:
> and then multiply by amount of OS
Hi Konstantin,
osd_target_transaction_size should control that.
I've heard of it being raised to 150 with no obvious issues. Going
beyond is at your own risk. So I'd suggest applying incremental increases
if needed.
Thanks,
Igor
On 9/19/2024 10:44 AM, Konstantin Shalygin wrote:
Hi Igor,
Hi Alexander,
so newest_map is slowly growing, and (which is worse) oldest_map is
constant. That means no old-map pruning is happening and more and more
maps are accumulating.
What are the numbers today?
You can assess the number of objects in "meta" pool (that's where
osdmaps are kept) for
Oh, by the way, since 35470 is nearly twice 18k, couldn't it be that the
source bucket is versioned and the destination bucket only got the most recent
copy of each object?
Regards,
Frédéric.
- On 18 Sep 24, at 20:39, Reid Guyett wrote:
> Hi Frederic,
> Thanks for those notes.
>
Hi Denis,
we observed the same behaviour here. The cause was that the number of
iops discovered at OSD startup was way too high. In our setup the
rocksdb is on flash.
When I set osd_mclock_max_capacity_iops_hdd to a value that the HDDs
could handle, the situation was resolved, clients got th
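(i.e. an override roughly like the following, where the value is whatever your HDDs can realistically sustain:)

  ceph config set osd osd_mclock_max_capacity_iops_hdd 250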
On 19-09-2024 05:10, Anthony D'Atri wrote:
Anthony,
So it sounds like I need to make a new crush rule for replicated pools that
specifies default-hdd and the device class? (Or should I go the other way
around? I think I'd rather change the replicated pools even though there's
more of th
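(A device-class rule is typically created and applied along these lines; the rule and pool names are placeholders, and the root should match your CRUSH map:)

  ceph osd crush rule create-replicated replicated-hdd default host hdd
  ceph osd pool set mypool crush_rule replicated-hdd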
Hi Igor,
> On 18 Sep 2024, at 18:22, Igor Fedotov wrote:
>
> I recall a couple of cases where permanent osdmap epoch growth ended up
> filling OSDs with the corresponding osdmap data, which can be tricky to catch.
>
> Please run "ceph tell osd.N status" for a couple of affected OSDs twice
> within e.
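(i.e. run it twice a few minutes apart and compare the "oldest_map"/"newest_map" epochs it reports:)

  ceph tell osd.196 status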