[ceph-users] Re: OSDs ignore memory limit

2025-04-09 Thread Eric Petit
Just in case, make sure the Ceph builds you use do have tcmalloc enabled in the first place. The only time I’ve seen OSDs exceed their memory targets so far was on a Pacific cluster that used Debian 12-provided packages, and I eventually figured out that those had Crimson enabled - which comes with
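
A quick way to check which allocator a given ceph-osd build links against (a minimal sketch, assuming the stock binary path; adjust for your packaging):
    ldd /usr/bin/ceph-osd | grep -i tcmalloc   # prints a libtcmalloc line only if tcmalloc is linked in
    ceph tell osd.0 heap stats                 # heap statistics are only available with tcmalloc builds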

[ceph-users] Re: Migrating from S3 to Ceph RGW (Cloud Sync Module)

2025-04-09 Thread Gregory Orange
On 15/4/24 19:58, Ondřej Kukla wrote: > If you have quite a large amount of data, you can maybe try Chorus from > CLYSO. In March we completed a migration of 17PB of data between two local Ceph clusters using Chorus. It took some work to prepare network configurations and test it and increase

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Eugen Block
You would have to do that after each upgrade. Seems easier to just accept the debug logs. ;-) Quoting Alex: The official IBM and RH "fix" is to replace DEBUG with INFO in /var/lib/ceph//cephadm.hash ¯\_ (ツ) _/¯

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Alex
How does that work?

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Eugen Block
I haven't modified the logrotate config from the cephadm package. There's no cephadm process running; it's invoked every time the orchestrator does one of its checks (host, network, osd specs etc.). So I don't see a reason to tweak anything here, I stick to the defaults. Quoting Alex: H

[ceph-users] Re: NIH Datasets

2025-04-09 Thread Linas Vepstas
Hi Alex, The data purge is political. Data includes research on gun violence, sexually transmitted disease, you name it. The scientists keeping the data were keeping it for conventional science reasons. Labs are typically funded, and storage costs are paid for by "principal investigators": the big

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Alex
Thanks.

[ceph-users] Re: Ceph squid fresh install

2025-04-09 Thread Eugen Block
Hi, you can query the MON sessions to identify your older clients with 'ceph tell mon. sessions'. It will show you the IP address, con_features_release (Luminous) and a couple of other things. Quoting Laura Flores: Hi Rafael, I would not force the min_compat_client to be reef when there
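
For example, dumping the sessions of one monitor and counting pre-Reef clients (a sketch, assuming a mon id of 'a'; substitute your own):
    ceph tell mon.a sessions > /tmp/mon-a-sessions.json
    grep -i -c luminous /tmp/mon-a-sessions.json   # rough count of sessions reporting a luminous feature release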

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Anthony D'Atri
I have at times used admin sockets to tell daemons to re-open log files after running a truncate. > On Apr 9, 2025, at 3:51 PM, Eugen Block wrote: > > I haven't modified the logrotate config from the cephadm package. There's no > cephadm process running, it's invoked every time the orchestrator
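
A sketch of that technique for a daemon-owned log, assuming osd.0 on the local host (note that cephadm.log itself is written by the cephadm script on each invocation, not by a long-running daemon, so plain logrotate usually suffices there):
    truncate -s 0 /var/log/ceph/ceph-osd.0.log   # shrink the oversized log in place
    ceph daemon osd.0 log reopen                 # have the daemon reopen its log file descriptor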

[ceph-users] Re: OSDs ignore memory limit

2025-04-09 Thread Anthony D'Atri
> But checking the top output etc. doesn't confirm those values. I suspect a startup peak that subsides for steady-state operation. I observed this with mons back in Luminous. A cluster had been expanded considerably without restarting mons, so when they tried to restart there wasn’t enough

[ceph-users] Re: OSDs ignore memory limit

2025-04-09 Thread Eugen Block
Then I suggest doing the usual troubleshooting [0], not necessarily in this order: - osd logs - ceph tell osd.X heap stats - ceph osd df tree (to look for unbalanced PG distribution) - check tracker.ceph.com for existing issues - How are the nodes equipped RAM-wise? - Are the oom killers happen
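
Spelled out for an example OSD id of 0 (a sketch; the dmesg check needs to be run on the affected node):
    ceph tell osd.0 heap stats                      # allocator (tcmalloc) statistics for the daemon
    ceph osd df tree                                # PG counts and utilization per OSD and host
    dmesg -T | grep -i -e oom -e "out of memory"    # check whether the kernel OOM killer actually fired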

[ceph-users] Re: ceph deployment best practice

2025-04-09 Thread Anthony D'Atri
> > We would start deploying Ceph with 4 hosts (HP ProLiant servers), each > running RockyLinux 9. > > One of the hosts, called ceph-adm, will be a smaller one and will have the > following hardware: > > 2x4T SSD with RAID 1 to install the OS on. > > 8 cores at 3600MHz. > > 64G RAM > > We are

[ceph-users] Re: Diskprediction_local mgr module removal - Call for feedback

2025-04-09 Thread Anthony D'Atri
> On Apr 9, 2025, at 7:01 AM, Jan Marek wrote: > > I would like to vote against removing the diskprediction module from Ceph. Have you personally seen it be useful?

[ceph-users] Re: Ceph squid fresh install

2025-04-09 Thread quag...@bol.com.br

[ceph-users] Re: Diskprediction_local mgr module removal - Call for feedback

2025-04-09 Thread Anthony D'Atri
Lots of things anyone can do. Note my participation in the referenced #205 over a year ago. At the time it wasn’t going anywhere, after months of quibbling. If there is finally code there that supports multiple occulting HBAs, a mixture of native, passed-through, and occulted drives without m

[ceph-users] Re: OSDs ignore memory limit

2025-04-09 Thread Eugen Block
I noticed the quite high reported memory stats for OSDs as well on a recently upgraded customer cluster, now running 18.2.4. But checking the top output etc. doesn't confirm those values. I don't really know where they come from, tbh. Can you confirm that those are actually OSD processes fil

[ceph-users] Re: OSDs ignore memory limit

2025-04-09 Thread Jonas Schwab
Yes, it's the ceph-osd processes filling up the RAM. On 2025-04-09 15:13, Eugen Block wrote: I noticed the quite high reported memory stats for OSDs as well on a recently upgraded customer cluster, now running 18.2.4. But checking the top output etc. doesn't confirm those values. I don't really
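
One way to cross-check on an OSD node (a sketch, assuming osd.0 runs locally):
    ps -o pid,rss,cmd -C ceph-osd                 # the kernel's view: resident set size per ceph-osd, in KiB
    ceph daemon osd.0 dump_mempools | head -n 40  # the OSD's own accounting of its memory pools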

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Alex
Haha, yeah, I think that's what we're doing. I'm just going to add it to logrotate. Do you use the copytruncate option or postrotate to restart Ceph?

[ceph-users] Re: Diskprediction_local mgr module removal - Call for feedback

2025-04-09 Thread Konstantin Shalygin
Hi, You can always consult the Releases page [1]. Thanks, k [1] https://github.com/prometheus-community/smartctl_exporter/releases Sent from my iPhone > On 9 Apr 2025, at 17:51, Anthony D'Atri wrote: > > Unless something has changed with smartctl_exporter, there wasn’t working > support for

[ceph-users] Re: Diskprediction_local mgr module removal - Call for feedback

2025-04-09 Thread Anthony D'Atri
Unless something has changed with smartctl_exporter, there wasn’t working support for drives behind a RAID HBA. When I looked, there was potential for harmonizing metrics, though necessarily by editing the Go code and recompiling. > On Apr 9, 2025, at 2:34 AM, Konstantin Shalygin wrote:

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Eugen Block
This is from an openSUSE base OS:
# cat /etc/logrotate.d/cephadm
# created by cephadm
/var/log/ceph/cephadm.log {
        rotate 7
        daily
        compress
        missingok
        notifempty
        su root root
}
Quoting Alex: I did have to add "su root root" to the logrotate script to fix the permissio
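
For Alex's copytruncate question above: a variant that never touches the writer could look like this (untested sketch; since cephadm reopens the log on every invocation anyway, the stock config is usually enough):
    /var/log/ceph/cephadm.log {
            rotate 7
            daily
            compress
            copytruncate
            missingok
            notifempty
            su root root
    }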

[ceph-users] Re: OSDs ignore memory limit

2025-04-09 Thread Janne Johansson
> >> killing processes. Does someone have ideas why the daemons seem > >> to completely ignore the set memory limits? > > Remember that osd_memory_target is a TARGET, not a LIMIT. Upstream docs > suggest an aggregate 20% headroom, personally I like 100% headroom, but > that’s informed by some pri
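
As a concrete illustration of that headroom arithmetic (an assumption-laden sketch, not from the thread): a 64 GiB node running 10 OSDs with ~20% headroom gives roughly 64 * 0.8 / 10 ≈ 5 GiB per OSD:
    ceph config set osd osd_memory_target 5368709120   # ~5 GiB default target for all OSDs
    ceph config get osd osd_memory_target               # verify the effective value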

[ceph-users] Image Live-Migration does not respond to OpenStack Glance images

2025-04-09 Thread Yuta Kambe (Fujitsu)
Hi everyone. I am trying Image Live-Migration, but it is not working well and I would like some advice. https://docs.ceph.com/en/latest/rbd/rbd-live-migration/ I use Ceph as a backend for OpenStack Glance. I tried to migrate the Ceph pool used by Glance to a new pool. Source Pool: - images
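
For reference, the basic flow from the linked live-migration documentation, with placeholder pool/image names (a sketch; the source image must not be in use by Glance while the migration is prepared):
    rbd migration prepare images/my-image images-new/my-image   # link the source to the new target image
    rbd migration execute images-new/my-image                   # copy the data in the background
    rbd migration commit images-new/my-image                    # finalize and remove the source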

[ceph-users] Repo name bug?

2025-04-09 Thread Alex
Good morning everyone. Does the preflight playbook have a bug? https://github.com/ceph/cephadm-ansible/blob/devel/cephadm-preflight.yml Line 82: paths: "{{ ['noarch', '$basearch'] if ceph_origin == 'community' else ['$basearch'] }}" The yum repo file then gets named ceph_stable_$basearch. Sh

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Alex
Thanks Eugen! I think you're right, since support had me grep for the same code. Seems crazy that it's hardcoded though, doesn't it? I guess we can mod the Python file, but you'd think that wouldn't be necessary. Should we make a feature request or modify the code ourselves and make a pull request?

[ceph-users] ceph deployment best practice

2025-04-09 Thread gagan tiwari
Hi Guys, We have an HPC environment which currently has a single master host that stores the entire dataset, around 100T, and exports that data via NFS to the clients. We are using OpenZFS on the single master host. But now we need to store much more data, around 500T, and we are fa

[ceph-users] Re: Cephadm flooding /var/log/ceph/cephadm.log

2025-04-09 Thread Alex
The official IBM and RH "fix" is to replace DEBUG with INFO in /var/lib/ceph//cephadm.hash ¯\_ (ツ) _/¯
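
If one does go that route, the change amounts to something like this (hedged sketch; <fsid> and <hash> are placeholders for the cluster-specific path components, and the edit is lost whenever the cephadm binary is redeployed or upgraded):
    grep -n DEBUG /var/lib/ceph/<fsid>/cephadm.<hash>      # review the matches first; the sed below is blunt
    sed -i 's/DEBUG/INFO/g' /var/lib/ceph/<fsid>/cephadm.<hash>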

[ceph-users] Re: Ceph squid fresh install

2025-04-09 Thread Eugen Block
It's not necessarily a ceph mount that shows older features, but the kernel version might still be relevant. Just a quick example, a fresh 19.2.1 install on a virtual machine with kernel:
soc9-ceph:~ # uname -r
6.4.0-150600.23.33-default
soc9-ceph:~ # ceph versions
{ "mon": { "ce
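
Two quick commands to see what the connected clients actually advertise (standard Ceph CLI, nothing cluster-specific assumed):
    ceph versions    # daemon versions grouped by type
    ceph features    # feature bits of connected clients/daemons, grouped by release name (e.g. luminous)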

[ceph-users] OSDs ignore memory limit

2025-04-09 Thread Jonas Schwab
Hello everyone, I have recently had many problems with OSDs using much more memory than they are supposed to (> 10GB), leading to the node running out of memory and killing processes. Does anyone have ideas why the daemons seem to completely ignore the set memory limits? See e.g. the following: $
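
For anyone comparing notes, a quick first pass might look like this (sketch; 'ceph orch ps' assumes a cephadm-managed cluster):
    ceph config get osd.0 osd_memory_target   # what osd.0 is supposed to aim for
    ceph orch ps --daemon-type osd            # reported memory use vs. limit per OSD daemon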

[ceph-users] Re: RBD Block alignment 16k for Databases

2025-04-09 Thread Eugen Block
Hi, the default rbd chunk size is 4 MB; it's the "order" that defines the chunk size:
rbd info pool/volume1
rbd image 'volume1':
        size 8 GiB in 2048 objects
        order 22 (4 MiB objects)
You can create images with a different order using the '--object-size' parameter: rbd create
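
For instance, for the 16k case from the subject (a sketch with made-up pool/image names; note that small objects mean a very large object count):
    rbd create --size 100G --object-size 16K pool/dbvol   # order 14; ~6.5 million objects for 100G
    rbd info pool/dbvol                                   # should report "order 14 (16 KiB objects)"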

[ceph-users] Re: Diskprediction_local mgr module removal - Call for feedback

2025-04-09 Thread Jan Marek
Hello, I would like to vote against removing the diskprediction module from Ceph. Sincerely, Jan Marek On Tue, Apr 08, 2025 at 09:59:34 CEST, Michal Strnad wrote: > Hi. > > From our point of view, it's important to keep the disk failure prediction tool > as part of Ceph, ideally as an MGR module. In envir