[ceph-users] Re: MDS cache always increasing
It was worse with 1 MDS, therefore we moved to 2 active MDS with directory pinning (so the balancer won't be an issue or make things extra complicated). The number of caps stays mostly the same, with some ups and downs.

My guess is that it has something to do with caching the accessed directories or files: memory usage increases a lot the first time rsync runs, while on the second run there is hardly any increase, only a small bump while rsync is running, and it drops again afterwards.

NFS isn't really an option because it adds another hop for the clients :( Also, this happens in our production environment and I won't be making any changes there just for a test. I will try to replicate it in our staging environment, but that one has a lot less load on it.

Kind regards,
Sake

> Op 31-08-2024 09:15 CEST schreef Alexander Patrakov :
>
> Got it.
>
> However, to narrow down the issue, I suggest that you test whether it still exists after the following changes:
>
> 1. Reduce max_mds to 1.
> 2. Do not reduce max_mds to 1, but migrate all clients from a direct CephFS mount to NFS.
>
> On Sat, Aug 31, 2024 at 2:55 PM Sake Ceph wrote:
> >
> > I was talking about the hosts where the MDS containers are running on. The clients are all RHEL 9.
> >
> > Kind regards,
> > Sake
> >
> > > Op 31-08-2024 08:34 CEST schreef Alexander Patrakov :
> > >
> > > Hello Sake,
> > >
> > > The combination of two active MDSs and RHEL8 does ring a bell, and I have seen this with Quincy, too. However, what's relevant is the kernel version on the clients. If they run the default 4.18.x kernel from RHEL8, please either upgrade to the mainline kernel or decrease max_mds to 1. If they run a modern kernel, then it is something I do not know about.
> > >
> > > On Sat, Aug 31, 2024 at 1:21 PM Sake Ceph wrote:
> > > >
> > > > @Anthony: it's a small virtualized cluster and indeed SWAP shouldn't be used, but this doesn't change the problem.
> > > >
> > > > @Alexander: the problem is in the active nodes, the standby replay don't have issues anymore.
> > > >
> > > > Last night's backup run increased the memory usage to 86% when rsync was running for app2. It dropped to 77,8% when it was done. When the rsync for app4 was running it increased to 84% and dropping to 80%. After a few hours it's now settled on 82%.
> > > > It looks to me the MDS server is caching something forever while it isn't being used..
> > > >
> > > > The underlying host is running on RHEL 8. Upgrade to RHEL 9 is planned, but hit some issues with automatically upgrading hosts.
> > > >
> > > > Kind regards,
> > > > Sake
> > >
> > > --
> > > Alexander Patrakov
>
> --
> Alexander Patrakov
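For reference, a minimal sketch of the commands behind the setup and test discussed above, assuming the filesystem is named cephfs and the application trees are mounted under /mnt/cephfs (both names are placeholders):

  # Pin one application tree to MDS rank 0 and another to rank 1, so the
  # balancer never migrates them (ceph.dir.pin is the standard CephFS
  # extended attribute for static pinning).
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/app2
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/app4

  # Alexander's first suggestion: go back to a single active MDS.
  ceph fs set cephfs max_mds 1

Pins are inherited by subdirectories, so pinning the top-level application directory is usually enough.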
[ceph-users] How many MDS & MON servers are required
Hi,

We are using CephFS with 182 TB of raw data at replica 2, and a single MDS regularly runs at around 4002 req/s. How many MDS and MON servers are required?

The current ceph cluster layout, for reference:

Clients: 60
MDS: 3 (2 active + 1 standby)
MON: 4
MGR: 3 (1 active + 2 standby)
OSD: 52
PG: auto scale
[ceph-users] Re: How many MDS & MON servers are required
The number of mons should ideally be odd; for production, 5 is usually the right number. MDS sizing is a more complicated question.

> On Aug 30, 2024, at 2:24 AM, s.dhivagar@gmail.com wrote:
>
> Hi,
>
> We are using raw cephfs data 182 TB replica 2 and single MDS seemed to regularly run around 4002 req/s, So how many MDS & MON servers are required?
>
> Also mentioned current ceph cluster servers
>
> Client : 60
> MDS: 3 (2 Active + 1 Standby)
> MON: 4
> MGR: 3 (1 Active + 2 Standby)
> OSD: 52
> PG : auto scale
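A hedged sketch of how to check the current state before changing anything, assuming a cephadm-managed cluster (adapt if the cluster was deployed differently):

  # How many monitors exist and are in quorum right now
  ceph mon stat

  # Per-MDS request rate and cache size; the Reqs column is where a
  # figure like ~4002 req/s shows up
  ceph fs status

  # With cephadm, grow the monitor set from 4 to 5; the orchestrator
  # picks hosts unless an explicit placement is given
  ceph orch apply mon 5

Going from 4 to 5 mons does not just add capacity; it restores an odd-sized quorum, so the cluster can lose two monitors and still keep quorum.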
[ceph-users] Re: ceph-ansible installation error
Den lör 31 aug. 2024 kl 15:42 skrev Tim Holloway :
>
> I would greatly like to know what the rationale is for avoiding containers.
>
> Especially in large shops. From what I can tell, you need to use the containerized Ceph if you want to run multiple Ceph filesystems on a single host. The legacy installations only support dumping everything directly under /var/lib/ceph, so you'd have to invest a lot of effort into installing, maintaining and operating a second fsid under the legacy architecture.

Using two fsids on one machine is far outside our scope for the 10-or-so clusters we run. Not saying no one does it, but it was frowned upon to have multiple cluster names on the same host, so I guess most people took that to also include multiple fsids running in parallel on the same host, even if the cluster name was the same.

> The only definite argument I've ever heard in my insular world against containers was based on security. Yet the primary security issues seemed to be more because people were pulling insecure containers from Docker repositories. I'd expect Ceph to have safeguards. Plus Ceph under RHEL 9 (and 8?) will run entirely and preferably under Podman, which allegedly is more secure, and can in fact, run containers under user accounts to allow additional security. I do that myself, although I think the mechanisms could stand some extra polishing.

From what I see on IRC and the mailing lists, the container setup sometimes seems to end up recreating containers with new/unique fsids, as if it forgot the old cluster and decided to invent a new one. This combines well (sarcastically) with those ceph admins' inability to easily enter the containers and/or read out logs from the old/missing containers to figure out what happened and why this new mon container wants to reinvent the cluster instead of joining the existing one. I know bugs are bugs, but wrapping it all into an extra layer is not helping new ceph admins when it breaks. We have a decent page on PG repairs of various kinds, but perhaps not as much on what to do when the orchestrator isn't orchestrating?

Containers help them set the initial cluster up tons faster, but it seems as if it leads them into situations where the container's ephemeral state is actively working against their ability to figure out what went wrong and what the actual cause was. Perhaps it is clusters that were adopted into the new style, perhaps they run the containers in the wrong way, but there is a certain number of posts along the lines of "I pressed the button for totally automated (re)deploy of X, Y and Z and it doesn't work". I would not like to end up in that situation while at the same time handling real customers who wonder why our storage is not serving IO at this moment.

Doing installs 'manually' is far from optimal, but at least I know the logs end up under /var/log/ceph/<cluster>-<name>.log and they stay there even if the OSD disk is totally dead and gone.

--
May the most significant bit of your life be positive.
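For anyone stuck in exactly that spot, a rough sketch of where the containerized equivalents live, assuming a cephadm/podman deployment (daemon and host names below are placeholders):

  # List the daemons cephadm manages on this host
  cephadm ls

  # Read a daemon's journal (wraps journalctl for the matching unit)
  cephadm logs --name mon.host1

  # Or go through systemd directly; units are named ceph-<fsid>@<daemon>.service
  journalctl -u ceph-<fsid>@mon.host1.service

  # Open a shell inside the daemon's container
  cephadm enter --name mon.host1

  # Optionally bring back plain file logs under /var/log/ceph/<fsid>/
  ceph config set global log_to_file true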
[ceph-users] Re: ceph-ansible installation error
Den fre 30 aug. 2024 kl 20:43 skrev Milan Kupcevic :
>
> On 8/30/24 12:38, Tim Holloway wrote:
> > I believe that the original Ansible installation process is deprecated.
>
> This would be a bad news as I repeatedly hear from admins running large storage deployments that they prefer to stay away from containers.

You have other choices than Ansible or containers. It has always been possible to install using rpm/debs manually, using any kind of scripts or frameworks. The point is that the ceph people no longer try to make pre-made ansible scripts that work "everywhere", because they didn't.

This does not prevent you in any way from avoiding containers (if that is what you want), but it makes you responsible for figuring out the automation part if you need one.

--
May the most significant bit of your life be positive.
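As an illustration of the rpm route, a sketch of a package-based (non-containerized) install on an EL9 host; the repo package name comes from the CentOS Storage SIG and may differ on your distribution, so treat it as an assumption:

  # Enable the Ceph Reef repository (CentOS Storage SIG packaging)
  dnf install -y centos-release-ceph-reef

  # Install the daemons you plan to run on this host, plus the CLI
  dnf install -y ceph-mon ceph-mgr ceph-osd ceph-mds ceph-common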
[ceph-users] Re: ceph-ansible installation error
I would greatly like to know what the rationale is for avoiding containers.

Especially in large shops. From what I can tell, you need to use the containerized Ceph if you want to run multiple Ceph filesystems on a single host. The legacy installations only support dumping everything directly under /var/lib/ceph, so you'd have to invest a lot of effort into installing, maintaining and operating a second fsid under the legacy architecture.

Plus, IBM/Red Hat is a big fan of containers, so if you're a large corporation that likes IBM hand-holding, they're throwing their support in a direction contrary to the old install-directly approach. And from an IBM viewpoint, supporting containers is generally going to be easier than supporting software that's splattered directly all over the OS, and much less overhead than spinning up an entire VM.

The only definite argument I've ever heard in my insular world against containers was based on security. Yet the primary security issues seemed to be more because people were pulling insecure containers from Docker repositories. I'd expect Ceph to have safeguards. Plus, Ceph under RHEL 9 (and 8?) will run entirely and preferably under Podman, which allegedly is more secure and can, in fact, run containers under user accounts for additional security. I do that myself, although I think the mechanisms could stand some extra polishing.

Tim

On Sat, 2024-08-31 at 09:49 +0200, Janne Johansson wrote:
> Den fre 30 aug. 2024 kl 20:43 skrev Milan Kupcevic :
> >
> > On 8/30/24 12:38, Tim Holloway wrote:
> > > I believe that the original Ansible installation process is deprecated.
> >
> > This would be a bad news as I repeatedly hear from admins running large storage deployments that they prefer to stay away from containers.
>
> You have other choices than Ansible or containers. It has always been possible to install using rpm/deb's manually, using any kind of scripts or frameworks. The point is that the ceph people no longer tries to make pre-made ansible scripts that work "everywhere", because they didn't.
>
> This does not prevent you in any way from avoiding containers (if that is what you want) but it makes you responsible for figuring out the automation part if you need one.
[ceph-users] Re: MDS cache always increasing
Oh, it got worse after the upgrade to Reef (we were running Quincy). With Quincy the memory usage was also often around 95%, plus some swap usage, but it never exceeded both to the point of crashing.

Kind regards,
Sake

> Op 31-08-2024 09:15 CEST schreef Alexander Patrakov :
>
> Got it.
>
> However, to narrow down the issue, I suggest that you test whether it still exists after the following changes:
>
> 1. Reduce max_mds to 1.
> 2. Do not reduce max_mds to 1, but migrate all clients from a direct CephFS mount to NFS.
>
> On Sat, Aug 31, 2024 at 2:55 PM Sake Ceph wrote:
> >
> > I was talking about the hosts where the MDS containers are running on. The clients are all RHEL 9.
> >
> > Kind regards,
> > Sake
> >
> > > Op 31-08-2024 08:34 CEST schreef Alexander Patrakov :
> > >
> > > Hello Sake,
> > >
> > > The combination of two active MDSs and RHEL8 does ring a bell, and I have seen this with Quincy, too. However, what's relevant is the kernel version on the clients. If they run the default 4.18.x kernel from RHEL8, please either upgrade to the mainline kernel or decrease max_mds to 1. If they run a modern kernel, then it is something I do not know about.
> > >
> > > On Sat, Aug 31, 2024 at 1:21 PM Sake Ceph wrote:
> > > >
> > > > @Anthony: it's a small virtualized cluster and indeed SWAP shouldn't be used, but this doesn't change the problem.
> > > >
> > > > @Alexander: the problem is in the active nodes, the standby replay don't have issues anymore.
> > > >
> > > > Last night's backup run increased the memory usage to 86% when rsync was running for app2. It dropped to 77,8% when it was done. When the rsync for app4 was running it increased to 84% and dropping to 80%. After a few hours it's now settled on 82%.
> > > > It looks to me the MDS server is caching something forever while it isn't being used..
> > > >
> > > > The underlying host is running on RHEL 8. Upgrade to RHEL 9 is planned, but hit some issues with automatically upgrading hosts.
> > > >
> > > > Kind regards,
> > > > Sake
> > >
> > > --
> > > Alexander Patrakov
>
> --
> Alexander Patrakov
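To see whether this growth really is MDS cache (and whether the allocator is just holding on to freed pages), a sketch of the usual checks; mds.x stands in for the actual daemon name, and the commands may need to be run via cephadm shell or inside the MDS container in a containerized deployment:

  # Configured cache target versus what the MDS reports it is using
  ceph config get mds mds_cache_memory_limit
  ceph daemon mds.x cache status

  # Heap as seen by the allocator; "heap release" hands freed pages
  # back to the OS if tcmalloc is hoarding them
  ceph tell mds.x heap stats
  ceph tell mds.x heap release

  # Ask the MDS to trim its cache and recall client caps (timeout in seconds)
  ceph tell mds.x cache drop 120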
[ceph-users] Re: MDS cache always increasing
As a workaround, to reduce the impact of the MDS being slowed down by excessive memory consumption, I would suggest installing earlyoom, disabling swap, and configuring earlyoom as follows (usually through /etc/sysconfig/earlyoom, but it could be in a different place on your distribution):

EARLYOOM_ARGS="-p -r 600 -m 4,4 -s 1,1"

On Sat, Aug 31, 2024 at 3:44 PM Sake Ceph wrote:
>
> Ow it got worse after the upgrade to Reef (was running Quincy). With Quincy the memory usage was also a lot of times around 95% and some swap usage, but never exceeding both to the point of crashing.
>
> Kind regards,
> Sake
>
> > Op 31-08-2024 09:15 CEST schreef Alexander Patrakov :
> >
> > Got it.
> >
> > However, to narrow down the issue, I suggest that you test whether it still exists after the following changes:
> >
> > 1. Reduce max_mds to 1.
> > 2. Do not reduce max_mds to 1, but migrate all clients from a direct CephFS mount to NFS.
> >
> > On Sat, Aug 31, 2024 at 2:55 PM Sake Ceph wrote:
> > >
> > > I was talking about the hosts where the MDS containers are running on. The clients are all RHEL 9.
> > >
> > > Kind regards,
> > > Sake
> > >
> > > > Op 31-08-2024 08:34 CEST schreef Alexander Patrakov :
> > > >
> > > > Hello Sake,
> > > >
> > > > The combination of two active MDSs and RHEL8 does ring a bell, and I have seen this with Quincy, too. However, what's relevant is the kernel version on the clients. If they run the default 4.18.x kernel from RHEL8, please either upgrade to the mainline kernel or decrease max_mds to 1. If they run a modern kernel, then it is something I do not know about.
> > > >
> > > > On Sat, Aug 31, 2024 at 1:21 PM Sake Ceph wrote:
> > > > >
> > > > > @Anthony: it's a small virtualized cluster and indeed SWAP shouldn't be used, but this doesn't change the problem.
> > > > >
> > > > > @Alexander: the problem is in the active nodes, the standby replay don't have issues anymore.
> > > > >
> > > > > Last night's backup run increased the memory usage to 86% when rsync was running for app2. It dropped to 77,8% when it was done. When the rsync for app4 was running it increased to 84% and dropping to 80%. After a few hours it's now settled on 82%.
> > > > > It looks to me the MDS server is caching something forever while it isn't being used..
> > > > >
> > > > > The underlying host is running on RHEL 8. Upgrade to RHEL 9 is planned, but hit some issues with automatically upgrading hosts.
> > > > >
> > > > > Kind regards,
> > > > > Sake
> > > >
> > > > --
> > > > Alexander Patrakov
> >
> > --
> > Alexander Patrakov

--
Alexander Patrakov
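As a complement, a sketch of how that workaround could be rolled out on an EL-family host; the package source (EPEL), the fstab one-liner, and the flag explanations are my reading of earlyoom's options, so double-check against man earlyoom on your system:

  # Install and enable earlyoom (available from EPEL on RHEL-family distros)
  dnf install -y earlyoom
  systemctl enable --now earlyoom

  # Disable swap now and keep it disabled across reboots
  swapoff -a
  sed -i '/\sswap\s/s/^/#/' /etc/fstab

  # /etc/sysconfig/earlyoom
  # -p       raise earlyoom's own scheduling priority so it stays responsive
  # -r 600   print a memory report every 600 seconds
  # -m 4,4   send SIGTERM, then SIGKILL, when available RAM drops below 4%
  # -s 1,1   the same thresholds for free swap
  EARLYOOM_ARGS="-p -r 600 -m 4,4 -s 1,1"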
[ceph-users] Re: MDS cache always increasing
Got it.

However, to narrow down the issue, I suggest that you test whether it still exists after the following changes:

1. Reduce max_mds to 1.
2. Do not reduce max_mds to 1, but migrate all clients from a direct CephFS mount to NFS.

On Sat, Aug 31, 2024 at 2:55 PM Sake Ceph wrote:
>
> I was talking about the hosts where the MDS containers are running on. The clients are all RHEL 9.
>
> Kind regards,
> Sake
>
> > Op 31-08-2024 08:34 CEST schreef Alexander Patrakov :
> >
> > Hello Sake,
> >
> > The combination of two active MDSs and RHEL8 does ring a bell, and I have seen this with Quincy, too. However, what's relevant is the kernel version on the clients. If they run the default 4.18.x kernel from RHEL8, please either upgrade to the mainline kernel or decrease max_mds to 1. If they run a modern kernel, then it is something I do not know about.
> >
> > On Sat, Aug 31, 2024 at 1:21 PM Sake Ceph wrote:
> > >
> > > @Anthony: it's a small virtualized cluster and indeed SWAP shouldn't be used, but this doesn't change the problem.
> > >
> > > @Alexander: the problem is in the active nodes, the standby replay don't have issues anymore.
> > >
> > > Last night's backup run increased the memory usage to 86% when rsync was running for app2. It dropped to 77,8% when it was done. When the rsync for app4 was running it increased to 84% and dropping to 80%. After a few hours it's now settled on 82%.
> > > It looks to me the MDS server is caching something forever while it isn't being used..
> > >
> > > The underlying host is running on RHEL 8. Upgrade to RHEL 9 is planned, but hit some issues with automatically upgrading hosts.
> > >
> > > Kind regards,
> > > Sake
> >
> > --
> > Alexander Patrakov

--
Alexander Patrakov
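A rough sketch of what the two suggested tests translate to on the command line; the filesystem name cephfs, the NFS cluster id mynfs, the host list and the paths are placeholders, and the nfs export syntax has changed between releases, so check the built-in help (ceph nfs export create cephfs -h) on your version:

  # Test 1: drop back to a single active MDS
  ceph fs set cephfs max_mds 1

  # Test 2: keep max_mds as is, but serve clients over NFS (Ganesha
  # managed by the orchestrator) instead of a direct CephFS mount
  ceph nfs cluster create mynfs "host1,host2"
  ceph nfs export create cephfs --cluster-id mynfs --pseudo-path /shared --fsname cephfs --path /
  # Clients then mount host1:/shared via NFS rather than mounting CephFS directly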