[ceph-users] Cluster network and public network

2020-05-08 Thread Nghia Viet Tran
Hi everyone, I have a question about the network setup. From the documentation, it’s recommended to have 2 NICs per host, as described in the picture below [Diagram] In the picture, OSD hosts connect to the cluster network for replication and heartbeats between OSDs, therefore we definitely need 2 NICs
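
For reference, the public/cluster split is configured entirely in ceph.conf; a minimal sketch, with placeholder subnets:

    [global]
    public_network  = 192.168.1.0/24    # clients, MON/MGR traffic
    cluster_network = 192.168.2.0/24    # OSD replication, recovery, heartbeats

If cluster_network is left out, OSDs simply carry replication traffic over the public network as well.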

[ceph-users] Re: cephfs change/migrate default data pool

2020-05-08 Thread Kenneth Waegeman
Hi Frank, Patrick, I tried to create a cephfs elsewhere and got the warning, that's what got my attention. Nothing is broken, but I thought it could have an impact on performance. Very good to know it's not that bad! Thanks!! Kenneth On 07/05/2020 23:31, Patrick Donnelly wrote: On Wed, Apr 2

[ceph-users] Re: Cluster network and public network

2020-05-08 Thread Martin Verges
Hello Nghia, just use one network interface card and run frontend and backend traffic on the same one. No problem with that. If you have a dual-port card, use both ports as an LACP channel and maybe separate the traffic using VLANs if you want to, but that's not required either. -- Martin Verges Managing director
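
As a rough illustration of the bonded setup described above (hypothetical interface names and addresses; persistent configuration depends on the distribution):

    # create an LACP (802.3ad) bond from the two ports
    ip link add bond0 type bond mode 802.3ad
    ip link set ens1f0 down; ip link set ens1f0 master bond0
    ip link set ens1f1 down; ip link set ens1f1 master bond0
    # optionally separate public and cluster traffic with VLANs on top of the bond
    ip link add link bond0 name bond0.100 type vlan id 100
    ip link add link bond0 name bond0.200 type vlan id 200
    ip addr add 192.168.1.11/24 dev bond0.100    # public network
    ip addr add 192.168.2.11/24 dev bond0.200    # cluster network
    ip link set bond0 up; ip link set bond0.100 up; ip link set bond0.200 up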

[ceph-users] Re: ceph-mgr high CPU utilization

2020-05-08 Thread Dan van der Ster
If an upmap is not stored, it means that OSDMap::check_pg_upmaps is deciding that those upmaps are invalid for some reason. Additional debugging can help sort out why. (Maybe you have a complex crush tree and the balancer is creating invalid upmaps). -- dan On Fri, May 1, 2020 at 2:48 PM Andras P
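
One way to inspect the stored upmaps and see which ones the validity check would cancel (assumes a Nautilus-or-later osdmaptool; file names are arbitrary):

    ceph osd dump | grep pg_upmap                            # upmaps currently stored in the OSDMap
    ceph osd getmap -o /tmp/osdmap                           # export the current map
    osdmaptool /tmp/osdmap --upmap-cleanup /tmp/cleanup.txt  # lists upmap entries that would be cancelled as invalid
    ceph config set mgr debug_mgr 10                         # temporarily raise mgr logging to see why upmaps are rejected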

[ceph-users] Re: Cluster network and public network

2020-05-08 Thread Nghia Viet Tran
Hi Martin, Thanks for your response. You mean one network interface for only the MON hosts, or for the whole cluster including OSD hosts? I’m confused now because there are some projects that only use one public network for the whole cluster. That means the rebalancing, replicating objects and hear

[ceph-users] Re: Cluster network and public network

2020-05-08 Thread Willi Schiegel
Hello Nghia, I once asked a similar question about network architecture and got the same answer from Wido den Hollander as Martin wrote: There is no need to have a public and cluster network with Ceph. Working as a Ceph consultant I've deployed multi-PB Ceph clusters with a single public netwo

[ceph-users] Nautilus cluster rados gateway not sharding bucket indexes

2020-05-08 Thread Marcel Kuiper
Hi, sorry for the repost, but I didn't get any response on my first post, so I will try to rephrase it. We have several Ceph clusters running Nautilus (coming from Mimic). On one of the clusters I got a health warning: HEALTH_WARN 1 large omap objects * when I check on the rados gateway
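
For anyone hitting the same warning, a rough set of commands for narrowing it down (bucket name and shard count are placeholders):

    ceph health detail                                    # shows which pool and object triggered the large omap warning
    radosgw-admin bucket limit check                      # per-bucket object counts versus shard limits
    radosgw-admin reshard list                            # buckets queued for dynamic resharding
    radosgw-admin bucket reshard --bucket=BUCKET --num-shards=NN   # manual reshard if dynamic resharding is not kicking in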

[ceph-users] Unit testing of CRUSH Algorithm

2020-05-08 Thread Bobby
Hi, are there any more unit-test resources for the CRUSH algorithm other than the test cases here: https://github.com/ceph/ceph/tree/master/src/test/crush Or would more unit testing of CRUSH beyond these test cases be overkill? BR Bobby
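
Besides the C++ unit tests, crushtool can exercise a map directly, which is often enough for rule-level testing; a small sketch loosely following the crushtool man page example:

    # build a throwaway map: 4 hosts with 4 OSDs each under one root
    crushtool -o crushmap --build --num_osds 16 host straw2 4 root straw2 0
    # map PGs with rule 0 at 3 replicas and report distribution statistics
    crushtool -i crushmap --test --rule 0 --num-rep 3 --show-statistics --show-bad-mappings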

[ceph-users] Re: Nautilus cluster rados gateway not sharding bucket indexes

2020-05-08 Thread Casey Bodley
Hi Marcel, What version of nautilus is this? It sounds similar to https://tracker.ceph.com/issues/43188, which should be fixed in 14.2.9. On Fri, May 8, 2020 at 6:18 AM Marcel Kuiper wrote: > > Hi > > Sorry for the repost, but I didn't get any response on my first post so I > will try to rephras
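
A quick way to answer the version question across all daemons (plain ceph CLI):

    ceph versions            # summary of running versions per daemon type
    ceph tell osd.* version  # per-OSD check if a partial upgrade is suspected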

[ceph-users] Re: Nautilus cluster rados gateway not sharding bucket indexes

2020-05-08 Thread Marcel Kuiper
Hi Casey, We're running 14.2.8. I will schedule an upgrade a.s.a.p. Thanks for your response Marcel > Hi Marcel, > > What version of nautilus is this? It sounds similar to > https://tracker.ceph.com/issues/43188, which should be fixed in > 14.2.9. > > > On Fri, May 8, 2020 at 6:18 AM Marcel Kuipe

[ceph-users] Re: ceph-mgr high CPU utilization

2020-05-08 Thread Andras Pataki
Hi Dan, You are absolutely right - it didn't occur to me to check whether the upmaps are correct, and indeed some of them are obviously not. I do have a more complex crush rule for our 6+3 EC pool (pool 9):
    step take root-disk
    step choose indep 3 type pod
    step choose i
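
For context, a 6+3 EC rule of that shape typically looks something like this (a sketch only, assuming 3 pods with at least 3 hosts each; not necessarily the actual rule in use):

    rule ec63-pod {
            id 9
            type erasure
            step set_chooseleaf_tries 5
            step set_choose_tries 100
            step take root-disk
            step choose indep 3 type pod
            step chooseleaf indep 3 type host
            step emit
    }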

[ceph-users] Re: Data loss by adding 2OSD causing Long heartbeat ping times

2020-05-08 Thread Frank Schilder
On all OSD nodes I'm using vm.min_free_kbytes = 4194304 (4GB). This was one of the first tunings on the cluster. Best regards, Frank Schilder, AIT Risø Campus, Bygning 109, rum S14 From: Anthony D'Atri Sent: 08 May 2020 10:17 To: Frank Sc
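
For reference, one way to apply that setting (value taken from the mail above; the sysctl.d file name is just a convention):

    sysctl -w vm.min_free_kbytes=4194304                                   # apply immediately
    echo 'vm.min_free_kbytes = 4194304' > /etc/sysctl.d/90-ceph-osd.conf   # persist across reboots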

[ceph-users] Cluster rename procedure

2020-05-08 Thread Anthony D'Atri
I’ve inherited a couple of clusters with non-default (i.e., not “ceph”) internal names, and I want to rename them for the usual reasons. I had previously developed a full list of steps - which I no longer have access to. Anyone done this recently? Want to be sure I’m not missing something. * N
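
For reference, the cluster name mainly appears in file and directory names, so the rename is largely a matter of moving things while everything is stopped; a rough, unverified sketch assuming the old name is "mycluster":

    OLD=mycluster
    mv /etc/ceph/$OLD.conf /etc/ceph/ceph.conf
    mv /etc/ceph/$OLD.client.admin.keyring /etc/ceph/ceph.client.admin.keyring
    # daemon data directories are named <cluster>-<id>
    for d in /var/lib/ceph/osd/$OLD-*; do mv "$d" "${d/$OLD-/ceph-}"; done
    for d in /var/lib/ceph/mon/$OLD-*; do mv "$d" "${d/$OLD-/ceph-}"; done
    # remove any CLUSTER=<name> override in /etc/default/ceph or /etc/sysconfig/ceph before restarting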

[ceph-users] Re: Cluster rename procedure

2020-05-08 Thread Brad Hubbard
Are they LVM based? The keyring files should be just the filenames, yes. Here's a recent list I saw which was missing the keyring step but is reported to be complete otherwise.
- Stop RGW services
- Set the flags (noout, norecover, norebalance, nobackfill, nodown, pause)
- Stop OSD/MGR/MON services
-
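
The flag-setting step from that list, spelled out with the plain ceph CLI (unset the same flags once the rename is done):

    for f in noout norecover norebalance nobackfill nodown pause; do
        ceph osd set $f
    done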