[ceph-users] Re: Is there a better way to make a samba/nfs gateway?

2020-03-12 Thread Rafael Lopez
Hi Seth, I don't know if this helps you, but I'll share what we do. We present a large amount of CephFS via NFS and SMB, plus a handful of direct CephFS clients, and rarely encounter issues with either the frontends or CephFS. However, the 'gateway' is multiple servers - we use 2x ganesha servers wi
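
For reference, a minimal NFS-Ganesha export block for a CephFS-backed gateway looks roughly like the sketch below (a sketch only: the export id, pseudo path and cephx user are placeholders, and ganesha is assumed to find the cluster via the usual ceph.conf/keyring on the gateway host):

  EXPORT {
      Export_ID = 1;
      Path = "/";                 # CephFS path to export
      Pseudo = "/cephfs";         # NFSv4 pseudo-root path seen by clients
      Access_Type = RW;
      Protocols = 4;
      Transports = TCP;
      Squash = No_Root_Squash;
      FSAL {
          Name = CEPH;            # libcephfs-backed FSAL
          User_Id = "ganesha";    # cephx user the gateway authenticates as
      }
  }

Samba can export the same tree either from a kernel CephFS mount or through its vfs_ceph module.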

[ceph-users] Re: Is there a better way to make a samba/nfs gateway?

2020-03-12 Thread Konstantin Shalygin
On 3/11/20 11:16 PM, Seth Galitzer wrote: I have a hybrid environment and need to share with both Linux and Windows clients. For my previous iterations of file storage, I exported nfs and samba shares directly from my monolithic file server. All Linux clients used nfs and all Windows clients

[ceph-users] Re: MGRs failing once per day and generally slow response times

2020-03-12 Thread Konstantin Shalygin
On 3/13/20 12:57 AM, Janek Bevendorff wrote: NTPd is running, all the nodes have the same time to the second. I don't think that is the problem. As always in such cases: try switching from ntpd to the default EL7 daemon, chronyd. k
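
A minimal sketch of making the switch Konstantin suggests on a stock CentOS 7 node and checking the result (package and service names are the EL7 defaults):

  yum install -y chrony
  systemctl disable --now ntpd
  systemctl enable --now chronyd
  chronyc tracking      # the "System time" line shows the current offset
  chronyc sources -v    # reachability and offsets of the configured time sources

Mon clock-skew warnings only clear once the offset is well below mon_clock_drift_allowed (0.05 s by default), so "same time to the second" is not precise enough to rule this out.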

[ceph-users] Ceph storage distribution between pools

2020-03-12 Thread alexander . v . litvak
I have a small cluster with a single crush map. I use 3 pools: one (OpenNebula VMs on rbd), plus cephfs_data and cephfs_metadata for CephFS. Here is my ceph df:
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    ssd       94 TiB      78 TiB      17 TiB      17 TiB
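
To see how that raw capacity is split between the pools (and to cap a pool if needed), the per-pool view helps; a quick sketch, where the pool name and quota value are only examples:

  ceph df detail       # STORED / USED / MAX AVAIL per pool
  rados df             # per-pool object counts and space usage
  ceph osd pool set-quota one max_bytes 20000000000000   # optional hard cap on the rbd pool

Since all three pools sit on the same ssd class, each pool's MAX AVAIL is derived from the same remaining raw space divided by that pool's replication factor (or EC overhead).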

[ceph-users] preventing the spreading of corona virus on ceph.io

2020-03-12 Thread Marc Roos
Since the spread of the coronavirus is taking on such drastic proportions that flights between Europe and the US are being halted, I would suggest we show some support and temporarily use only :) in the mailing list, and not :D

[ceph-users] Re: centos7 / nautilus where to get kernel 5.5 from?

2020-03-12 Thread Marc Roos
With the default Red Hat kernel I am getting these[1] firmware/microcode updates etc. Does the elrepo kernel supply these as well? Is it as 'secure' as the el6/el7/el8 kernel? [1] microcode_ctl Jan 12 18:49:14 c01 journal: This updated microcode supersedes microcode provided by Red Hat with the CVE-2017-5715
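
Independent of which kernel is booted, the loaded microcode and the resulting mitigations can be checked directly; a quick sketch (standard sysfs/procfs paths):

  grep -m1 microcode /proc/cpuinfo                   # currently loaded microcode revision
  rpm -q microcode_ctl linux-firmware                # packages shipping early microcode on EL7
  grep . /sys/devices/system/cpu/vulnerabilities/*   # per-CVE mitigation status

The microcode itself comes from microcode_ctl / linux-firmware (or the BIOS), not from the kernel package, so switching to an elrepo kernel does not by itself change the [1] updates above.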

[ceph-users] Re: MGRs failing once per day and generally slow response times

2020-03-12 Thread Janne Johansson
On Thu, 12 Mar 2020 at 18:58, Janek Bevendorff <janek.bevendo...@uni-weimar.de> wrote: > Hi Caspar, > > NTPd is running, all the nodes have the same time to the second. I don't > think that is the problem. > Mons want < 50ms precision, so "to the second" is a bit too vague perhaps. -- May the

[ceph-users] Re: Single machine / multiple monitors

2020-03-12 Thread Brian Topping
Ok, I think that answers my question then, thanks! Too risky to be playing with patterns that will get increasingly difficult to support over time. > On Mar 12, 2020, at 12:48 PM, Anthony D'Atri wrote: > > They won’t be AFAIK. Few people ever did this. > >> On Mar 12, 2020, at 11:08 AM, Brian

[ceph-users] Re: HELP! Ceph (v14.2.8) bucket notification does not work!

2020-03-12 Thread 曹 海旺
I think it is a bug. I reinstalled the cluster. The response to create topic is still 405 MethodNotAllowed, anyone know why? Thank you very much! On 12 Mar 2020, at 6:53 PM, 曹 海旺 <caohaiw...@hotmail.com> wrote: Hi, I upgraded Ceph from 14.2.7 to the new version 14.2.8. The bucket notificati

[ceph-users] Re: Single machine / multiple monitors

2020-03-12 Thread Brian Topping
If the ceph roadmap is getting rid of named clusters, how will multiple clusters be supported? How (for instance) would `/var/lib/ceph/mon/{name}` directories be resolved? > On Mar 11, 2020, at 8:29 PM, Brian Topping wrote: > >> On Mar 11, 2020, at 7:59 PM, Anthony D'Atri wrote: >> >>> This

[ceph-users] Re: Cluster blacklists MDS, can't start

2020-03-12 Thread Robert LeBlanc
I tried both several times. It looks like it just had to read through the entire journal. I wish there were more notification of journal reading progress at debug levels below 10, because 10 is way too noisy. That could give us an idea of how much longer there is left to go. It seems that the
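
One way to at least estimate how much journal is left to read is to compare the journal's write position with its expire position; a sketch (the filesystem name is a placeholder, run it against the rank that is replaying):

  cephfs-journal-tool --rank=<fsname>:0 header get   # compare write_pos and expire_pos
  ceph daemon mds.<name> status                      # current state, e.g. up:replay
  ceph fs status

The gap between expire_pos and write_pos is roughly the number of journal bytes the MDS still has to work through.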

[ceph-users] Re: MGRs failing once per day and generally slow response times

2020-03-12 Thread Janek Bevendorff
Hi Caspar, NTPd is running, all the nodes have the same time to the second. I don't think that is the problem. Janek On 12/03/2020 12:02, Caspar Smit wrote: Janek, This error already should have put you in the right direction: "possible clock skew" Probably the date/times are too far apa

[ceph-users] Re: RGWReshardLock::lock failed to acquire lock ret=-16

2020-03-12 Thread Josh Haft
Any thoughts on this? We just experienced this again last night. Our 3 RGW servers had issues servicing requests for approx 7 minutes while this reshard happened. Our users received 5xx errors from haproxy which fronts the RGW instances. Haproxy is configured with a backend server timeout of 60 sec
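
A sketch of the usual checks when a reshard stalls requests (bucket name is a placeholder):

  radosgw-admin reshard list                        # queued/ongoing reshard jobs
  radosgw-admin reshard status --bucket=<bucket>
  radosgw-admin reshard cancel --bucket=<bucket>    # drop a queued job
  radosgw-admin bucket reshard --bucket=<bucket> --num-shards=<n>   # reshard manually in a quiet window

If the automatic behaviour is the problem, dynamic resharding can be turned off with rgw_dynamic_resharding = false and done manually instead.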

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-03-12 Thread Nikola Ciprich
Hi Dan, nope, osdmap_first_committed is still 1, it must be some different issue.. I'll report when I have something.. n. On Thu, Mar 12, 2020 at 04:07:26PM +0100, Dan van der Ster wrote: > You have to wait 5 minutes or so after restarting the mon before it > starts trimming. > Otherwise, hmm,

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-03-12 Thread Dan van der Ster
You have to wait 5 minutes or so after restarting the mon before it starts trimming. Otherwise, hmm, I'm not sure. -- dan On Thu, Mar 12, 2020 at 3:55 PM Nikola Ciprich wrote: > > Hi Dan, > > # ceph report 2>/dev/null | jq .osdmap_first_committed > 1 > # ceph report 2>/dev/null | jq .osdmap_last

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-03-12 Thread Nikola Ciprich
Hi Dan, # ceph report 2>/dev/null | jq .osdmap_first_committed 1 # ceph report 2>/dev/null | jq .osdmap_last_committed 4646 seems like osdmap_first_committed doesn't change at all, restarting mons doesn't help.. I don't have any down OSDs, everything seems to be healthy.. BR nik On Thu, Mar 1

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-03-12 Thread Dan van der Ster
If untrimmed osdmaps are related, then you should check: https://tracker.ceph.com/issues/37875, particularly #note6 You can see what the mon thinks the valid range of osdmaps is: # ceph report | jq .osdmap_first_committed 113300 # ceph report | jq .osdmap_last_committed 113938 Then the workaround
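
Besides the mon-side range, each OSD reports the range of maps it still holds, which makes the backlog visible per daemon; a quick sketch (osd id is a placeholder):

  ceph report 2>/dev/null | jq .osdmap_first_committed,.osdmap_last_committed
  ceph daemon osd.<id> status    # includes oldest_map / newest_map for this OSD

A healthy mon normally trims down to a few hundred committed epochs; thousands of untrimmed epochs are a sign of the trimming problem described in the tracker issue.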

[ceph-users] IPv6 connectivity gone for Ceph Telemetry

2020-03-12 Thread Wido den Hollander
Hi, I was just checking on a few (13) IPv6-only Ceph clusters and I noticed that they couldn't send their Telemetry data anymore: telemetry.ceph.com has address 8.43.84.137. This server used to have dual-stack connectivity while it was still hosted at OVH. It seems to have moved to Red Hat, but
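
For anyone who wants to verify this from a v6-only cluster, a quick sketch of the checks (plain DNS/HTTP tools plus the mgr module status):

  host -t AAAA telemetry.ceph.com           # is there any AAAA record at all?
  curl -6 -sv https://telemetry.ceph.com/ -o /dev/null
  ceph telemetry status                     # whether the module is on and which URL it posts to

Until the endpoint is reachable over IPv6 again, these clusters will keep failing to submit their reports.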

[ceph-users] EC pool 4+2 - failed to guarantee a failure domain

2020-03-12 Thread Maks Kowalik
Hello, I have created a small 16-PG EC pool with k=4, m=2. Then I applied the following crush rule to it: rule test_ec { id 99 type erasure min_size 5 max_size 6 step set_chooseleaf_tries 5 step set_choose_tries 100 step take default step choose indep 3
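
For comparison, the usual way to force 6 shards (k=4, m=2) onto exactly 3 hosts with 2 OSDs each is a two-step choose/chooseleaf; a sketch (rule name/id and the 'default' root are placeholders):

  rule ec_4_2_over_3_hosts {
      id 99
      type erasure
      min_size 5
      max_size 6
      step set_chooseleaf_tries 5
      step set_choose_tries 100
      step take default
      step choose indep 3 type host
      step chooseleaf indep 2 type osd
      step emit
  }

The explicit "choose indep 3 type host" followed by "chooseleaf indep 2 type osd" is what pins the per-host limit; without that second step the placement of the 6 shards is not constrained to 2 per host.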

[ceph-users] Re: MGRs failing once per day and generally slow response times

2020-03-12 Thread Caspar Smit
Janek, This error should already have put you in the right direction: "possible clock skew". Probably the date/times are too far apart on your nodes. Make sure all your nodes are time-synced using NTP. Kind regards, Caspar On Wed, 11 Mar 2020 at 09:47, Janek Bevendorff <janek.bevendo...@u

[ceph-users] HELP! Ceph (v14.2.8) bucket notification does not work!

2020-03-12 Thread 曹 海旺
Hi, I upgraded Ceph from 14.2.7 to the new version 14.2.8. Bucket notification does not work: I can't create a TOPIC. I use Postman to send a POST following https://docs.ceph.com/docs/master/radosgw/notifications/#create-a-topic REQUEST: POST http://rgw1:7480/?Action=CreateTopi
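
For what it's worth, the documented request from the linked page looks roughly like the sketch below; topic name, endpoint and attribute values are placeholders, and the request must carry a valid AWS (v2/v4) signature for an RGW user, which Postman only adds if AWS auth is configured for the request:

  POST http://rgw1:7480/?Action=CreateTopic&Name=<topic>
       &Attributes.entry.1.key=push-endpoint
       &Attributes.entry.1.value=<scheme>://<endpoint-host>:<port>

  radosgw-admin topic list    # verify the topic once creation succeeds (if your build has the topic subcommands)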

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-03-12 Thread Nikola Ciprich
OK, so I can confirm that at least in my case, the problem is caused by old osd maps not being pruned for some reason, and thus not fitting into cache. When I increased osd map cache to 5000 the problem is gone. The question is why they're not being pruned, even though the cluster is in healthy s
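
For reference, on Nautilus that knob can be changed through the config database and the effect watched via the perf counters; a sketch (osd id is a placeholder, 5000 is simply the value that worked here):

  ceph config set osd osd_map_cache_size 5000
  ceph config get osd osd_map_cache_size
  ceph daemon osd.<id> perf dump | grep -A1 osd_map_cache   # hit/miss counters

This only works around the symptom; the real question remains why the maps are not being trimmed in the first place.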

[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-12 Thread Hartwig Hauschild
On 12.03.2020, Wido den Hollander wrote: > > > On 3/12/20 7:44 AM, Hartwig Hauschild wrote: > > On 10.03.2020, Wido den Hollander wrote: > >> > >> > >> On 3/10/20 10:48 AM, Hartwig Hauschild wrote: > >>> Hi, > >>> > >>> I've done a bit more testing ... > >>> > >>> On 05.03.2020, Hartwig

[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-12 Thread Hartwig Hauschild
On 12.03.2020, XuYun wrote: > We got the same problem today while we were adding memory to OSD nodes, > and it decreased the monitor’s performance a lot. I noticed that the db kept > increasing after an OSD is shut down, so I guess that it is caused by the > warning reports collected by the mgr insights mo

[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-12 Thread Wido den Hollander
On 3/12/20 7:44 AM, Hartwig Hauschild wrote: > On 10.03.2020, Wido den Hollander wrote: >> >> >> On 3/10/20 10:48 AM, Hartwig Hauschild wrote: >>> Hi, >>> >>> I've done a bit more testing ... >>> >>> On 05.03.2020, Hartwig Hauschild wrote: Hi, > [ snipped ] >>> I've read somewhere

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-03-12 Thread Nikola Ciprich
Hi Paul and others, while digging deeper, I noticed that when the cluster gets into this state, osd_map_cache_miss on OSDs starts growing rapidly.. even when I increased osd map cache size to 500 (which was the default at least for luminous) it behaves the same.. I think this could be related..

[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-12 Thread XuYun
We got the same problem today while we were adding memory to OSD nodes, and it decreased the monitor’s performance a lot. I noticed that the db kept increasing after an OSD is shut down, so I guess that it is caused by the warning reports collected by the mgr insights module. When I disabled the mgr insi
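
A sketch of what to look at in this situation (mon id and paths are the EL7 defaults):

  du -sh /var/lib/ceph/mon/*/store.db       # watch the mon store grow
  ceph mgr module disable insights          # stop the insights module from accumulating reports
  ceph insights prune-health 0              # or prune its stored health history (hours to keep)
  ceph tell mon.<id> compact                # compact the store once the cluster is healthy again

The insights module keeps health/crash history in the mon KV store, which is why disabling or pruning it can shrink store.db noticeably.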

[ceph-users] Re: Cluster blacklists MDS, can't start

2020-03-12 Thread Yan, Zheng
On Thu, Mar 12, 2020 at 1:41 PM Robert LeBlanc wrote: > > This is the second time this has happened in a couple of weeks. The MDS locks > up and the standby can't take over, so the Monitors blacklist them. I try > to un-blacklist them, but they still say this in the logs > > mds.0.1184394 waiting for
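
When this happens, the relevant state can be inspected (and, carefully, cleared) as in the sketch below; the address to remove is whatever the blacklist listing prints for the MDS:

  ceph osd blacklist ls
  ceph osd blacklist rm <addr:port/nonce>
  ceph fs status
  ceph daemon mds.<name> status    # e.g. up:replay / up:rejoin while the journal is read

Blindly removing blacklist entries for an MDS the mons already failed can be risky; usually it is better to let the standby finish replay, however slow that is.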