Hi Seth,
I don't know if this helps you, but I'll share what we do. We present a
large amount of CephFS storage over NFS and SMB, plus a handful of direct
CephFS clients, and rarely encounter issues with either the frontends or
CephFS itself. However, the 'gateway' is multiple servers - we use 2x ganesha servers wi
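For context, a minimal sketch of what such a ganesha export for CephFS can look like (assuming nfs-ganesha built with FSAL_CEPH; the export ID, pseudo path, and cephx user are illustrative):

EXPORT {
    Export_Id = 1;
    Path = "/";
    Pseudo = "/cephfs";
    Access_Type = RW;
    FSAL {
        Name = CEPH;
        User_Id = "nfs.gateway";  # illustrative cephx user
    }
}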
On 3/11/20 11:16 PM, Seth Galitzer wrote:
I have a hybrid environment and need to share with both Linux and
Windows clients. For my previous iterations of file storage, I
exported nfs and samba shares directly from my monolithic file server.
All Linux clients used nfs and all Windows clients
On 3/13/20 12:57 AM, Janek Bevendorff wrote:
NTPd is running, all the nodes have the same time to the second. I
don't think that is the problem.
As always in such cases, try switching your ntpd to the default EL7 daemon,
chronyd.
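Roughly, on a stock EL7 box (a sketch, not a full migration guide):

# systemctl stop ntpd && systemctl disable ntpd
# yum install -y chrony
# systemctl enable chronyd && systemctl start chronyd
# chronyc tracking

The last command verifies the measured offset against the time sources.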
k
I have a small cluster with a single crush map. I use 3 pools: one for OpenNebula
VMs on rbd, plus cephfs_data and cephfs_metadata for CephFS. Here is my ceph df:
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 94 TiB 78 TiB 17 TiB 17 TiB
Since the spread of the coronavirus is taking such drastic proportions
that flights between Europe and the US are being halted, I would suggest
we show some support and temporarily use only :) and not :D on the
mailing list.
With the default Red Hat kernel I am getting these[1] firmware updates
etc. Does the elrepo kernel supply these as well? Is it as 'secure' as the
el6/el7/el8 kernel?
[1]
microcode_ctl
Jan 12 18:49:14 c01 journal: This updated microcode supersedes microcode
provided by Red Hat with the CVE-2017-5715
On Thu, 12 Mar 2020 at 18:58, Janek Bevendorff <
janek.bevendo...@uni-weimar.de> wrote:
> Hi Caspar,
>
> NTPd is running, all the nodes have the same time to the second. I don't
> think that is the problem.
>
Mons want < 50ms precision, so "to the second" is a bit too vague perhaps.
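If in doubt, the mons report the skew they actually measure; the second command assumes Mimic or later with the config database:

# ceph time-sync-status
# ceph config get mon mon_clock_drift_allowed

The default mon_clock_drift_allowed is 0.05 s, which is where the < 50ms figure comes from.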
Ok, I think that answers my question then, thanks! Too risky to be playing with
patterns that will get increasingly difficult to support over time.
> On Mar 12, 2020, at 12:48 PM, Anthony D'Atri wrote:
>
> They won’t be AFAIK. Few people ever did this.
>
>> On Mar 12, 2020, at 11:08 AM, Brian
I think it is a bug. I reinstalled the cluster. The response to create topic
is still 405 MethodNotAllowed. Anyone know why? Thank you very much!
On Mar 12, 2020, at 6:53 PM, 曹 海旺 <caohaiw...@hotmail.com>
wrote:
Hi,
I upgraded Ceph from 14.2.7 to the new version 14.2.8. The bucket
notificati
If the ceph roadmap is getting rid of named clusters, how will multiple
clusters be supported? How (for instance) would `/var/lib/ceph/mon/{name}`
directories be resolved?
> On Mar 11, 2020, at 8:29 PM, Brian Topping wrote:
>
>> On Mar 11, 2020, at 7:59 PM, Anthony D'Atri wrote:
>>
>>> This
I tried both several times. It looks like it just had to read through the
entire journal. I wish there were more notification of journal-reading
progress at debug levels below 10, because 10 is way too noisy. That
could give us an idea of how much longer there is left to go. It seems that
the
Hi Caspar,
NTPd is running, all the nodes have the same time to the second. I don't
think that is the problem.
Janek
On 12/03/2020 12:02, Caspar Smit wrote:
Janek,
This error should already have put you in the right direction:
"possible clock skew"
Probably the date/times are too far apa
Any thoughts on this? We just experienced this again last night. Our 3
RGW servers had issues servicing requests for approx 7 minutes while
this reshard happened. Our users received 5xx errors from haproxy,
which fronts the RGW instances. Haproxy is configured with a backend
server timeout of 60 sec
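For anyone else hitting this, the resharder's state can at least be inspected, and some operators pre-reshard large buckets manually during a quiet window (bucket name and shard count below are illustrative):

# radosgw-admin reshard list
# radosgw-admin bucket limit check
# radosgw-admin bucket reshard --bucket=mybucket --num-shards=128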
Hi Dan,
nope, osdmap_first_committed is still 1, it must be some different issue..
I'll report when I have something..
n.
On Thu, Mar 12, 2020 at 04:07:26PM +0100, Dan van der Ster wrote:
> You have to wait 5 minutes or so after restarting the mon before it
> starts trimming.
> Otherwise, hmm,
You have to wait 5 minutes or so after restarting the mon before it
starts trimming.
Otherwise, hmm, I'm not sure.
-- dan
On Thu, Mar 12, 2020 at 3:55 PM Nikola Ciprich
wrote:
>
> Hi Dan,
>
> # ceph report 2>/dev/null | jq .osdmap_first_committed
> 1
> # ceph report 2>/dev/null | jq .osdmap_last
Hi Dan,
# ceph report 2>/dev/null | jq .osdmap_first_committed
1
# ceph report 2>/dev/null | jq .osdmap_last_committed
4646
seems like osdmap_first_committed doesn't change at all; restarting mons
doesn't help.. I don't have any down OSDs, everything seems to be healthy..
BR
nik
On Thu, Mar 1
If untrimmed osdmaps are related, then you should check:
https://tracker.ceph.com/issues/37875, particularly #note6
You can see what the mon thinks the valid range of osdmaps is:
# ceph report | jq .osdmap_first_committed
113300
# ceph report | jq .osdmap_last_committed
113938
Then the workaround
Hi,
I was just checking on a few (13) IPv6-only Ceph clusters and noticed
that they couldn't send their telemetry data anymore:
telemetry.ceph.com has address 8.43.84.137
This server used to have dual-stack connectivity while it was still
hosted at OVH.
It seems to have moved to Red Hat, but
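A quick check from an affected node (assuming standard DNS and curl tooling; the missing AAAA record is the symptom):

# host -t AAAA telemetry.ceph.com
# curl -6 -sv https://telemetry.ceph.com/ -o /dev/null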
Hello,
I have created a small 16pg EC pool with k=4, m=2.
Then I applied the following crush rule to it:

rule test_ec {
        id 99
        type erasure
        min_size 5
        max_size 6
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 3
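A way to check offline what a rule like this actually maps to (crush.bin and crush.txt are just scratch file names):

# ceph osd getcrushmap -o crush.bin
# crushtool -d crush.bin -o crush.txt
  (edit the rule in crush.txt, then)
# crushtool -c crush.txt -o crush.new
# crushtool -i crush.new --test --rule 99 --num-rep 6 --show-mappings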
Janek,
This error should already have put you in the right direction:
"possible clock skew"
Probably the date/times are too far apart on your nodes.
Make sure all your nodes are time-synced using NTP.
Kind regards,
Caspar
On Wed, 11 Mar 2020 at 09:47, Janek Bevendorff <
janek.bevendo...@u
Hi,
I upgraded Ceph from 14.2.7 to the new version 14.2.8. Bucket
notifications do not work.
I can't create a topic.
I used Postman to send a POST request following
https://docs.ceph.com/docs/master/radosgw/notifications/#create-a-topic
REQUEST:
POST
http://rgw1:7480/?Action=CreateTopi
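For comparison, the same request from the command line; this is a sketch and assumes RGW accepts the SNS-style call with your credentials:

# curl -X POST "http://rgw1:7480/?Action=CreateTopic&Name=mytopic"

RGW may insist on an AWS v4 signature, in which case the AWS CLI pointed at the RGW endpoint is easier:

# aws --endpoint-url http://rgw1:7480 sns create-topic --name mytopic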
OK,
so I can confirm that at least in my case, the problem is caused
by old osdmaps not being pruned for some reason, and thus not fitting
into the cache. When I increased the osd map cache to 5000, the problem was gone.
The question is why they're not being pruned, even though the cluster is in a
healthy s
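For anyone wanting to try the same: on Mimic or later the cache size can be changed via the config database (on Luminous it goes into ceph.conf and needs an OSD restart):

# ceph config set osd osd_map_cache_size 5000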
On 12.03.2020, Wido den Hollander wrote:
>
>
> On 3/12/20 7:44 AM, Hartwig Hauschild wrote:
> > On 10.03.2020, Wido den Hollander wrote:
> >>
> >>
> >> On 3/10/20 10:48 AM, Hartwig Hauschild wrote:
> >>> Hi,
> >>>
> >>> I've done a bit more testing ...
> >>>
> >>> On 05.03.2020, Hartwig
On 12.03.2020, XuYun wrote:
> We got the same problem today while we were adding memory to OSD nodes,
> and it decreased the monitor's performance a lot. I noticed that the db kept
> increasing after an OSD was shut down, so I guess it is caused by the
> warning reports collected by the mgr insights mo
On 3/12/20 7:44 AM, Hartwig Hauschild wrote:
> On 10.03.2020, Wido den Hollander wrote:
>>
>>
>> On 3/10/20 10:48 AM, Hartwig Hauschild wrote:
>>> Hi,
>>>
>>> I've done a bit more testing ...
>>>
>>> On 05.03.2020, Hartwig Hauschild wrote:
Hi,
> [ snipped ]
>>> I've read somewhere
Hi Paul and others,
while digging deeper, I noticed that when the cluster gets into this
state, osd_map_cache_miss on the OSDs starts growing rapidly. Even when
I increased the osd map cache size to 500 (which was the default at least
for Luminous), it behaves the same..
I think this could be related..
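The counter can be watched per OSD via the admin socket, e.g. (assuming jq is available):

# ceph daemon osd.0 perf dump | jq .osd.osd_map_cache_miss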
We got the same problem today while we were adding memory to OSD nodes, and it
decreased the monitor's performance a lot. I noticed that the db kept increasing
after an OSD was shut down, so I guess it is caused by the warning reports
collected by the mgr insights module. When I disabled the mgr insi
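If the insights module is indeed the culprit, it can be toggled off (assuming a Nautilus-era mgr where insights ships as a module):

# ceph mgr module disable insights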
On Thu, Mar 12, 2020 at 1:41 PM Robert LeBlanc wrote:
>
> This is the second time this has happened in a couple of weeks. The MDS locks
> up and the standby can't take over, so the monitors blacklist them. I try
> to un-blacklist them, but they still say this in the logs
>
> mds.0.1184394 waiting for
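For reference, blacklist entries can be listed and removed like this (pre-Octopus command names; the address is whatever 'ls' prints for the stuck MDS):

# ceph osd blacklist ls
# ceph osd blacklist rm <addr:port/nonce>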