On Sat, Feb 9, 2019 at 12:36 AM Jake Grimmett wrote:
>
> Dear All,
>
> Unfortunately the MDS has crashed on our Mimic cluster...
>
> First symptoms were rsync giving:
> "No space left on device (28)"
> when trying to rename or delete
>
> This prompted me to try restarting the MDS, as it reported l
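A minimal sketch of the first checks this kind of ENOSPC symptom usually calls for, assuming admin access on the MDS host (mds.<name> is a placeholder for the active daemon):

  ceph health detail                    # look for MDS_* warnings such as an oversized cache
  ceph fs status                        # which MDS is active, and in what state
  ceph daemon mds.<name> cache status   # MDS cache usage via the admin socket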
On Sat, Feb 9, 2019 at 8:10 AM Hector Martin wrote:
>
> Hi list,
>
> As I understand it, CephFS implements hard links as effectively "smart
> soft links", where one link is the primary for the inode and the others
> reference it. When it comes to directories, the size for a
> hardlinke
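A generic POSIX illustration (not CephFS-specific, and the paths are made up) of the point that all hard links share one inode:

  mkdir -p /mnt/cephfs/dir
  touch /mnt/cephfs/a
  ln /mnt/cephfs/a /mnt/cephfs/dir/b
  ls -i /mnt/cephfs/a /mnt/cephfs/dir/b   # both names report the same inode number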
Hi Jason,
that's been very helpful, but it got me thinking and looking.
The pool name is both inside the libvirt.xml (and the running KVM config)
and cached in the Nova database. Changing it would require a
detach/attach, which may not be viable or easy, especially for boot
volumes.
What abo
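A hedged way to see where the pool name is baked into the running guest, assuming access to the compute node (<domain> is the instance's libvirt name):

  virsh dumpxml <domain> | grep -A2 "protocol='rbd'"
  # the pool shows up in the <source> element, typically as name='<pool>/volume-<uuid>'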
Hi,
As http://docs.ceph.com/docs/master/cephfs/nfs/ says, it's OK to
configure active/passive NFS-Ganesha to use CephFS. My question is whether we
can use active/active NFS-Ganesha for CephFS.
In my view, state consistency is the only thing we need to think about.
1. Lock support for Active/Active. Even each nfs-gane
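For reference, a minimal, untested sketch of a single FSAL_CEPH export in ganesha.conf (IDs and paths are placeholders); it does not by itself answer the active/active lock-state question:

  EXPORT {
      Export_ID = 100;          # arbitrary, but must be unique per export
      Path = "/";               # path inside CephFS
      Pseudo = "/cephfs";       # NFSv4 pseudo path the clients mount
      Access_Type = RW;
      FSAL {
          Name = CEPH;          # libcephfs-backed FSAL
      }
  }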
On 2/8/2019 6:57 PM, Alexandre DERUMIER wrote:
another mempool dump after 1h run. (latency ok)
Biggest difference:
before restart
-
"bluestore_cache_other": {
"items": 48661920,
"bytes": 1539544228
},
"bluestore_cache_data": {
"items": 54,
"bytes": 643072
},
(other caches seem to b
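In case anyone wants to compare: assuming these dumps came from the admin socket, the same output should be reproducible with (<id> is the OSD id):

  ceph daemon osd.<id> dump_mempools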
Hello Ashley,
On 9 February 2019 17:30:31 CET, Ashley Merrick wrote:
>What does the output of apt-get update look like on one of the nodes?
>
>You can just list the lines that mention CEPH
>
... .. .
Get:6 https://download.ceph.com/debian-luminous bionic InRelease [8393 B]
... .. .
The Last a
Hi Zheng,
Many, many thanks for your help...
Your suggestion of setting large values for mds_cache_size and
mds_cache_memory_limit stopped our MDS crashing :)
The values in ceph.conf are now:
mds_cache_size = 8589934592
mds_cache_memory_limit = 17179869184
Should these values be left in our co
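A hedged aside: instead of keeping them only in ceph.conf, Mimic's centralized config store should also accept these at runtime (values below are the ones quoted above; mds.<name> is a placeholder):

  ceph config set mds mds_cache_memory_limit 17179869184
  # or push to a single running daemon:
  ceph tell mds.<name> injectargs '--mds_cache_memory_limit=17179869184'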
Hi Zheng,
Sorry - I've just re-read your email and saw your instruction to restore
mds_cache_size and mds_cache_memory_limit to their original values if the
MDS does not crash - I have now done this...
thanks again for your help,
best regards,
Jake
On 2/11/19 12:01 PM, Jake Grimmett wrote:
> Hi
On Mon, Feb 11, 2019 at 8:01 PM Jake Grimmett wrote:
>
> Hi Zheng,
>
> Many, many thanks for your help...
>
> Your suggestion of setting large values for mds_cache_size and
> mds_cache_memory_limit stopped our MDS crashing :)
>
> The values in ceph.conf are now:
>
> mds_cache_size = 8589934592
> m
On Mon, Feb 11, 2019 at 4:53 AM Luis Periquito wrote:
>
> Hi Jason,
>
> that's been very helpful, but it got me thinking and looking.
>
> The pool name is both inside the libvirt.xml (and the running KVM config)
> and cached in the Nova database. Changing it would require a
> detach/attach w
Hi,
As 12.2.11 has been out for some days and no panic mails have shown up on the list, I was
planning to update too.
I know there is a recommended order in which to update/upgrade the cluster, but
I don’t know how the rpm packages handle restarting services after a yum
update. E.g. when MDS and MONs are
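For what it's worth, a hedged sketch of the usual manual sequence once the packages are updated, assuming systemd-managed daemons and that the packages themselves do not restart anything:

  systemctl restart ceph-mon.target    # monitors first
  systemctl restart ceph-osd.target    # then OSDs
  systemctl restart ceph-mds.target    # MDS (and RGW) last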
You can't tell from the client log here, but probably the MDS itself was
failing over to a new instance during that interval. There's not much
experience with it, but you could experiment with faster failover by
reducing the mds beacon and grace times. This may or may not work
reliably...
On Sat,
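If anyone wants to experiment, a hedged example of the knobs meant here (the defaults are roughly a 4s beacon and a 15s grace; shrinking them speeds up failover detection but risks spurious failovers). On Mimic or newer this can go through the config store; on Luminous the same options would go in ceph.conf:

  ceph config set global mds_beacon_interval 2
  ceph config set global mds_beacon_grace 10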
On Thu, Feb 7, 2019 at 3:31 AM Hector Martin wrote:
> On 07/02/2019 19:47, Marc Roos wrote:
> >
> > Is this difference not related to caching? And are you filling up some
> > cache/queue at some point? If you do a sync after each write, do you
> > still have the same results?
>
> No, the slow operat
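One hypothetical way to take the page cache out of the comparison (path and sizes are made up):

  dd if=/dev/zero of=/mnt/cephfs/testfile bs=4M count=256 oflag=direct conv=fsync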
On Mon, Feb 11, 2019 at 12:10 PM Götz Reinicke
wrote:
> As 12.2.11 has been out for some days and no panic mails have shown up on the list, I
> was planning to update too.
>
> I know there is a recommended order in which to update/upgrade the cluster,
> but I don’t know how the rpm packages handle restarti
Hi all,
Tested 4 cases. Cases 1-3 are as expected, while for case 4 the rebuild didn’t
take place in the surviving room, as Gregory mentioned. Repeating case 4 several
times on both rooms gave the same result. We’re running Mimic 13.2.2.
E.g.
Room1
Host 1 osd: 2,5
Host 2 osd: 1,3
Room 2 <-- failed roo
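For completeness, a hedged set of checks to confirm the map really splits copies across rooms (the pool name is a placeholder):

  ceph osd tree                          # should show root -> room1/room2 -> hosts -> OSDs
  ceph osd pool get <pool> crush_rule    # which rule the pool uses
  ceph osd crush rule dump               # the rule should chooseleaf on type room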
Thanks a lot Brad !
The problem is indeed in the network: we moved the OSD nodes back to the
"old" switches and the problem disappeared.
Now we have to figure out what is wrong/misconfigured with the new switch:
we would try to replicate the problem, possibly without a ceph deployment
...
Thanks
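A hedged sketch of a Ceph-free reproduction between two OSD hosts on the new switch (hostnames are placeholders; the MTU test only matters if jumbo frames are configured):

  iperf3 -s                        # on host A
  iperf3 -c <hostA> -P 4 -t 60     # on host B: sustained parallel TCP streams
  ping -M do -s 8972 <hostA>       # from host B: path MTU check for 9000-byte frames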
Glad to help!
On Tue, Feb 12, 2019 at 4:55 PM Massimo Sgaravatto
wrote:
>
> Thanks a lot Brad !
>
> The problem is indeed in the network: we moved the OSD nodes back to the
> "old" switches and the problem disappeared.
>
> Now we have to figure out what is wrong/misconfigured with the new switch