On 15 June 2015 at 13:09, Gregory Farnum <g...@gregs42.com> wrote:

> On Mon, Jun 15, 2015 at 4:03 AM, Roland Giesler <rol...@giesler.za.net>
> wrote:
> > I have a small cluster of 4 machines and quite a few drives.  After
> > about 2-3 weeks cephfs fails.  It's not properly mounted anymore in
> > /mnt/cephfs, which of course causes the VMs running on it to fail too.
> >
> > In /var/log/syslog I have "/mnt/cephfs: File exists at
> > /usr/share/perl5/PVE/Storage/DirPlugin.pm line 52" repeatedly.
> >
> > There doesn't seem to be anything wrong with ceph at the time.
> >
> > # ceph -s
> >     cluster 40f26838-4760-4b10-a65c-b9c1cd671f2f
> >      health HEALTH_WARN clock skew detected on mon.s1
> >      monmap e2: 2 mons at
> > {h1=192.168.121.30:6789/0,s1=192.168.121.33:6789/0}, election epoch 312,
> > quorum 0,1 h1,s1
> >      mdsmap e401: 1/1/1 up {0=s3=up:active}, 1 up:standby
> >      osdmap e5577: 19 osds: 19 up, 19 in
> >       pgmap v11191838: 384 pgs, 3 pools, 774 GB data, 455 kobjects
> >             1636 GB used, 9713 GB / 11358 GB avail
> >                  384 active+clean
> >   client io 12240 kB/s rd, 1524 B/s wr, 24 op/s
> > # ceph osd tree
> > # id  weight   type name    up/down  reweight
> > -1    11.13    root default
> > -2     8.14        host h1
> >  1     0.9             osd.1    up    1
> >  3     0.9             osd.3    up    1
> >  4     0.9             osd.4    up    1
> >  5     0.68            osd.5    up    1
> >  6     0.68            osd.6    up    1
> >  7     0.68            osd.7    up    1
> >  8     0.68            osd.8    up    1
> >  9     0.68            osd.9    up    1
> > 10     0.68            osd.10   up    1
> > 11     0.68            osd.11   up    1
> > 12     0.68            osd.12   up    1
> > -3     0.45        host s3
> >  2     0.45            osd.2    up    1
> > -4     0.9         host s2
> > 13     0.9             osd.13   up    1
> > -5     1.64        host s1
> > 14     0.29            osd.14   up    1
> >  0     0.27            osd.0    up    1
> > 15     0.27            osd.15   up    1
> > 16     0.27            osd.16   up    1
> > 17     0.27            osd.17   up    1
> > 18     0.27            osd.18   up    1
> >
> > When I "umount -l /mnt/cephfs" and then "mount -a" after that, the
> > ceph volume is loaded again.  I can restart the VMs and all seems well.
> >
> > I can't find errors pertaining to cephfs in the other logs either.
> >
> > System information:
> >
> > Linux s1 2.6.32-34-pve #1 SMP Fri Dec 19 07:42:04 CET 2014 x86_64
> > GNU/Linux
>
> I'm not sure what version of Linux this really is (I assume it's a
> vendor kernel of some kind!), but it's definitely an old one! CephFS
> sees pretty continuous stability improvements, so you could be hitting
> any number of bugs that have since been resolved.
>

This is the stock-standard installation of Proxmox with CephFS.



> If you can't upgrade the kernel, you might try out the ceph-fuse
> client instead, as you can run a much newer version of it even on
> the old kernel.


I'm under the impression that CephFS is the filesystem implemented by
ceph-fuse. Is it not?
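
For reference, the two clients mount the same filesystem in different
ways; a minimal sketch of both, using this cluster's monitor addresses
from the ceph -s output above (the name= and secretfile= values are
assumptions, not taken from this setup):

    # In-kernel client -- what an /etc/fstab entry and "mount -a" use:
    mount -t ceph 192.168.121.30:6789,192.168.121.33:6789:/ /mnt/cephfs \
          -o name=admin,secretfile=/etc/ceph/admin.secret

    # Userspace client -- ceph-fuse runs outside the kernel, so it can
    # be upgraded without touching the old 2.6.32 kernel:
    ceph-fuse -m 192.168.121.30:6789,192.168.121.33:6789 /mnt/cephfs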



> Other than that, can you include more
> information about exactly what you mean when saying CephFS unmounts
> itself?
>

Everything runs fine for weeks.  Then suddenly a user reports that a VM is
no longer functioning.  On investigation it transpires that CephFS is not
mounted anymore and the error I reported is logged.

I can't see anything else wrong at this stage.  Ceph is running and the
OSDs are all up.
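
A stop-gap until the cause is found could be to automate that
umount -l / mount -a workaround from cron; a rough sketch (untested):

    #!/bin/sh
    # cephfs-watchdog.sh -- run periodically from cron; if /mnt/cephfs
    # is no longer a live mountpoint, lazy-unmount the stale entry and
    # remount everything listed in fstab.
    MNT=/mnt/cephfs
    if ! mountpoint -q "$MNT"; then
        logger -t cephfs-watchdog "$MNT is not mounted; remounting"
        umount -l "$MNT" 2>/dev/null
        mount -a
    fi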

thanks again

Roland



> -Greg