Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-22 Thread Hayashida, Mami
Thanks, Ilya. I just tried modifying the osd cap for client.testuser by getting rid of the "tag cephfs data=cephfs_test" part and confirmed this key does work (i.e. it lets the CephFS client read/write). It now reads: [client.testuser] key = XXXZZZ caps mds = "allow rw" caps mon = "allow r" caps os
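For reference, a cap change of this kind is applied with `ceph auth caps`; the sketch below uses placeholder pool names that are not taken from the thread:
  ceph auth caps client.testuser mds 'allow rw' mon 'allow r' \
      osd 'allow rw pool=cephfs_data, allow rw pool=cephfs_cache'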

Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-21 Thread Ilya Dryomov
On Tue, Jan 21, 2020 at 7:51 PM Hayashida, Mami wrote: > > Ilya, > > Thank you for your suggestions! > > `dmesg` (on the client node) only had `libceph: mon0 10.33.70.222:6789 socket > error on write`. No further detail. But using the admin key (client.admin) > for mounting CephFS solved my pro

Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-21 Thread Hayashida, Mami
Ilya, Thank you for your suggestions! `dmesg` (on the client node) only had `libceph: mon0 10.33.70.222:6789 socket error on write`. No further detail. But using the admin key (client.admin) for mounting CephFS solved my problem. I was able to write successfully! :-) $ sudo mount -t ceph 10.33
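For reference, a kernel mount with an explicit client name and secret file generally looks like the sketch below; the mount point and secret-file path are placeholders, not values from the thread:
  $ sudo mount -t ceph 10.33.70.222:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret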

Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-21 Thread Ilya Dryomov
On Tue, Jan 21, 2020 at 6:02 PM Hayashida, Mami wrote: > > I am trying to set up a CephFS with a Cache Tier (for data) on a mini test > cluster, but a kernel-mount CephFS client is unable to write. Cache tier > setup alone seems to be working fine (I tested it with `rados put` and `osd > map`

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-31 Thread ste...@bit.nl
Quoting renjianxinlover (renjianxinlo...@163.com): > hi, Stefan > could you please provide further guidance? https://docs.ceph.com/docs/master/cephfs/troubleshooting/#slow-requests-mds Do a "dump ops in flight" to see what's going on on the MDS. https://docs.ceph.com/docs/master/cephfs/trou
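For reference, the "dump ops in flight" check mentioned above is run against the MDS admin socket; a sketch, with the MDS name as a placeholder:
  # ceph daemon mds.<name> dump_ops_in_flight
  # ceph daemon mds.<name> dump_historic_ops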

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-29 Thread renjianxinlover
hi, Stefan, could you please provide further guidance? Brs | | renjianxinlover | | renjianxinlo...@163.com | Signature customized by NetEase Mail Master On 12/28/2019 21:44, renjianxinlover wrote: Sorry, what I said was fuzzy before. Currently, my mds is running alongside certain osds on the same node, where an SSD drive serves as ca

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-28 Thread renjianxinlover
Sorry, what I said was fuzzy before. Currently, my mds is running alongside certain osds on the same node, where an SSD drive serves as the cache device. | | renjianxinlover | | renjianxinlo...@163.com | Signature customized by NetEase Mail Master On 12/28/2019 15:49, Stefan Kooman wrote: Quoting renjianxinlover (renjianxinlo...@163.com): Hi

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-27 Thread Stefan Kooman
Quoting renjianxinlover (renjianxinlo...@163.com): > Hi, Nathan, thanks for your quick reply! > The command 'ceph status' outputs a warning including about ten clients failing to > respond to cache pressure; > in addition, on the mds node, 'iostat -x 1' shows the drive io usage of the mds within > five seconds as f

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-27 Thread renjianxinlover
Hi, Nathan, thanks for your quick reply! The command 'ceph status' outputs a warning including about ten clients failing to respond to cache pressure; in addition, on the mds node, 'iostat -x 1' shows the drive io usage of the mds within five seconds as follows: Device: rrqm/s wrqm/s r/s w/s rkB

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-26 Thread Nathan Fish
I would start by viewing "ceph status", drive IO with: "iostat -x 1 /dev/sd{a..z}" and the CPU/RAM usage of the active MDS. If "ceph status" warns that the MDS cache is oversized, that may be an easy fix. On Thu, Dec 26, 2019 at 7:33 AM renjianxinlover wrote: > hello, >recently, after de
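If the warning does turn out to be an oversized MDS cache, raising the cache limit is the usual remedy; a sketch, with the 8 GiB value purely illustrative:
  # ceph config set mds mds_cache_memory_limit 8589934592   # 8 GiB, example value only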

Re: [ceph-users] CephFS "denied reconnect attempt" after updating Ceph

2019-12-11 Thread William Edwards
It seems like this has been fixed with client kernel version 4.19.0-0.bpo.5-amd64. -- Regards, William Edwards Tuxis Internet Engineering - Original message - From: William Edwards (wedwa...@tuxis.nl) Date: 08/13/19 16:46 To: ceph-users@lists.ceph.com Subject: [ceph-users] CephFS

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-11-06 Thread Simon Oosthoek
I finally took the time to report the bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1851470 On 29/10/2019 10:44, Simon Oosthoek wrote: > On 24/10/2019 16:23, Christopher Wieringa wrote: >> Hello all, >> >>   >> >> I’ve been using the Ceph kernel modules in Ubuntu to load a CephFS >> f

Re: [ceph-users] cephfs 1 large omap objects

2019-10-30 Thread Patrick Donnelly
On Wed, Oct 30, 2019 at 9:28 AM Jake Grimmett wrote: > > Hi Zheng, > > Many thanks for your helpful post, I've done the following: > > 1) set the threshold to 1024 * 1024: > > # ceph config set osd \ > osd_deep_scrub_large_omap_object_key_threshold 1048576 > > 2) deep scrubbed all of the pgs on th

Re: [ceph-users] cephfs 1 large omap objects

2019-10-30 Thread Jake Grimmett
Hi Zheng, Many thanks for your helpful post, I've done the following: 1) set the threshold to 1024 * 1024: # ceph config set osd \ osd_deep_scrub_large_omap_object_key_threshold 1048576 2) deep scrubbed all of the pgs on the two OSD that reported "Large omap object found." - these were all in p
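For reference, deep scrubs of the kind described can be triggered per OSD or per placement group; a sketch with placeholder IDs:
  # ceph osd deep-scrub <osd-id>
  # ceph pg deep-scrub <pg-id>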

Re: [ceph-users] CephFS client hanging and cache issues

2019-10-30 Thread Bob Farrell
Thanks a lot and sorry for the spam, I should have checked! We are on 18.04; the kernel is currently upgrading, so if you don't hear back from me then it is fixed. Thanks for the amazing support! On Wed, 30 Oct 2019, 09:54 Lars Täuber wrote: > Hi. > > Sounds like you use kernel clients with kerne

Re: [ceph-users] CephFS client hanging and cache issues

2019-10-30 Thread Lars Täuber
Hi. Sounds like you use kernel clients with kernels from Canonical/Ubuntu. Two kernels have a bug: 4.15.0-66 and 5.0.0-32. Updated kernels are said to have fixes. Older kernels also work: 4.15.0-65 and 5.0.0-31. Lars Wed, 30 Oct 2019 09:42:16 + Bob Farrell ==> ceph-users : > Hi. We are ex

Re: [ceph-users] CephFS client hanging and cache issues

2019-10-30 Thread Paul Emmerich
Kernel bug due to a bad backport, see recent posts here. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Wed, Oct 30, 2019 at 10:42 AM Bob Farrell wrote: > > Hi. W

Re: [ceph-users] cephfs 1 large omap objects

2019-10-29 Thread Yan, Zheng
see https://tracker.ceph.com/issues/42515. just ignore the warning for now On Mon, Oct 7, 2019 at 7:50 AM Nigel Williams wrote: > > Out of the blue this popped up (on an otherwise healthy cluster): > > HEALTH_WARN 1 large omap objects > LARGE_OMAP_OBJECTS 1 large omap objects > 1 large objec

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-29 Thread Simon Oosthoek
On 24/10/2019 16:23, Christopher Wieringa wrote: > Hello all, > >   > > I’ve been using the Ceph kernel modules in Ubuntu to load a CephFS > filesystem quite successfully for several months.  Yesterday, I went > through a round of updates on my Ubuntu 18.04 machines, which loaded > linux-image-5.

Re: [ceph-users] cephfs 1 large omap objects

2019-10-28 Thread Jake Grimmett
Hi Paul, Nigel, I'm also seeing "HEALTH_WARN 6 large omap objects" warnings with cephfs after upgrading to 14.2.4. The affected OSDs are used (only) by the metadata pool: POOL ID STORED OBJECTS USED %USED MAX AVAIL mds_ssd 1 64 GiB 1.74M 65 GiB 4.47 466 GiB See below for more log de

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-24 Thread Ilya Dryomov
On Thu, Oct 24, 2019 at 5:45 PM Paul Emmerich wrote: > > Could it be related to the broken backport as described in > https://tracker.ceph.com/issues/40102 ? > > (It did affect 4.19, not sure about 5.0) It does, I have just updated the linked ticket to reflect that. Thanks, Ilya

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-24 Thread Sasha Litvak
Also, search for this topic on the list. Ubuntu Disco with the most recent kernel, 5.0.0-32, seems to be unstable. On Thu, Oct 24, 2019 at 10:45 AM Paul Emmerich wrote: > Could it be related to the broken backport as described in > https://tracker.ceph.com/issues/40102 ? > > (It did affect 4.19, not

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-24 Thread Paul Emmerich
Could it be related to the broken backport as described in https://tracker.ceph.com/issues/40102 ? (It did affect 4.19, not sure about 5.0) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel

Re: [ceph-users] cephfs 1 large omap objects

2019-10-08 Thread Paul Emmerich
Hi, the default for this warning changed recently (see other similar threads on the mailing list), it was 2 million before 14.2.3. I don't think the new default of 200k is a good choice, so increasing it is a reasonable work-around. Paul -- Paul Emmerich Looking for help with your Ceph cluste

Re: [ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
I've adjusted the threshold: ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 35 A colleague suggested that this will take effect on the next deep-scrub. Is the default of 200,000 too small? Will this be adjusted in future releases, or is it meant to be adjusted in some use-ca

Re: [ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
I followed some other suggested steps, and have this: root@cnx-17:/var/log/ceph# zcat ceph-osd.178.log.?.gz|fgrep Large 2019-10-02 13:28:39.412 7f482ab1c700 0 log_channel(cluster) log [WRN] : Large omap object found. Object: 2:654134d2:::mds0_openfiles.0:head Key count: 306331 Size (bytes): 13993

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-24 Thread Robert LeBlanc
On Tue, Sep 24, 2019 at 4:33 AM Thomas <74cmo...@gmail.com> wrote: > > Hi, > > I'm experiencing the same issue with this setting in ceph.conf: > osd op queue = wpq > osd op queue cut off = high > > Furthermore I cannot read any old data in the relevant pool that is > serving CephFS.

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-24 Thread Thomas
Hi, I'm experiencing the same issue with these settings in ceph.conf:     osd op queue = wpq     osd op queue cut off = high Furthermore, I cannot read any old data in the relevant pool that is serving CephFS. However, I can write new data and read that new data. Regards Thomas Am 24.09.20
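For reference, the quoted settings live in the [osd] section of ceph.conf (on recent releases they can also be applied centrally with ceph config set); a minimal sketch:
  [osd]
  osd op queue = wpq
  osd op queue cut off = high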

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-24 Thread Yoann Moulin
Hello, >> I have a Ceph Nautilus Cluster 14.2.1 for cephfs only on 40x 1.8T SAS disk >> (no SSD) in 20 servers. >> >> I often get "MDSs report slow requests" and plenty of "[WRN] 3 slow >> requests, 0 included below; oldest blocked for > 60281.199503 secs" >> >> After a few investigations, I saw

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-23 Thread Robert LeBlanc
On Thu, Sep 19, 2019 at 2:36 AM Yoann Moulin wrote: > > Hello, > > I have a Ceph Nautilus Cluster 14.2.1 for cephfs only on 40x 1.8T SAS disk > (no SSD) in 20 servers. > > > cluster: > > id: 778234df-5784-4021-b983-0ee1814891be > > health: HEALTH_WARN > > 2 MDSs report s

Re: [ceph-users] CephFS deletion performance

2019-09-18 Thread Hector Martin
On 17/09/2019 17.46, Yan, Zheng wrote: > when a snapshotted directory is deleted, the mds moves the directory into > the stray directory. You have 57k strays; each time the mds has a cache > miss for a stray, the mds needs to load a stray dirfrag. This is very > inefficient because a stray dirfrag contains lots

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-17 Thread Gregory Farnum
On Tue, Sep 17, 2019 at 8:12 AM Sander Smeenk wrote: > > Quoting Paul Emmerich (paul.emmer...@croit.io): > > > Yeah, CephFS is much closer to POSIX semantics for a filesystem than > > NFS. There's an experimental relaxed mode called LazyIO but I'm not > > sure if it's applicable here. > > Out of c

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-17 Thread Sander Smeenk
Quoting Paul Emmerich (paul.emmer...@croit.io): > Yeah, CephFS is much closer to POSIX semantics for a filesystem than > NFS. There's an experimental relaxed mode called LazyIO but I'm not > sure if it's applicable here. Out of curiosity, how would CephFS being more POSIX compliant cause this muc

Re: [ceph-users] CephFS deletion performance

2019-09-17 Thread Yan, Zheng
On Sat, Sep 14, 2019 at 8:57 PM Hector Martin wrote: > > On 13/09/2019 16.25, Hector Martin wrote: > > Is this expected for CephFS? I know data deletions are asynchronous, but > > not being able to delete metadata/directories without an undue impact on > > the whole filesystem performance is somew

Re: [ceph-users] CephFS deletion performance

2019-09-14 Thread Hector Martin
On 13/09/2019 16.25, Hector Martin wrote: > Is this expected for CephFS? I know data deletions are asynchronous, but > not being able to delete metadata/directories without an undue impact on > the whole filesystem performance is somewhat problematic. I think I'm getting a feeling for who the cu

Re: [ceph-users] CephFS client-side load issues for write-/delete-heavy workloads

2019-09-13 Thread Janek Bevendorff
Here's some more information on this issue. I found the MDS host not to have any load issues, but other clients who have the FS mounted cannot execute statfs/fstatfs on the mount, since the call never returns while my rsync job is running. Other syscalls like fstat work without problems. Thus,

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-12 Thread jesper
Thursday, 12 September 2019, 17.16 +0200 from Paul Emmerich : >Yeah, CephFS is much closer to POSIX semantics for a filesystem than >NFS. There's an experimental relaxed mode called LazyIO but I'm not >sure if it's applicable here. > >You can debug this by dumping slow requests from the MDS se

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-12 Thread Paul Emmerich
Yeah, CephFS is much closer to POSIX semantics for a filesystem than NFS. There's an experimental relaxed mode called LazyIO but I'm not sure if it's applicable here. You can debug this by dumping slow requests from the MDS servers via the admin socket Paul -- Paul Emmerich Looking for help w

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-27 Thread thoralf schulze
hi Zheng, On 8/26/19 3:31 PM, Yan, Zheng wrote: […] > change code to : […] we can happily confirm that this resolves the issue. thank you _very_ much & with kind regards, t. signature.asc Description: OpenPGP digital signature ___ ceph-users mailin

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Mark Nelson
On 8/26/19 7:39 AM, Wido den Hollander wrote: On 8/26/19 1:35 PM, Simon Oosthoek wrote: On 26-08-19 13:25, Simon Oosthoek wrote: On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allo

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread Yan, Zheng
On Mon, Aug 26, 2019 at 9:25 PM thoralf schulze wrote: > > hi Zheng - > > On 8/26/19 2:55 PM, Yan, Zheng wrote: > > I tracked down the bug > > https://tracker.ceph.com/issues/41434 > > wow, that was quick - thank you for investigating. we are looking > forward for the fix :-) > > in the meantime,

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng - On 8/26/19 2:55 PM, Yan, Zheng wrote: > I tracked down the bug > https://tracker.ceph.com/issues/41434 wow, that was quick - thank you for investigating. we are looking forward for the fix :-) in the meantime, is there anything we can do to prevent q == p->second.end() from happening?

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread Yan, Zheng
On Mon, Aug 26, 2019 at 6:57 PM thoralf schulze wrote: > > hi Zheng, > > On 8/21/19 4:32 AM, Yan, Zheng wrote: > > Please enable debug mds (debug_mds=10), and try reproducing it again. > > please find the logs at > https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz . > > we managed to

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Wido den Hollander
On 8/26/19 1:35 PM, Simon Oosthoek wrote: > On 26-08-19 13:25, Simon Oosthoek wrote: >> On 26-08-19 13:11, Wido den Hollander wrote: >> >>> >>> The reweight might actually cause even more confusion for the balancer. >>> The balancer uses upmap mode and that re-allocates PGs to different OSDs >>

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 13:25, Simon Oosthoek wrote: On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allocates PGs to different OSDs if needed. Looking at the output sent earlier I have some repl

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allocates PGs to different OSDs if needed. Looking at the output sent earlier I have some replies. See below. Looking at this outpu

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Wido den Hollander
t; /Simon > >> >> Regards >> >> >> -Original Message- >> From: ceph-users On behalf of Simon >> Oosthoek >> Sent: Monday, 26 August 2019 11:52 >> To: Dan van der Ster >> CC: ceph-users >> Subject: Re: [ceph

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng, On 8/21/19 4:32 AM, Yan, Zheng wrote: > Please enable debug mds (debug_mds=10), and try reproducing it again. please find the logs at https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz . we managed to reproduce the issue as a worst case scenario: before snapshotting, juju-

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Paul Emmerich
; > > > > -Original Message- > > From: ceph-users On behalf of Simon > > Oosthoek > > Sent: Monday, 26 August 2019 11:52 > > To: Dan van der Ster > > CC: ceph-users > > Subject: Re: [ceph-users] cephfs full, 2/3 Raw capacity used &g

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
imon Regards -Original Message- From: ceph-users On behalf of Simon Oosthoek Sent: Monday, 26 August 2019 11:52 To: Dan van der Ster CC: ceph-users Subject: Re: [ceph-users] cephfs full, 2/3 Raw capacity used On 26-08-19 11:37, Dan van der Ster wrote: Thanks. The version and balancer c

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread EDH - Manuel Rios Fernandez
der Ster CC: ceph-users Subject: Re: [ceph-users] cephfs full, 2/3 Raw capacity used On 26-08-19 11:37, Dan van der Ster wrote: > Thanks. The version and balancer config look good. > > So you can try `ceph osd reweight osd.10 0.8` to see if it helps to > get you out of this. I'

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 11:37, Dan van der Ster wrote: Thanks. The version and balancer config look good. So you can try `ceph osd reweight osd.10 0.8` to see if it helps to get you out of this. I've done this and the next fullest 3 osds. This will take some time to recover, I'll let you know when it's d

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Dan van der Ster
Thanks. The version and balancer config look good. So you can try `ceph osd reweight osd.10 0.8` to see if it helps to get you out of this. -- dan On Mon, Aug 26, 2019 at 11:35 AM Simon Oosthoek wrote: > > On 26-08-19 11:16, Dan van der Ster wrote: > > Hi, > > > > Which version of ceph are you

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 11:16, Dan van der Ster wrote: Hi, Which version of ceph are you using? Which balancer mode? Nautilus (14.2.2), balancer is in upmap mode. The balancer score isn't a percent-error or anything humanly usable. `ceph osd df tree` can better show you exactly which osds are over/under

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Dan van der Ster
Hi, Which version of ceph are you using? Which balancer mode? The balancer score isn't a percent-error or anything humanly usable. `ceph osd df tree` can better show you exactly which osds are over/under utilized and by how much. You might be able to manually fix things by using `ceph osd reweigh
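A sketch of the checks described, with osd.10 and the 0.8 weight repeating the suggestion made later in this thread:
  # ceph osd df tree                 # shows per-OSD utilization and variance
  # ceph osd reweight osd.10 0.8     # lower the weight of an over-full OSD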

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-21 Thread thoralf schulze
hi zheng, On 8/21/19 4:32 AM, Yan, Zheng wrote: > Please enable debug mds (debug_mds=10), and try reproducing it again. we will get back with the logs on monday. thank you & with kind regards, t. signature.asc Description: OpenPGP digital signature

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-20 Thread Yan, Zheng
On Tue, Aug 20, 2019 at 9:43 PM thoralf schulze wrote: > > hi there, > > we are struggling with the creation of cephfs-snapshots: doing so > reproducibly causes a failover of our metadata servers. afterwards, the > demoted mds servers won't be available as standby servers and the mds > daemons on

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Jeff Layton
On Thu, 2019-08-15 at 16:45 +0900, Hector Martin wrote: > On 15/08/2019 03.40, Jeff Layton wrote: > > On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote: > > > Jeff, the oops seems to be a NULL dereference in ceph_lock_message(). > > > Please take a look. > > > > > > > (sorry for duplicate mai

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Hector Martin
On 15/08/2019 03.40, Jeff Layton wrote: On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote: Jeff, the oops seems to be a NULL dereference in ceph_lock_message(). Please take a look. (sorry for duplicate mail -- the other one ended up in moderation) Thanks Ilya, That function is pretty st

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-14 Thread Jeff Layton
On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote: > On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote: > > I just had a minor CephFS meltdown caused by underprovisioned RAM on the > > MDS servers. This is a CephFS with two ranks; I manually failed over the > > first rank and the new MDS ser

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-14 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote: > > I just had a minor CephFS meltdown caused by underprovisioned RAM on the > MDS servers. This is a CephFS with two ranks; I manually failed over the > first rank and the new MDS server ran out of RAM in the rejoin phase > (ceph-mds didn't get

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-14 Thread Serkan Çoban
Hi, I just double-checked the stack trace and can confirm it is the same as in the tracker. Compaction also worked for me; I can now mount cephfs without problems. Thanks for the help, Serkan On Tue, Aug 13, 2019 at 6:44 PM Ilya Dryomov wrote: > > On Tue, Aug 13, 2019 at 4:30 PM Serkan Çoban wrote: > > > >

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 4:30 PM Serkan Çoban wrote: > > I am out of office right now, but I am pretty sure it was the same > stack trace as in tracker. > I will confirm tomorrow. > Any workarounds? Compaction # echo 1 >/proc/sys/vm/compact_memory might help if the memory in question is moveable

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Serkan Çoban
I am out of office right now, but I am pretty sure it was the same stack trace as in tracker. I will confirm tomorrow. Any workarounds? On Tue, Aug 13, 2019 at 5:16 PM Ilya Dryomov wrote: > > On Tue, Aug 13, 2019 at 3:57 PM Serkan Çoban wrote: > > > > I checked /var/log/messages and see there ar

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 3:57 PM Serkan Çoban wrote: > > I checked /var/log/messages and see there are page allocation > failures. But I don't understand why? > The client has 768GB memory and most of it is not used, cluster has > 1500 OSDs. Do I need to increase vm.min_free_kbytes? It is set to 1GB

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Serkan Çoban
I checked /var/log/messages and see there are page allocation failures, but I don't understand why. The client has 768GB memory and most of it is not used; the cluster has 1500 OSDs. Do I need to increase vm.min_free_kbytes? It is set to 1GB now. Also, huge pages are disabled on the clients. Thanks, Serkan On
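For reference, the sysctl mentioned can be inspected and, if one did want to raise it, changed at runtime; a sketch, with the 2 GiB value purely illustrative:
  # sysctl vm.min_free_kbytes
  # sysctl -w vm.min_free_kbytes=2097152   # 2 GiB, example value only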

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 12:36 PM Serkan Çoban wrote: > > Hi, > > Just installed nautilus 14.2.2 and setup cephfs on it. OS is all centos 7.6. > From a client I can mount the cephfs with ceph-fuse, but I cannot > mount with ceph kernel client. > It gives "mount error 110 connection timeout" and I c

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-08 Thread Alexandre DERUMIER
ce between two snapshots? I think it's on the roadmap for the next ceph version. - Original Mail - From: "Eitan Mosenkis" To: "Vitaliy Filippov" Cc: "ceph-users" Sent: Monday, 5 August 2019 18:43:00 Subject: Re: [ceph-users] CephFS snapshot for backup &

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-05 Thread Eitan Mosenkis
I'm using it for a NAS to make backups from the other machines on my home network. Since everything is in one location, I want to keep a copy offsite for disaster recovery. Running Ceph across the internet is not recommended and is also very expensive compared to just storing snapshots. On Sun, Au

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-05 Thread Lars Marowsky-Bree
On 2019-08-04T13:27:00, Eitan Mosenkis wrote: > I'm running a single-host Ceph cluster for CephFS and I'd like to keep > backups in Amazon S3 for disaster recovery. Is there a simple way to > extract a CephFS snapshot as a single file and/or to create a file that > represents the incremental diff

Re: [ceph-users] CephFS Recovery/Internals Questions

2019-08-04 Thread Gregory Farnum
On Fri, Aug 2, 2019 at 12:13 AM Pierre Dittes wrote: > > Hi, > we had some major up with our CephFS. Long story short: no journal backup > and the journal was truncated. > Now I still see a metadata pool with all objects and the data pool is fine, from > what I know neither was corrupted. Last mount

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-04 Thread Виталий Филиппов
Afaik no. What's the idea of running a single-host cephfs cluster? 4 августа 2019 г. 13:27:00 GMT+03:00, Eitan Mosenkis пишет: >I'm running a single-host Ceph cluster for CephFS and I'd like to keep >backups in Amazon S3 for disaster recovery. Is there a simple way to >extract a CephFS snapshot a

Re: [ceph-users] cephfs quota setfattr permission denied

2019-07-31 Thread Mattia Belluco
Hi Nathan, Indeed that was the reason. With your hint I was able to find the relevant documentation: https://docs.ceph.com/docs/master/cephfs/client-auth/ that is completely absent from: https://docs.ceph.com/docs/master/cephfs/quota/#configuration I will send a pull request to include the lin

Re: [ceph-users] cephfs quota setfattr permission denied

2019-07-31 Thread Nathan Fish
The client key which is used to mount the FS needs the 'p' permission to set xattrs. eg: ceph fs authorize cephfs client.foo / rwsp That might be your problem. On Wed, Jul 31, 2019 at 5:43 AM Mattia Belluco wrote: > > Dear ceph users, > > We have been recently trying to use the two quota attrib
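Once the client key carries the 'p' capability, quotas are set via extended attributes; a sketch, with the path and limit values as placeholders:
  $ setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/somedir   # 100 GiB
  $ setfattr -n ceph.quota.max_files -v 100000 /mnt/cephfs/somedir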

Re: [ceph-users] cephfs snapshot scripting questions

2019-07-19 Thread Frank Schilder
This is a question I'm interested in as well. Right now, I'm using cephfs-snap from the storage tools project and am quite happy with it. I made a small modification, but will probably not change it. It's a simple and robust tool. As for where to take snapshots: there seems to be a bug in cephfs that

Re: [ceph-users] cephfs snapshot scripting questions

2019-07-17 Thread Marc Roos
H, ok ok, test it first, I can't remember if it is finished. It also checks whether it is useful to create a snapshot, by checking the size of the directory. [@ cron.daily]# cat backup-archive-mail.sh #!/bin/bash cd /home/ for account in `ls -c1 /home/mail-archive/ | sort` do /usr/local/sbin/ba

Re: [ceph-users] CephFS : Kernel/Fuse technical differences

2019-06-25 Thread Robert LeBlanc
There may also be more memory copying involved instead of just passing pointers around, but I'm not 100% sure. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jun 24, 2019 at 10:28 AM Jeff Layton wrote: > On Mon, 2019-06-24 at 15

Re: [ceph-users] CephFS : Kernel/Fuse technical differences

2019-06-24 Thread Jeff Layton
On Mon, 2019-06-24 at 15:51 +0200, Hervé Ballans wrote: > Hi everyone, > > We successfully use Ceph here for several years now, and since recently, > CephFS. > > From the same CephFS server, I notice a big difference between a fuse > mount and a kernel mount (10 times faster for kernel mount).

Re: [ceph-users] CephFS damaged and cannot recover

2019-06-19 Thread Patrick Donnelly
On Wed, Jun 19, 2019 at 9:19 AM Wei Jin wrote: > > There are plenty of data in this cluster (2PB), please help us, thx. > Before doing this dangerous > operations(http://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#disaster-recovery-experts) > , any suggestions? > > Ceph version: 12

Re: [ceph-users] Cephfs free space vs ceph df free space disparity

2019-05-28 Thread Robert Ruge
To: Robert Ruge Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Cephfs free space vs ceph df free space disparity On 27.05.19 09:08, Stefan Kooman wrote: > Quoting Robert Ruge (robert.r...@deakin.edu.au): >> Ceph newbie question. >> >> I have a disparity between the f

Re: [ceph-users] Cephfs free space vs ceph df free space disparity

2019-05-28 Thread Peter Wienemann
On 27.05.19 09:08, Stefan Kooman wrote: > Quoting Robert Ruge (robert.r...@deakin.edu.au): >> Ceph newbie question. >> >> I have a disparity between the free space that my cephfs file system >> is showing and what ceph df is showing. As you can see below my >> cephfs file system says there is 9.5T

Re: [ceph-users] Cephfs free space vs ceph df free space disparity

2019-05-27 Thread Stefan Kooman
Quoting Robert Ruge (robert.r...@deakin.edu.au): > Ceph newbie question. > > I have a disparity between the free space that my cephfs file system > is showing and what ceph df is showing. As you can see below my > cephfs file system says there is 9.5TB free however ceph df says there > is 186TB w

Re: [ceph-users] CephFS object mapping.

2019-05-24 Thread Robert LeBlanc
On Fri, May 24, 2019 at 2:14 AM Burkhard Linke < burkhard.li...@computational.bio.uni-giessen.de> wrote: > Hi, > On 5/22/19 5:53 PM, Robert LeBlanc wrote: > > When you say 'some' is it a fixed offset that the file data starts? Is the > first stripe just metadata? > > No, the first stripe contains

Re: [ceph-users] CephFS object mapping.

2019-05-24 Thread Burkhard Linke
Hi, On 5/22/19 5:53 PM, Robert LeBlanc wrote: On Wed, May 22, 2019 at 12:22 AM Burkhard Linke > wrote: Hi, On 5/21/19 9:46 PM, Robert LeBlanc wrote: > I'm at a new job working with Ceph again and am excited to back in the

Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-23 Thread Frank Schilder
Hi Marc, if you can exclude network problems, you can ignore this message. The only time we observed something that might be similar to your problem was, when a network connection was overloaded. Potential causes include - broadcast storm - the "too much cache memory" issues https://www.suse.c

Re: [ceph-users] CephFS object mapping.

2019-05-22 Thread Robert LeBlanc
On Wed, May 22, 2019 at 12:22 AM Burkhard Linke < burkhard.li...@computational.bio.uni-giessen.de> wrote: > Hi, > > On 5/21/19 9:46 PM, Robert LeBlanc wrote: > > I'm at a new job working with Ceph again and am excited to back in the > > community! > > > > I can't find any documentation to support

Re: [ceph-users] CephFS msg length greater than osd_max_write_size

2019-05-22 Thread Ryan Leimenstoll
Thanks for the reply! We will be more proactive about evicting clients in the future rather than waiting. One followup however, it seems that the filesystem going read only was only a WARNING state, which didn’t immediately catch our eye due to some other rebalancing operations. Is there a rea

Re: [ceph-users] CephFS msg length greater than osd_max_write_size

2019-05-22 Thread Yan, Zheng
On Tue, May 21, 2019 at 6:10 AM Ryan Leimenstoll wrote: > > Hi all, > > We recently encountered an issue where our CephFS filesystem unexpectedly was > set to read-only. When we look at some of the logs from the daemons I can see > the following: > > On the MDS: > ... > 2019-05-18 16:34:24.341 7

Re: [ceph-users] Cephfs client evicted, how to unmount the filesystem on the client?

2019-05-22 Thread Yan, Zheng
try 'umount -f' On Tue, May 21, 2019 at 4:41 PM Marc Roos wrote: > > > > > > [@ceph]# ps -aux | grep D > USER PID %CPU %MEMVSZ RSS TTY STAT START TIME COMMAND > root 12527 0.0 0.0 123520 932 pts/1D+ 09:26 0:00 umount > /home/mail-archive > root 14549 0.2 0

Re: [ceph-users] CephFS object mapping.

2019-05-22 Thread Burkhard Linke
Hi, On 5/21/19 9:46 PM, Robert LeBlanc wrote: I'm at a new job working with Ceph again and am excited to back in the community! I can't find any documentation to support this, so please help me understand if I got this right. I've got a Jewel cluster with CephFS and we have an inconsistent

Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread Marc Roos
No, but even if there were, I never had any issues when running multiple scrubs. -Original Message- From: EDH - Manuel Rios Fernandez [mailto:mrios...@easydatahost.com] Sent: Tuesday, 21 May 2019 10:03 To: Marc Roos; 'ceph-users' Subject: RE: [ceph-users] cephfs causing high load on

Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread EDH - Manuel Rios Fernandez
Hi Marc, Is there any scrub / deep-scrub running on the affected OSDs? Best Regards, Manuel -Original Message- From: ceph-users On behalf of Marc Roos Sent: Tuesday, 21 May 2019 10:01 To: ceph-users ; Marc Roos Subject: Re: [ceph-users] cephfs causing high load on vm, taking

Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread Marc Roos
users@lists.ceph.com; Marc Roos Subject: RE: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm I have got this again today. I cannot unmount the filesystem, and it looks like some osds are at 100% cpu utilization. -Original Message- From: Marc

Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread Marc Roos
I have got this again today. I cannot unmount the filesystem, and it looks like some osds are at 100% cpu utilization. -Original Message- From: Marc Roos Sent: Monday, 20 May 2019 12:42 To: ceph-users Subject: [ceph-users] cephfs causing high load on vm, taking down 15 min later

Re: [ceph-users] cephfs deleting files No space left on device

2019-05-10 Thread Rafael Lopez
Hey Kenneth, We encountered this when the number of strays (unlinked files yet to be purged) reached 1 million, which is a result of many many file removals happening on the fs repeatedly. It can also happen when there are more than 100k files in a dir with default settings. You can tune it via '

Re: [ceph-users] cephfs deleting files No space left on device

2019-05-10 Thread Paul Emmerich
About how many files are we talking here? Implementation detail on file deletion to understand why this might happen: deletion is async, deleting a file inserts it into the purge queue and the actual data will be removed in the background. Paul -- Paul Emmerich Looking for help with your Ceph

Re: [ceph-users] Cephfs on an EC Pool - What determines object size

2019-05-02 Thread Daniel Williams
Thanks so much for your help! On Mon, Apr 29, 2019 at 6:49 PM Gregory Farnum wrote: > Yes, check out the file layout options: > http://docs.ceph.com/docs/master/cephfs/file-layouts/ > > On Mon, Apr 29, 2019 at 3:32 PM Daniel Williams > wrote: > > > > Is the 4MB configurable? > > > > On Mon, Apr

Re: [ceph-users] Cephfs on an EC Pool - What determines object size

2019-04-29 Thread Gregory Farnum
Yes, check out the file layout options: http://docs.ceph.com/docs/master/cephfs/file-layouts/ On Mon, Apr 29, 2019 at 3:32 PM Daniel Williams wrote: > > Is the 4MB configurable? > > On Mon, Apr 29, 2019 at 4:36 PM Gregory Farnum wrote: >> >> CephFS automatically chunks objects into 4MB objects b
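For reference, the object size is one of the layout fields described in that document and can be changed via extended attributes, on a directory (affecting newly created files) or on an empty file; a sketch with placeholder paths and an illustrative 8 MiB value:
  $ setfattr -n ceph.dir.layout.object_size -v 8388608 /mnt/cephfs/mydir      # applies to new files in mydir
  $ setfattr -n ceph.file.layout.object_size -v 8388608 /mnt/cephfs/newfile   # file must still be empty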

Re: [ceph-users] Cephfs on an EC Pool - What determines object size

2019-04-29 Thread Daniel Williams
Is the 4MB configurable? On Mon, Apr 29, 2019 at 4:36 PM Gregory Farnum wrote: > CephFS automatically chunks objects into 4MB objects by default. For > an EC pool, RADOS internally will further subdivide them based on the > erasure code and striping strategy, with a layout that can vary. But > b

Re: [ceph-users] Cephfs on an EC Pool - What determines object size

2019-04-29 Thread Gregory Farnum
CephFS automatically chunks objects into 4MB objects by default. For an EC pool, RADOS internally will further subdivide them based on the erasure code and striping strategy, with a layout that can vary. But by default if you have eg an 8+3 EC code, you'll end up with a bunch of (4MB/8=)512KB objec
