23 PM Patrick Donnelly wrote:
> Looks like this bug: https://tracker.ceph.com/issues/41148
>
> On Wed, Oct 9, 2019 at 1:15 PM David C wrote:
> >
> > Hi Daniel
> >
> > Thanks for looking into this. I hadn't installed ceph-debuginfo, here's
> the bt wit
's causing the crash. Can you get line numbers from your backtrace?
>
> Daniel
>
> On 10/7/19 9:59 AM, David C wrote:
> > Hi All
> >
> > Further to my previous messages, I upgraded
> > to libcephfs2-14.2.2-0.el7.x86_64 as suggested and things certainly seem
rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=removed,local_lock=none,addr=removed)
On Fri, Jul 19, 2019 at 5:47 PM David C wrote:
> Thanks, Jeff. I'll give 14.2.2 a go when it's released.
>
> On Wed, 17 J
se it'll show up in.
> >
> > Cheers,
> > Jeff
> >
> > On Wed, 2019-07-17 at 10:36 +0100, David C wrote:
> > > Thanks for taking a look at this, Daniel. Below is the only
> interesting bit from the Ceph MDS log at the time of the crash but I
> suspect t
beyond my expertise, at this point. Maybe some ceph
> logs would help?
>
> Daniel
>
> On 7/15/19 10:54 AM, David C wrote:
> >
> >
> > Hi All
> >
>
7:07 PM Jeff Layton wrote:
> On Tue, Apr 16, 2019 at 10:36 AM David C wrote:
> >
> > Hi All
> >
> > I have a single export of my cephfs using the ceph_fsal [1]. A CentOS 7
> machine mounts a sub-directory of the export [2] and is using it for the
> home directory o
On Sat, 27 Apr 2019, 18:50 Nikhil R, wrote:
> Guys,
> We now have a total of 105 osd’s on 5 baremetal nodes each hosting 21
> osd’s on HDD which are 7Tb with journals on HDD too. Each journal is about
> 5GB
>
This would imply you've got a separate hdd partition for journals, I don't
think there'
Hi All
I have a single export of my cephfs using the ceph_fsal [1]. A CentOS 7
machine mounts a sub-directory of the export [2] and is using it for the
home directory of a user (e.g everything under ~ is on the server).
This works fine until I start a long sequential write into the home
directory
Out of curiosity, are you guys re-exporting the fs to clients over
something like nfs or running applications directly on the OSD nodes?
On Tue, 12 Mar 2019, 18:28 Paul Emmerich, wrote:
> Mounting kernel CephFS on an OSD node works fine with recent kernels
> (4.14+) and enough RAM in the servers
The general advice has been to not use the kernel client on an osd node as
you may see a deadlock under certain conditions. Using the fuse client
should be fine or use the kernel client inside a VM.
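If it helps anyone reading the archive, the two alternatives look roughly like this (a sketch; monitor address, client name and mount point are placeholders):
# FUSE client, safe to run on an OSD node
ceph-fuse --id admin -m mon1.example.com:6789 /mnt/cephfs
# Kernel client, run inside a VM rather than directly on the OSD node
mount -t ceph mon1.example.com:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret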
On Wed, 6 Mar 2019, 03:07 Zhenshi Zhou, wrote:
> Hi,
>
> I'm gonna mount cephfs from my ceph serv
On Mon, Mar 4, 2019 at 5:53 PM Jeff Layton wrote:
>
> > On Mon, 2019-03-04 at 17:26 +0000, David C wrote:
> > Looks like you're right, Jeff. Just tried to write into the dir and am
> > now getting the quota warning. So I guess it was the libcephfs cache
> > as you say
= 1;
Cache_Size = 1;
}
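For context, that fragment is the tail of a cache-disabling block; a full version might look like the sketch below (parameter names can differ between Ganesha versions, so treat it as an example rather than a verified config):
CACHEINODE {
    # Keep Ganesha's own inode/dir cache tiny so that libcephfs sees
    # operations, and enforces quotas, directly
    Dir_Chunk = 0;
    NParts = 1;
    Cache_Size = 1;
}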
Thanks,
On Mon, Mar 4, 2019 at 2:50 PM Jeff Layton wrote:
> On Mon, 2019-03-04 at 09:11 -0500, Jeff Layton wrote:
> > On Fri, 2019-03-01 at 15:49 +000
Hi All
Exporting cephfs with the CEPH_FSAL
I set the following on a dir:
setfattr -n ceph.quota.max_bytes -v 1 /dir
setfattr -n ceph.quota.max_files -v 10 /dir
From an NFSv4 client, the quota.max_bytes appears to be completely ignored,
I can go GBs over the quota in the dir. The *quota
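A quick way to sanity-check that the attributes actually took on the Ceph side, before blaming the NFS path (paths here are placeholders):
# Read the quota attributes back on a native CephFS mount, not via NFS
getfattr -n ceph.quota.max_bytes /mnt/cephfs/dir
getfattr -n ceph.quota.max_files /mnt/cephfs/dir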
On Wed, Feb 27, 2019 at 11:35 AM Hector Martin
wrote:
> On 27/02/2019 19:22, David C wrote:
> > Hi All
> >
> > I'm seeing quite a few directories in my filesystem with rctime years in
> > the future. E.g
> >
> > ]# getfattr -d -m ceph.dir.* /path/to
Hi All
I'm seeing quite a few directories in my filesystem with rctime years in
the future. E.g
]# getfattr -d -m ceph.dir.* /path/to/dir
getfattr: Removing leading '/' from absolute path names
# file: path/to/dir
ceph.dir.entries="357"
ceph.dir.files="1"
ceph.dir.rbytes="35606883904011"
ceph.di
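A rough way to list directories whose rctime is in the future (a sketch only; it assumes rctime is returned as a seconds.nanoseconds value, and the mount point is a placeholder):
now=$(date +%s)
find /mnt/cephfs -type d | while read -r d; do
    rctime=$(getfattr --only-values -n ceph.dir.rctime "$d" 2>/dev/null)
    # Compare the seconds part of rctime against the current time
    [ "${rctime%%.*}" -gt "$now" ] 2>/dev/null && echo "$rctime $d"
done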
47 PM Patrick Donnelly
wrote:
> On Mon, Jan 14, 2019 at 7:11 AM Daniel Gryniewicz wrote:
> >
> > Hi. Welcome to the community.
> >
> > On 01/14/2019 07:56 AM, David C wrote:
> > > Hi All
> > >
> > > I've been playing around with the nfs
It could also be the kernel client versions, what are you running? I
remember older kernel clients didn't always deal with recovery scenarios
very well.
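One way to check, assuming you can reach the admin socket on the active MDS (the MDS name below is a placeholder; exact fields vary by release, but kernel clients report a kernel_version in their client_metadata):
# Run on the MDS host; lists connected clients and their reported metadata
ceph daemon mds.mds1 session ls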
On Mon, Jan 21, 2019 at 9:18 AM Marc Roos wrote:
>
>
> I think his downtime is coming from the mds failover, that takes a while
> in my case to
On Fri, 18 Jan 2019, 14:46 Marc Roos
>
> [@test]# time cat 50b.img > /dev/null
>
> real    0m0.004s
> user    0m0.000s
> sys     0m0.002s
> [@test]# time cat 50b.img > /dev/null
>
> real    0m0.002s
> user    0m0.000s
> sys     0m0.002s
> [@test]# time cat 50b.img > /dev/null
>
> real    0m0.002s
On Fri, Jan 18, 2019 at 2:12 PM wrote:
> Hi.
>
> We have the intention of using CephFS for some of our shares, which we'd
> like to spool to tape as a part normal backup schedule. CephFS works nice
> for large files but for "small" .. < 0.1MB .. there seem to be a
> "overhead" on 20-40ms per fil
On Wed, 16 Jan 2019, 02:20 David Young wrote:
> Hi folks,
>
> My ceph cluster is used exclusively for cephfs, as follows:
>
> ---
> root@node1:~# grep ceph /etc/fstab
> node2:6789:/ /ceph ceph
> auto,_netdev,name=admin,secretfile=/root/ceph.admin.secret
> root@node1:~#
> ---
>
> "rados df" shows me the fol
Hi All
I've been playing around with the nfs-ganesha 2.7 exporting a cephfs
filesystem, it seems to be working pretty well so far. A few questions:
1) The docs say " For each NFS-Ganesha export, FSAL_CEPH uses a libcephfs
client,..." [1]. For arguments sake, if I have ten top level dirs in my
Cep
On Thu, Jan 10, 2019 at 4:07 PM Scottix wrote:
> I just had this question as well.
>
> I am interested in what you mean by fullest, is it percentage wise or raw
> space. If I have an uneven distribution and adjusted it, would it make more
> space available potentially.
>
Yes - I'd recommend usin
On Sat, 5 Jan 2019, 13:38 Marc Roos
> I have straw2, balancer=on, crush-compat and it gives worst spread over
> my ssd drives (4 only) being used by only 2 pools. One of these pools
> has pg 8. Should I increase this to 16 to create a better result, or
> will it never be any better.
>
> For now I
Hi All
Luminous 12.2.12
Single MDS
Replicated pools
A 'df' on a CephFS kernel client used to show me the usable space (i.e the
raw space with the replication overhead applied). This was when I just had
a single cephfs data pool.
After adding a second pool to the mds and using file layouts to map
ap
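For anyone landing here later, mapping a directory onto the second pool is done with file layouts, roughly like this (pool and path names are examples):
# Pin new files under this directory to a specific data pool
setfattr -n ceph.dir.layout.pool -v cephfs_data_ssd /mnt/cephfs/fast
# Confirm the layout that will be inherited
getfattr -n ceph.dir.layout /mnt/cephfs/fast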
> and start again.
>
> Cheers and Happy new year 2019.
>
> Eric
>
>
>
> On Sun, Dec 30, 2018, 21:16 David C wrote:
>
>> Hi All
>>
>> I'm trying to set the existing pools in a Luminous cluster to use the hdd
>> device-class but without
Hi All
I'm trying to set the existing pools in a Luminous cluster to use the hdd
device-class but without moving data around. If I just create a new rule
using the hdd class and set my pools to use that new rule it will cause a
huge amount of data movement even though the pgs are all already on HD
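For the archives: a new-enough crushtool can rewrite the existing map so the device-class rules resolve to the same placement and avoid the rebalance. A sketch, assuming a crushtool build that has --reclassify (it only operates on the map file, so a newer binary can be used from any machine) and a single default root of HDDs; always check with --compare before injecting anything:
ceph osd getcrushmap -o original.map
crushtool -i original.map --reclassify --set-subtree-class default hdd --reclassify-root default hdd -o adjusted.map
# Should report that (almost) no mappings change
crushtool -i original.map --compare adjusted.map
ceph osd setcrushmap -i adjusted.map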
I'm in a similar situation, currently running filestore with spinners and
journals on NVME partitions which are about 1% of the size of the OSD. If I
migrate to bluestore, I'll still only have that 1% available. Per the docs,
if my block.db device fills up, the metadata is going to spill back onto
Yep, that cleared it. Sorry for the noise!
On Sun, Dec 16, 2018 at 12:16 AM David C wrote:
> Hi Paul
>
> Thanks for the response. Not yet, just being a bit cautious ;) I'll go
> ahead and do that.
>
> Thanks
> David
>
>
> On Sat, 15 Dec 2018, 23:39 Paul E
ontact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Sun, Dec 16, 2018 at 12:22 AM David C wrote:
> >
> > Hi All
> >
> > I have what feels like a bit of a rookie question
>
Hi All
I have what feels like a bit of a rookie question
I shutdown a Luminous 12.2.1 cluster with noout,nobackfill,norecover set
Before shutting down, all PGs were active+clean
I brought the cluster up, all daemons started and all but 2 PGs are
active+clean
I have 2 pgs showing: "active+recov
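For anyone searching later: given the flags set above, the likely fix, and what the "Yep, that cleared it" further up appears to refer to, is simply unsetting them so recovery and backfill can proceed:
ceph osd unset noout
ceph osd unset nobackfill
ceph osd unset norecover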
Hi Jeff
Many thanks for this! Looking forward to testing it out.
Could you elaborate a bit on why Nautilus is recommended for this set-up
please. Would attempting this with a Luminous cluster be a non-starter?
On Wed, 12 Dec 2018, 12:16 Jeff Layton wrote:
> (Sorry for the duplicate email to ganesha li
Is that one big xfs filesystem? Are you able to mount with krbd?
On Tue, 27 Nov 2018, 13:49 Vikas Rana wrote:
> Hi There,
>
> We are replicating a 100TB RBD image to DR site. Replication works fine.
>
> rbd --cluster cephdr mirror pool status nfs --verbose
>
> health: OK
>
> images: 1 total
>
> 1 repl
Same issue here, Gmail user, member of different lists but only get
disabled on ceph-users. Happens about once a month but had three in Sept.
On Sat, 6 Oct 2018, 18:28 Janne Johansson, wrote:
> Den lör 6 okt. 2018 kl 15:06 skrev Elias Abacioglu
> :
> >
> > Hi,
> >
> > I'm bumping this old thread
h osd crush reweight osd.27 2
> sudo -u ceph ceph osd crush reweight osd.28 2
> sudo -u ceph ceph osd crush reweight osd.29 2
>
> Etc etc
>
>
> -Original Message-
> From: David C [mailto:dcsysengin...@gmail.com]
> Sent: maandag 3 september 2018 14:34
> To: ce
Hi all
Trying to add a new host to a Luminous cluster, I'm doing one OSD at a
time. I've only added one so far but it's getting too full.
The drive is the same size (4TB) as all others in the cluster, all OSDs
have crush weight of 3.63689. Average usage on the drives is 81.70%
With the new OSD I
> This moved to the PG map in luminous. I think it might have been there in
> Jewel as well.
>
> http://docs.ceph.com/docs/luminous/man/8/ceph/#pg
> ceph pg set_full_ratio
> ceph pg set_backfillfull_ratio
> ceph pg set_nearfull_ratio
>
>
> On Thu, Aug 30, 2018, 1:5
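A note for readers on Luminous or later: the ratios referenced above are stored in the OSD map there and are set with the following commands (the values are examples only):
ceph osd set-nearfull-ratio 0.85
ceph osd set-backfillfull-ratio 0.90
ceph osd set-full-ratio 0.95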
Does "ceph health detail" work?
Have you manually confirmed the OSDs on the nodes are working?
What was the replica size of the pools?
Are you seeing any progress with the recovery?
On Sun, Sep 2, 2018 at 9:42 AM Lee wrote:
> Running 0.94.5 as part of an OpenStack environment, our ceph setup is
Hi All
I feel like this is going to be a silly query with a hopefully simple
answer. I don't seem to have the osd_backfill_full_ratio config option on
my OSDs and can't inject it. This a Lumimous 12.2.1 cluster that was
upgraded from Jewel.
I added an OSD to the cluster and woke up the next day t
Something like smallfile perhaps? https://github.com/bengland2/smallfile
Or you just time creating/reading lots of files
With read benching you would want to ensure you've cleared your mds cache
or use a dataset larger than the cache.
I'd be interested in seeing your results; I have this on the to-do list.
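A minimal smallfile run might look like the following (a sketch; thread count, file count, size and path are arbitrary examples, and option names may differ between smallfile versions):
git clone https://github.com/bengland2/smallfile
cd smallfile
# Create, then read back, 20000 x 4 KiB files per thread across 8 threads
python smallfile_cli.py --operation create --threads 8 --files 20000 --file-size 4 --top /mnt/cephfs/smalltest
python smallfile_cli.py --operation read --threads 8 --files 20000 --file-size 4 --top /mnt/cephfs/smalltest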
On Sat, 30 Jun 2018, 21:48 Nick Fisk, wrote:
> Hi Paul,
>
>
>
> Thanks for your response, is there anything you can go into more detail on
> and share with the list? I’m sure it would be much appreciated by more than
> just myself.
>
>
>
> I was planning on Kernel CephFS and NFS server, both seem
I'd say it's safe in terms of data integrity. In terms of availability,
that's something you'll want to test thoroughly e.g what happens when the
cluster is in recovery, does the filesystem remain accessible?
I think you'll be disappointed in terms of performance, I found OCFS2 to be
slightly bett
yerm...@physik.uni-bonn.de> wrote:
> Hi David,
>
> did you already manage to check your librados2 version and manage to pin
> down the issue?
>
> Cheers,
> Oliver
>
> Am 11.05.2018 um 17:15 schrieb Oliver Freyermuth:
> > Hi David,
> >
> > Am 11.0
I've seen similar behavior with cephfs client around that age, try 4.14+
On 15 May 2018 1:57 p.m., "Josef Zelenka"
wrote:
Client's kernel is 4.4.0. Regarding the hung osd request, i'll have to
check, the issue is gone now, so i'm not sure if i'll find what you are
suggesting. It's rather odd, be
2.4-0.el7.x86_64
> nfs-ganesha-2.6.1-0.1.el7.x86_64
> nfs-ganesha-ceph-2.6.1-0.1.el7.x86_64
> Of course, we plan to upgrade to 12.2.5 soon-ish...
>
> Am 11.05.2018 um 00:05 schrieb David C:
> > Hi All
> >
> > I'm testing out the nfs-ganesha-2.6.1-0.1.el7.x86_6
Hi All
I'm testing out the nfs-ganesha-2.6.1-0.1.el7.x86_64.rpm package from
http://download.ceph.com/nfs-ganesha/rpm-V2.6-stable/luminous/x86_64/
It's failing to load /usr/lib64/ganesha/libfsalceph.so
With libcephfs-12.2.1 installed I get the following error in my ganesha log:
load_fsal :NFS S
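A couple of quick checks that usually narrow this kind of load failure down (the path matches the one above; package names are the RHEL/CentOS ones):
# Show which shared-library dependency of the Ceph FSAL is unresolved
ldd /usr/lib64/ganesha/libfsalceph.so | grep 'not found'
# Confirm which librados/libcephfs builds are actually installed
rpm -q librados2 libcephfs2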
How does your rados bench look?
Have you tried playing around with read ahead and striping?
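For reference, a simple rados bench baseline looks like this (pool name and runtime are placeholders; --no-cleanup keeps the objects so the read tests have data to work on):
rados bench -p testpool 60 write --no-cleanup
rados bench -p testpool 60 seq
rados bench -p testpool 60 rand
# Remove the benchmark objects afterwards
rados -p testpool cleanup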
On Tue, 24 Apr 2018 17:53 Jonathan Proulx, wrote:
> Hi All,
>
> I seem to be seeing consistently poor read performance on my cluster
> relative to both write performance and read performance of a single
>
Pretty sure you're getting stung by: http://tracker.ceph.com/issues/17563
Consider using an elrepo kernel, 4.14 works well for me.
On Thu, 29 Mar 2018, 09:46 Dan van der Ster, wrote:
> On Thu, Mar 29, 2018 at 10:31 AM, Robert Sander
> wrote:
> > On 29.03.2018 09:50, ouyangxu wrote:
> >
> >>
Thanks, John. I'm pretty sure the root of my slow OSD issues is filestore
subfolder splitting.
On Wed, Mar 14, 2018 at 2:17 PM, John Spray wrote:
> On Tue, Mar 13, 2018 at 7:17 PM, David C wrote:
> > Hi All
> >
> > I have a Samba server that is exporting direct
= -16
> filestore_split_multiple = 256
>
> [2] https://gist.github.com/drakonstein/cb76c7696e65522ab0e699b7ea1ab1c4
>
> [3] filestore_merge_threshold = -1
> filestore_split_multiple = 1
> On Mon, Feb 26, 2018 at 12:18 PM David C wrote:
>
>> Thanks, David. I think I've pr
Thanks for the detailed response, Greg. A few follow ups inline:
On 13 Mar 2018 20:52, "Gregory Farnum" wrote:
On Tue, Mar 13, 2018 at 12:17 PM, David C wrote:
> Hi All
>
> I have a Samba server that is exporting directories from a Cephfs Kernel
> mount. Performance ha
Hi All
I have a Samba server that is exporting directories from a Cephfs Kernel
mount. Performance has been pretty good for the last year but users have
recently been complaining of short "freezes", these seem to coincide with
MDS related slow requests in the monitor ceph.log such as:
2018-03-13
On 27 Feb 2018 06:46, "Jan Pekař - Imatic" wrote:
I think I hit the same issue.
I have corrupted data on cephfs and I don't remember the same issue before
Luminous (I did the same tests before).
It is on my test 1 node cluster with lower memory than recommended (so
server is swapping) but it sho
[2] https://gist.github.com/drakonstein/cb76c7696e65522ab0e699b7ea1ab1c4
[3] filestore_merge_threshold = -1
filestore_split_multiple = 1
On Mon, Feb 26, 2018 at 12:18 PM David C wrote:
> Thanks, David. I think I've probably used the wrong terminology here, I'm
> not splitting PGs to create
and ceph -s to make sure I'm not giving
> too much priority to the recovery operations so that client IO can still
> happen.
>
> On Mon, Feb 26, 2018 at 11:10 AM David C wrote:
>
>> Hi All
>>
>> I have a 12.2.1 cluster, all filestore OSDs, OSDs are spinners, jo
Hi All
I have a 12.2.1 cluster, all filestore OSDs, OSDs are spinners, journals on
NVME. Cluster primarily used for CephFS, ~20M objects.
I'm seeing some OSDs getting marked down, it appears to be related to PG
splitting, e.g:
2018-02-26 10:27:27.935489 7f140dbe2700 1 _created [C,D] has 5121 ob
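For anyone hitting the same thing: the splitting is governed by the filestore split/merge settings, and the directories can be pre-split offline so it doesn't happen under client load. A sketch only — OSD id, data path and pool name are placeholders, and the OSD must be stopped first:
# ceph.conf [osd] examples (a negative merge threshold disables merging)
#   filestore_merge_threshold = -1
#   filestore_split_multiple = 8
systemctl stop ceph-osd@12
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --op apply-layout-settings --pool cephfs_data
systemctl start ceph-osd@12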
Thanks for the tips, John. I'll increase the debug level as suggested.
On 25 Feb 2018 20:56, "John Spray" wrote:
> On Sat, Feb 24, 2018 at 10:13 AM, David C wrote:
> > Hi All
> >
> > I had an MDS go down on a 12.2.1 cluster, the standby took over but I
Hi All
I had an MDS go down on a 12.2.1 cluster, the standby took over but I don't
know what caused the issue. Scrubs are scheduled to start at 23:00 on this
cluster but this appears to have started a minute before.
Can anyone help me with diagnosing this please. Here's the relevant bit
from the
At a glance looks OK, I've not tested this in a while. Silly question but
does your Samba package definitely ship with the Ceph vfs? Caught me out in
the past.
Have you tried exporting a sub dir? Maybe 777 it although shouldn't make a
difference.
On 21 Dec 2017 13:16, "Felix Stolte" wrote:
> He
Do a search on "nfs pacemaker", should be loads of guides.
Another approach, assuming you're exporting cephfs, could be active/active
nfs servers using Ctdb from the samba project. You'll be restricted to
NFSv3 but it's much simpler than configuring pacemaker IMO.
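The CTDB side of that is mostly two small files, plus telling CTDB to manage NFS (the exact switch for that depends on the CTDB version). A sketch, assuming default packaging paths; all IPs are placeholders:
# /etc/ctdb/nodes (identical on every node) - one internal IP per NFS gateway
192.168.10.11
192.168.10.12
# /etc/ctdb/public_addresses - floating IPs CTDB moves between healthy nodes
10.0.0.50/24 eth0
10.0.0.51/24 eth0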
On 20 Dec 2017 5:45 p.m., "nigel
osd_backfill_full_ratio = '0.92' (unchangeable)
osd.25: osd_backfill_full_ratio = '0.92' (unchangeable)
osd.26: osd_backfill_full_ratio = '0.92' (unchangeable)
osd.27: osd_backfill_full_ratio = '0.92' (unchangeable)
osd.28: osd_backfill_full_ratio = '
What's your backfill full ratio? You may be able to get healthy by
increasing your backfill full ratio (in small increments). But your next
immediate task should be to add more OSDs or remove data.
On 19 Dec 2017 4:26 p.m., "Nghia Than" wrote:
Hi,
My CEPH is stuck at this for few days, we adde
Is this nfs-ganesha exporting Cephfs?
Are you using NFS for a Vmware Datastore?
What are you using for the NFS failover?
We need more info but this does sound like a vmware/nfs question rather
than specifically ceph/nfs-ganesha
On Thu, Dec 14, 2017 at 1:47 PM, nigel davies wrote:
> Hay all
>
>
Hi Roman
Whilst you can define multiple subnets in the public network directive, the
MONs still only bind to a single IP. Your clients need to be able to route
to that IP. From what you're saying, 172.x.x.x/24 is an isolated network,
so a client on the 10.x.x.x network is not going to be able to a
Not seen this myself but you should update to at least CentOS 7.3, ideally
7.4. I believe a lot of cephfs fixes went into those kernels. If you still
have the issue with the CentOS kernels, test with the latest upstream
kernel. And/or test with latest Fuse client.
On Tue, Dec 5, 2017 at 12:01 PM,
On Mon, Dec 4, 2017 at 4:39 PM, Drew Weaver wrote:
> Howdy,
>
>
>
> I replaced a disk today because it was marked as Predicted failure. These
> were the steps I took
>
>
>
> ceph osd out osd17
>
> ceph -w #waited for it to get done
>
> systemctl stop ceph-osd@osd17
>
> ceph osd purge osd17 --yes-
On Tue, Nov 28, 2017 at 1:50 PM, Jens-U. Mozdzen wrote:
> Hi David,
>
> Zitat von David C :
>
>> On 27 Nov 2017 1:06 p.m., "Jens-U. Mozdzen" wrote:
>>
>> Hi David,
>>
>> Zitat von David C :
>>
>> Hi Jens
>>
>>>
On 27 Nov 2017 1:06 p.m., "Jens-U. Mozdzen" wrote:
Hi David,
Zitat von David C :
Hi Jens
>
> We also see these messages quite frequently, mainly the "replicating
> dir...". Only seen "failed to open ino" a few times so didn't do any real
> inv
Hi Jens
We also see these messages quite frequently, mainly the "replicating
dir...". Only seen "failed to open ino" a few times so didn't do any real
investigation. Our set up is very similar to yours, 12.2.1, active/standby
MDS and exporting cephfs through KNFS (hoping to replace with Ganesha
so
Yep, that did it! Thanks, Zheng. I should read release notes more carefully!
On Fri, Nov 24, 2017 at 7:09 AM, Yan, Zheng wrote:
> On Thu, Nov 23, 2017 at 9:17 PM, David C wrote:
> > Hi All
> >
> > I upgraded my 12.2.0 cluster to 12.2.1 a month or two back. I've noti
Hi All
I upgraded my 12.2.0 cluster to 12.2.1 a month or two back. I've noticed
that the number of inodes held in cache is only approx 1/5th of my
inode_max. This is a surprise to me as with 12.2.0, and before that Jewel,
after starting an MDS server, the cache would typically fill to the max
with
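For anyone who hits the same surprise: the change being pointed at appears to be the Luminous switch from an inode-count cap (mds_cache_size) to a memory-size cap (mds_cache_memory_limit, default 1 GB), so the cache stops growing well before inode_max. A sketch for raising it — the value is just an example, in bytes:
# ceph.conf, [mds] section
#   mds_cache_memory_limit = 4294967296
# or inject at runtime:
ceph tell mds.* injectargs '--mds_cache_memory_limit=4294967296'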