[ceph-users] Inconsistent PGs that ceph pg repair does not fix

2015-08-03 Thread Andras Pataki
Summary: I am having problems with inconsistent PG's that the 'ceph pg repair' command does not fix. Below are the details. Any help would be appreciated. # Find the inconsistent PG's ~# ceph pg dump | grep inconsistent dumped all in format plain 2.439 42080 00 017279507143 31033103 active+clea
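A minimal sketch of the commands involved in locating and attempting to repair an inconsistent PG, using the pg id 2.439 from the dump above as the example:

    ~# ceph pg dump | grep inconsistent     # list PGs flagged inconsistent
    ~# ceph pg repair 2.439                 # ask the primary OSD to repair that PG
    ~# ceph -w                              # watch the cluster log for the scrub/repair result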

Re: [ceph-users] Inconsistent PGs that ceph pg repair does not fix

2015-08-03 Thread Andras Pataki
f ceph-osd -v). >-Sam > >On Mon, Aug 3, 2015 at 12:34 PM, Andras Pataki > wrote: >> Summary: I am having problems with inconsistent PG's that the 'ceph pg >> repair' command does not fix. Below are the details. Any help would be >> appreciated. >>

Re: [ceph-users] Inconsistent PGs that ceph pg repair does not fix

2015-09-08 Thread Andras Pataki
"ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)" Could you have another look? Thanks, Andras ________ From: Andras Pataki Sent: Monday, August 3, 2015 4:09 PM To: Samuel Just Cc: ceph-users@lists.ceph.com; ceph-de...@vger.kernel.org Subject: Re: [ceph-users]

Re: [ceph-users] Inconsistent PGs that ceph pg repair does not fix

2015-09-08 Thread Andras Pataki
Cool, thanks! Andras From: Sage Weil Sent: Tuesday, September 8, 2015 2:07 PM To: Andras Pataki Cc: Samuel Just; ceph-users@lists.ceph.com; ceph-de...@vger.kernel.org Subject: Re: [ceph-users] Inconsistent PGs that ceph pg repair does not fix On Tue, 8

[ceph-users] Uneven data distribution across OSDs

2015-09-21 Thread Andras Pataki
Hi ceph users, I am using CephFS for file storage and I have noticed that the data gets distributed very unevenly across OSDs. * I have about 90 OSDs across 8 hosts, and 4096 PGs for the cephfs_data pool with 2 replicas, which is in line with the total PG recommendation if "Total PGs = (OS
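The rule of thumb referred to above, worked through for this cluster (assuming the commonly cited target of roughly 100 PGs per OSD and 2 replicas):

    Total PGs ≈ (OSDs × 100) / replicas = (90 × 100) / 2 = 4500
    rounded to the nearest power of two → 4096 PGs for the cephfs_data pool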

Re: [ceph-users] Uneven data distribution across OSDs

2015-09-21 Thread Andras Pataki
ment groups. > >How many data pools are being used for storing objects? > >'ceph osd dump |grep pool' > >Also how are these 90 OSD's laid out across the 8 hosts and is there any >discrepancy between disk sizes and weight? > >'ceph osd tree' > >

[ceph-users] CephFS file to rados object mapping

2015-09-28 Thread Andras Pataki
Hi, Is there a way to find out which rados objects a file in cephfs is mapped to from the command line? Or vice versa, which file a particular rados object belongs to? Our ceph cluster has some inconsistencies/corruptions and I am trying to find out which files are impacted in cephfs. Thank

Re: [ceph-users] CephFS file to rados object mapping

2015-09-29 Thread Andras Pataki
Thanks, that worked. Is there a mapping in the other direction easily available, i.e. to find where all the 4MB pieces of a file are? On 9/28/15, 4:56 PM, "John Spray" wrote: >On Mon, Sep 28, 2015 at 9:46 PM, Andras Pataki > wrote: >> Hi, >> >> Is there a way
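A sketch of that forward mapping, assuming the default data pool name cephfs_data and a hypothetical file path; the rados object names are the file's inode number in hex followed by the 4MB chunk index:

    $ ino_hex=$(printf '%x' $(stat -c %i /mnt/cephfs/some/file))
    $ rados -p cephfs_data ls | grep "^${ino_hex}\."    # all 4MB pieces of the file (slow on large pools)
    $ ceph osd map cephfs_data ${ino_hex}.00000000      # PG and OSDs holding the first piece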

Re: [ceph-users] CephFS file to rados object mapping

2015-09-29 Thread Andras Pataki
d to run crush, by making use of the crushtool or >similar. >-Greg > >On Tue, Sep 29, 2015 at 6:29 AM, Andras Pataki > wrote: >> Thanks, that worked. Is there a mapping in the other direction easily >> available, I.e. To find where all the 4MB pieces of a file are? >&g

[ceph-users] PGs stuck in active+clean+replay

2015-10-22 Thread Andras Pataki
Hi ceph users, We’ve upgraded to 0.94.4 (all ceph daemons got restarted) – and are in the middle of doing some rebalancing due to crush changes (removing some disks). During the rebalance, I see that some placement groups get stuck in ‘active+clean+replay’ for a long time (essentially until I

Re: [ceph-users] PGs stuck in active+clean+replay

2015-10-27 Thread Andras Pataki
, "Gregory Farnum" wrote: >On Tue, Oct 27, 2015 at 11:03 AM, Gregory Farnum >wrote: >> On Thu, Oct 22, 2015 at 3:58 PM, Andras Pataki >> wrote: >>> Hi ceph users, >>> >>> We¹ve upgraded to 0.94.4 (all ceph daemons got restarted) ­ and are in &

Re: [ceph-users] PGs stuck in active+clean+replay

2015-10-27 Thread Andras Pataki
centos RPMs) before a planned larger rebalance. Andras On 10/27/15, 2:36 PM, "Gregory Farnum" wrote: >On Tue, Oct 27, 2015 at 11:22 AM, Andras Pataki > wrote: >> Hi Greg, >> >> No, unfortunately I haven't found any resolution to it. We are using >> cephfs, th

Re: [ceph-users] PGs stuck in active+clean+replay

2015-11-09 Thread Andras Pataki
Hi Greg, I’ve tested the patch below on top of the 0.94.5 hammer sources, and it works beautifully. No more active+clean+replay stuck PGs. Thanks! Andras On 10/27/15, 4:46 PM, "Andras Pataki" wrote: >Yes, this definitely sounds plausible (the peering/activating process does

Re: [ceph-users] Degraded objects while OSD is being added/filled

2017-07-06 Thread Andras Pataki
On 06/30/2017 04:38 PM, Gregory Farnum wrote: On Wed, Jun 21, 2017 at 6:57 AM Andras Pataki mailto:apat...@flatironinstitute.org>> wrote: Hi cephers, I noticed something I don't understand about ceph's behavior when adding an OSD. When I start with a clean cluster

Re: [ceph-users] Degraded objects while OSD is being added/filled

2017-07-20 Thread Andras Pataki
l pools are now with replicated size 3 and min size 2. Let me know if any other info would be helpful. Andras On 07/06/2017 02:30 PM, Andras Pataki wrote: Hi Greg, At the moment our cluster is all in balance. We have one failed drive that will be replaced in a few days (the OSD has been removed fr

[ceph-users] CephFS: concurrent access to the same file from multiple nodes

2017-07-20 Thread Andras Pataki
We are having some difficulties with cephfs access to the same file from multiple nodes concurrently. After debugging some large-ish applications with noticeable performance problems using CephFS (with the fuse client), I have a small test program to reproduce the problem. The core of the pro

Re: [ceph-users] CephFS: concurrent access to the same file from multiple nodes

2017-08-01 Thread Andras Pataki
the issue down further. Andras On 07/21/2017 05:41 AM, John Spray wrote: On Thu, Jul 20, 2017 at 9:19 PM, Andras Pataki wrote: We are having some difficulties with cephfs access to the same file from multiple nodes concurrently. After debugging some large-ish applications with noticeable p

Re: [ceph-users] CephFS: concurrent access to the same file from multiple nodes

2017-08-07 Thread Andras Pataki
I've filed a tracker bug for this: http://tracker.ceph.com/issues/20938 Andras On 08/01/2017 10:26 AM, Andras Pataki wrote: Hi John, Sorry for the delay, it took a bit of work to set up a luminous test environment. I'm sorry to have to report that the 12.1.1 RC version also su

[ceph-users] Luminous OSD startup errors

2017-08-15 Thread Andras Pataki
After upgrading to the latest Luminous RC (12.1.3), all our OSD's are crashing with the following assert: 0> 2017-08-15 08:28:49.479238 7f9b7615cd00 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/rel

Re: [ceph-users] Luminous OSD startup errors

2017-08-15 Thread Andras Pataki
development packages that should resolve the issue in the meantime. [1] http://tracker.ceph.com/issues/20985 On Tue, Aug 15, 2017 at 9:08 AM, Andras Pataki wrote: After upgrading to the latest Luminous RC (12.1.3), all our OSD's are crashing with the following assert: 0> 2017-08-

Re: [ceph-users] Bluestore disk colocation using NVRAM, SSD and SATA

2017-09-20 Thread Andras Pataki
Is there any guidance on the sizes for the WAL and DB devices when they are separated to an SSD/NVMe?  I understand that probably there isn't a one size fits all number, but perhaps something as a function of cluster/usage parameters like OSD size and usage pattern (amount of writes, number/siz
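For reference, a sketch of how the DB and WAL are split onto a faster device with ceph-volume; the device names are placeholders, and the sizing question above still depends on the workload:

    # ceph-volume lvm prepare --bluestore --data /dev/sdb \
          --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2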

[ceph-users] CephFS: clients hanging on write with ceph-fuse

2017-11-02 Thread Andras Pataki
We've been running into a strange problem with Ceph using ceph-fuse and the filesystem. All the back end nodes are on 10.2.10, the fuse clients are on 10.2.7. After some hours of runs, some processes get stuck waiting for fuse like: [root@worker1144 ~]# cat /proc/58193/stack [] wait_answer_int

Re: [ceph-users] CephFS: clients hanging on write with ceph-fuse

2017-11-02 Thread Andras Pataki
-fuse? This does sound vaguely familiar and is an issue I'd generally expect to have the fix backported for, once it was identified. On Thu, Nov 2, 2017 at 11:40 AM Andras Pataki mailto:apat...@flatironinstitute.org>> wrote: We've been running into a strange problem wit

Re: [ceph-users] CephFS: clients hanging on write with ceph-fuse

2017-11-03 Thread Andras Pataki
r ought to work fine. On Thu, Nov 2, 2017 at 4:58 PM Andras Pataki mailto:apat...@flatironinstitute.org>> wrote: I'm planning to test the newer ceph-fuse tomorrow.  Would it be better to stay with the Jewel 10.2.10 client, or would the 12.2.1 Luminous client be better (eve

[ceph-users] ceph-fuse memory usage

2017-11-22 Thread Andras Pataki
Hello ceph users, I've seen threads about Luminous OSDs using more memory than they should due to some memory accounting bugs.  Does this issue apply to ceph-fuse also? After upgrading to the latest ceph-fuse luminous client (12.2.1), we see some ceph-fuse processes using excessive memory. 

[ceph-users] luminous ceph-fuse crashes with "failed to remount for kernel dentry trimming"

2017-11-27 Thread Andras Pataki
Dear ceph users, After upgrading to the Luminous 12.2.1 ceph-fuse client, we've seen clients on various nodes randomly crash at the assert     FAILED assert(0 == "failed to remount for kernel dentry trimming") with the stack:  ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e)

Re: [ceph-users] luminous ceph-fuse crashes with "failed to remount for kernel dentry trimming"

2017-11-27 Thread Andras Pataki
rash to be definite about it. Andras On 11/27/2017 06:06 PM, Patrick Donnelly wrote: Hello Andras, On Mon, Nov 27, 2017 at 2:31 PM, Andras Pataki wrote: After upgrading to the Luminous 12.2.1 ceph-fuse client, we've seen clients on various nodes randomly crash at the assert

[ceph-users] After Luminous upgrade: ceph-fuse clients failing to respond to cache pressure

2018-01-16 Thread Andras Pataki
Dear Cephers, We've upgraded the back end of our cluster from Jewel (10.2.10) to Luminous (12.2.2).  The upgrade went smoothly for the most part, except we seem to be hitting an issue with cephfs.  After about a day or two of use, the MDS start complaining about clients failing to respond to c

Re: [ceph-users] After Luminous upgrade: ceph-fuse clients failing to respond to cache pressure

2018-01-17 Thread Andras Pataki
18 09:50 PM, Andras Pataki wrote: Dear Cephers, *snipsnap* We are running with a larger MDS cache than usual, we have mds_cache_size set to 4 million.  All other MDS configs are the defaults. AFAIK the MDS cache management in luminous has changed, focusing on memory size instead of numb

Re: [ceph-users] After Luminous upgrade: ceph-fuse clients failing to respond to cache pressure

2018-01-17 Thread Andras Pataki
06:09 AM, John Spray wrote: On Tue, Jan 16, 2018 at 8:50 PM, Andras Pataki wrote: Dear Cephers, We've upgraded the back end of our cluster from Jewel (10.2.10) to Luminous (12.2.2). The upgrade went smoothly for the most part, except we seem to be hitting an issue with cephfs. After ab

Re: [ceph-users] After Luminous upgrade: ceph-fuse clients failing to respond to cache pressure

2018-01-18 Thread Andras Pataki
PM, John Spray wrote: On Wed, Jan 17, 2018 at 3:36 PM, Andras Pataki wrote: Hi John, All our hosts are CentOS 7 hosts, the majority are 7.4 with kernel 3.10.0-693.5.2.el7.x86_64, with fuse 2.9.2-8.el7. We have some hosts that have slight variations in kernel versions, the oldest one are a handful

Re: [ceph-users] After Luminous upgrade: ceph-fuse clients failing to respond to cache pressure

2018-01-22 Thread Andras Pataki
ache memory limit' of 16GB and bounced them, we have had no performance or cache pressure issues, and as expected they hover around 22-23GB of RSS. Thanks everyone for the help, Andras On 01/18/2018 12:34 PM, Patrick Donnelly wrote: Hi Andras, On Thu, Jan 18, 2018 at 3:38 AM, Andras Pataki wr

Re: [ceph-users] Ceph - incorrect output of ceph osd tree

2018-01-31 Thread Andras Pataki
There is a config option "mon osd min up ratio" (defaults to 0.3) - and if too many OSDs are down, the monitors will not mark further OSDs down.  Perhaps that's the culprit here? Andras On 01/31/2018 02:21 PM, Marc Roos wrote: Maybe the process is still responding on an active session? If
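A sketch of checking and (hypothetically) adjusting that option on a running monitor, assuming mon.a as the daemon name; injectargs changes are runtime only and not persisted:

    # ceph daemon mon.a config get mon_osd_min_up_ratio
    # ceph tell mon.* injectargs '--mon_osd_min_up_ratio 0.2'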

[ceph-users] Ceph pg active+clean+inconsistent

2016-12-15 Thread Andras Pataki
Hi everyone, Yesterday scrubbing turned up an inconsistency in one of our placement groups. We are running ceph 10.2.3, using CephFS and RBD for some VM images. [root@hyperv017 ~]# ceph -s cluster d7b33135-0940-4e48-8aa6-1d2026597c2f health HEALTH_ERR 1 pgs inconsistent

Re: [ceph-users] Ceph pg active+clean+inconsistent

2016-12-20 Thread Andras Pataki
s not match object info size (3014656) adjusted for ondisk to (3014656) 2016-12-20 16:27:35.885496 7f3e17cac700 -1 log_channel(cluster) log [ERR] : 6.92c repair 1 errors, 0 fixed Any help/hints would be appreciated. Thanks, Andras On 12/15/2016 10:13 AM, Andras Pataki wrote: Hi eve

Re: [ceph-users] Ceph pg active+clean+inconsistent

2016-12-21 Thread Andras Pataki
ou could have a look on this object on each related osd for the pg, compare them and delete the Different object. I assume you have size = 3. Then again pg repair. But be carefull iirc the replica will be recovered from the primary pg. Hth Am 20. Dezember 2016 22:39:44 MEZ, schrieb Andra

Re: [ceph-users] Ceph pg active+clean+inconsistent

2017-01-03 Thread Andras Pataki
"omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": true, "stat_sum": { "num_bytes"

Re: [ceph-users] Ceph pg active+clean+inconsistent

2017-01-03 Thread Andras Pataki
out in https://www.spinics.net/lists/ceph-devel/msg16519.html to set the size value to zero. Your target value is 1c. $ printf '%x\n' 1835008 1c Make sure you check it is right before injecting it back in with "attr -s" What version is this? Did you look for a simila

Re: [ceph-users] Ceph pg active+clean+inconsistent

2017-01-04 Thread Andras Pataki
# ceph pg debug unfound_objects_exist FALSE Andras On 01/03/2017 11:38 PM, Shinobu Kinjo wrote: Would you run: # ceph pg debug unfound_objects_exist On Wed, Jan 4, 2017 at 5:31 AM, Andras Pataki wrote: Here is the output of ceph pg query for one of hte active+clean+inconsistent PGs

Re: [ceph-users] Ceph pg active+clean+inconsistent

2017-01-09 Thread Andras Pataki
ects). But it'd be perhaps good to do some searching on how/why this problem came about before doing this. andras On 01/07/2017 06:48 PM, Shinobu Kinjo wrote: Sorry for the late. Are you still facing inconsistent pg status? On Wed, Jan 4, 2017 at 11:39 PM, Andras Pataki

[ceph-users] Testing a hypothetical crush map

2018-08-06 Thread Andras Pataki
Hi cephers, Is there a way to see what a crush map change does to the PG mappings (i.e. what placement groups end up on what OSDs) without actually setting the crush map (and have the map take effect)?  I'm looking for some way I could test hypothetical crush map changes without any effect on
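One way to answer this offline is to pull the current maps and feed an edited crush map to osdmaptool; a sketch, assuming the pool of interest has id 1:

    # ceph osd getcrushmap -o crush.bin
    # crushtool -d crush.bin -o crush.txt        # decompile; edit crush.txt as desired
    # crushtool -c crush.txt -o crush.new
    # ceph osd getmap -o osdmap.bin
    # osdmaptool osdmap.bin --import-crush crush.new     # splice the hypothetical crush map into a copy of the osdmap
    # osdmaptool osdmap.bin --test-map-pgs-dump --pool 1  # dump the resulting PG -> OSD mappings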

Re: [ceph-users] Stability Issue with 52 OSD hosts

2018-08-23 Thread Andras Pataki
We are also running some fairly dense nodes with CentOS 7.4 and ran into similar problems.  The nodes ran filestore OSDs (Jewel, then Luminous).  Sometimes a node would be so unresponsive that one couldn't even ssh to it (even though the root disk was a physically separate drive on a separate c

Re: [ceph-users] Stability Issue with 52 OSD hosts

2018-08-24 Thread Andras Pataki
: Tyler Bishop [mailto:tyler.bis...@beyondhosting.net] Sent: vrijdag 24 augustus 2018 3:11 To: Andras Pataki Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Stability Issue with 52 OSD hosts Thanks for the info. I was investigating bluestore as well. My host dont go unresponsive but I do see

[ceph-users] ceph-fuse using excessive memory

2018-09-05 Thread Andras Pataki
Hi cephers, Every so often we have a ceph-fuse process that grows to rather large size (up to eating up the whole memory of the machine).  Here is an example of a 200GB RSS size ceph-fuse instance: # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools {     "bloom_filter": {   

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-05 Thread Andras Pataki
,     "omap_wr": 0,     "omap_rd": 0,     "omap_del": 0     },     "throttle-msgr_dispatch_throttler-client": {     "val": 0,     "max": 104857600,     "get_started": 0,     "get": 673934,     "get

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-06 Thread Andras Pataki
increased memory related settings in /etc/ceph.conf, but based on my understanding of the parameters, they shouldn't amount to such high memory usage. Andras On 09/05/2018 10:15 AM, Andras Pataki wrote: Below are the performance counters.  Some scientific workflows trigger this - some parts of

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-19 Thread Andras Pataki
{     "items": 16782690,     "bytes": 68783148465     } } Andras On 9/6/18 11:58 PM, Yan, Zheng wrote: Could you please try make ceph-fuse use simple messenger (add "ms type = simple" to client section of ceph.conf). Regards Yan, Zheng On Wed, Sep 5, 2018

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-20 Thread Andras Pataki
06 PM, Andras Pataki wrote: Hi Zheng, It looks like the memory growth happens even with the simple messenger: [root@worker1032 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok config get ms_type {     "ms_type": "simple" } [root@worker1032 ~]# ps -auxwww | grep ceph-fuse r

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-24 Thread Andras Pataki
The whole cluster, including ceph-fuse is version 12.2.7. Andras On 9/24/18 6:27 AM, Yan, Zheng wrote: On Fri, Sep 21, 2018 at 5:40 AM Andras Pataki wrote: I've done some more experiments playing with client config parameters, and it seems like the the client_oc_size parameter is

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-25 Thread Andras Pataki
running.  I can reproduce this issue at will in about 6 to 8 hours of running one of our scientific jobs - and I can also run a more instrumented/patched/etc. code to try. Andras On 9/24/18 10:06 PM, Yan, Zheng wrote: On Tue, Sep 25, 2018 at 2:23 AM Andras Pataki wrote: The whole cluster

[ceph-users] cephfs kernel client stability

2018-10-01 Thread Andras Pataki
We have so far been using ceph-fuse for mounting cephfs, but the small file performance of ceph-fuse is often problematic.  We've been testing the kernel client, and have seen some pretty bad crashes/hangs. What is the policy on fixes to the kernel client?  Is only the latest stable kernel upd

Re: [ceph-users] cephfs kernel client stability

2018-10-01 Thread Andras Pataki
x -Original Message- From: Andras Pataki [mailto:apat...@flatironinstitute.org] Sent: maandag 1 oktober 2018 20:10 To: ceph-users Subject: [ceph-users] cephfs kernel client stability We have so far been using ceph-fuse for mounting cephfs, but the small file performance of ceph-fuse

Re: [ceph-users] cephfs kernel client stability

2018-10-01 Thread Andras Pataki
logs there are no entries around the problematic times either.  And after all this, the mount is unusable: [root@worker1004 ~]# ls -l /mnt/cephtest ls: cannot access /mnt/cephtest: Permission denied Andras On 10/1/18 3:02 PM, Andras Pataki wrote: These hangs happen during random I/O fio benchma

[ceph-users] ceph-volume: recreate OSD with same ID after drive replacement

2018-10-03 Thread Andras Pataki
After replacing failing drive I'd like to recreate the OSD with the same osd-id using ceph-volume (now that we've moved to ceph-volume from ceph-disk).  However, I seem to not be successful.  The command I'm using: ceph-volume lvm prepare --bluestore --osd-id 747 --data H901D44/H901D44 --block
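For comparison, a hedged sketch of the workflow that id reuse is meant to follow (the destroy step keeps the osd id and auth entry in place; the fsid needed for activate is elided here):

    # ceph osd destroy 747 --yes-i-really-mean-it
    # ceph-volume lvm prepare --bluestore --osd-id 747 --data H901D44/H901D44
    # ceph-volume lvm activate --all        # or: ceph-volume lvm activate 747 <osd fsid>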

Re: [ceph-users] ceph-volume: recreate OSD with same ID after drive replacement

2018-10-03 Thread Andras Pataki
oval procedure, osd crush remove, auth del, osd rm)? Thanks, Andras On 10/3/18 10:36 AM, Alfredo Deza wrote: On Wed, Oct 3, 2018 at 9:57 AM Andras Pataki wrote: After replacing failing drive I'd like to recreate the OSD with the same osd-id using ceph-volume (now that we've move

Re: [ceph-users] ceph-volume: recreate OSD with same ID after drive replacement

2018-10-03 Thread Andras Pataki
en ID from scratch would be nice (given that the underlying raw ceph commands already support it). Andras On 10/3/18 11:41 AM, Alfredo Deza wrote: On Wed, Oct 3, 2018 at 11:23 AM Andras Pataki wrote: Thanks - I didn't realize that was such a recent fix. I've now tried 12.2.8, and perh

[ceph-users] ceph-volume and systemd troubles

2018-05-16 Thread Andras Pataki
Dear ceph users, I've been experimenting setting up a new node with ceph-volume and bluestore.  Most of the setup works right, but I'm running into a strange interaction between ceph-volume and systemd when starting OSDs. After preparing/activating the OSD, a systemd unit instance is created

Re: [ceph-users] ceph-volume and systemd troubles

2018-05-16 Thread Andras Pataki
Done: tracker #24152 Thanks, Andras On 05/16/2018 04:58 PM, Alfredo Deza wrote: On Wed, May 16, 2018 at 4:50 PM, Andras Pataki wrote: Dear ceph users, I've been experimenting setting up a new node with ceph-volume and bluestore. Most of the setup works right, but I'm runn

[ceph-users] Help/advice with crush rules

2018-05-17 Thread Andras Pataki
I've been trying to wrap my head around crush rules, and I need some help/advice.  I'm thinking of using erasure coding instead of replication, and trying to understand the possibilities for planning for failure cases. For a simplified example, consider a 2 level topology, OSDs live on hosts,
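As an illustration of the kind of rule involved, a hypothetical crush rule that places an 8-chunk erasure-coded PG as 2 chunks on each of 4 hosts; the name and counts are examples only:

    rule ec_4host_2osd {
        id 2
        type erasure
        min_size 3
        max_size 8
        step set_chooseleaf_tries 5
        step take default
        step choose indep 4 type host
        step chooseleaf indep 2 type osd
        step emit
    }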

Re: [ceph-users] Help/advice with crush rules

2018-05-21 Thread Andras Pataki
can't put any erasure code with more than 9 chunks. Andras On 05/18/2018 06:30 PM, Gregory Farnum wrote: On Thu, May 17, 2018 at 9:05 AM Andras Pataki mailto:apat...@flatironinstitute.org>> wrote: I've been trying to wrap my head around crush rules, and I need some

[ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Andras Pataki
We're using CephFS with Luminous 12.2.5 and the fuse client (on CentOS 7.4, kernel 3.10.0-693.5.2.el7.x86_64).  Performance has been very good generally, but we're currently running into some strange performance issues with one of our applications.  The client in this case is on a higher latenc

Re: [ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Andras Pataki
Hi Greg, The docs say that client_cache_size is the number of inodes that are cached, not bytes of data.  Is that incorrect? Andras On 06/06/2018 11:25 AM, Gregory Farnum wrote: On Wed, Jun 6, 2018 at 5:52 AM, Andras Pataki wrote: We're using CephFS with Luminous 12.2.5 and the

Re: [ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Andras Pataki
't look at the caches any longer of the client. Andras On 06/06/2018 12:22 PM, Andras Pataki wrote: Hi Greg, The docs say that client_cache_size is the number of inodes that are cached, not bytes of data.  Is that incorrect? Andras On 06/06/2018 11:25 AM, Gregory Farnum wrote: On Wed

[ceph-users] CephFS fuse client users stuck

2017-03-13 Thread Andras Pataki
Dear Cephers, We're using the ceph file system with the fuse client, and lately some of our processes are getting stuck seemingly waiting for fuse operations. At the same time, the cluster is healthy, no slow requests, all OSDs up and running, and both the MDS and the fuse client think that

Re: [ceph-users] CephFS fuse client users stuck

2017-03-14 Thread Andras Pataki
've also tried kick_stale_sessions on the fuse client, which didn't help (I guess since it doesn't think the session is stale). Let me know if there is anything else I can do to help. Andras On 03/13/2017 06:08 PM, John Spray wrote: On Mon, Mar 13, 2017 at 8:15 PM, Andras Pataki

Re: [ceph-users] CephFS fuse client users stuck

2017-03-14 Thread Andras Pataki
On 03/14/2017 12:55 PM, John Spray wrote: On Tue, Mar 14, 2017 at 2:10 PM, Andras Pataki wrote: Hi John, I've checked the MDS session list, and the fuse client does appear on that with 'state' as 'open'. So both the fuse client and the MDS agree on an open connection. A

[ceph-users] CephFS: ceph-fuse segfaults

2017-03-29 Thread Andras Pataki
Below is a crash we had on a few machines with the ceph-fuse client on the latest Jewel release 10.2.6. A total of 5 ceph-fuse processes crashed more or less the same way at different times. The full logs are at http://voms.simonsfoundation.org:50013/9SXnEpflYPmE6UhM9EgOR3us341eqym/ceph-20170

Re: [ceph-users] CephFS fuse client users stuck

2017-03-31 Thread Andras Pataki
0331/ Some help/advice with this would very much be appreciated. Thanks in advance! Andras On 03/14/2017 12:55 PM, John Spray wrote: On Tue, Mar 14, 2017 at 2:10 PM, Andras Pataki wrote: Hi John, I've checked the MDS session list, and the fuse client does appear on that with 'stat

Re: [ceph-users] CephFS fuse client users stuck

2017-03-31 Thread Andras Pataki
27 PM, Andras Pataki wrote: Hi John, It took a while but I believe now I have a reproducible test case for the capabilities being lost issue in CephFS I wrote about a couple of weeks ago. The quick summary of problem is that often processes hang using CephFS either for a while or sometimes indefin

Re: [ceph-users] CephFS fuse client users stuck

2017-04-06 Thread Andras Pataki
ation is trying to achieve, that'd be super helpful. Andras On 03/31/2017 02:07 PM, Andras Pataki wrote: Several clients on one node also works well for me (I guess the fuse client arbitrates then and the MDS perhaps doesn't need to do so much). So the clients need to be on diff

Re: [ceph-users] fsping, why you no work no mo?

2017-04-13 Thread Andras Pataki
Hi Dan, I don't have a solution to the problem, I can only second that we've also been seeing strange problems when more than one node accesses the same file in ceph and at least one of them opens it for writing. I've tried verbose logging on the client (fuse), and it seems that the fuse cli

[ceph-users] Degraded objects while OSD is being added/filled

2017-06-21 Thread Andras Pataki
Hi cephers, I noticed something I don't understand about ceph's behavior when adding an OSD. When I start with a clean cluster (all PG's active+clean) and add an OSD (via ceph-deploy for example), the crush map gets updated and PGs get reassigned to different OSDs, and the new OSD starts gett

Re: [ceph-users] move directories in cephfs

2018-12-10 Thread Andras Pataki
Moving data between pools when a file is moved to a different directory is most likely problematic - for example an inode can be hard linked to two different directories that are in two different pools - then what happens to the file?  Unix/posix semantics don't really specify a parent director

[ceph-users] Ceph monitors overloaded on large cluster restart

2018-12-19 Thread Andras Pataki
Dear ceph users, We have a large-ish ceph cluster with about 3500 osds.  We run 3 mons on dedicated hosts, and the mons typically use a few percent of a core, and generate about 50Mbits/sec network traffic.  They are connected at 20Gbits/sec (bonded dual 10Gbit) and are running on 2x14 core se

Re: [ceph-users] Ceph monitors overloaded on large cluster restart

2018-12-19 Thread Andras Pataki
Forgot to mention: all nodes are on Luminous 12.2.8 currently on CentOS 7.5. On 12/19/18 5:34 PM, Andras Pataki wrote: Dear ceph users, We have a large-ish ceph cluster with about 3500 osds.  We run 3 mons on dedicated hosts, and the mons typically use a few percent of a core, and generate

Re: [ceph-users] Ceph monitors overloaded on large cluster restart

2018-12-19 Thread Andras Pataki
luggish on a 3400 osd cluster, perhaps taking a few 10s of seconds). the pgs should be active+clean at this point. 5. unset nodown, noin, noout. which should change nothing provided all went well. Hope that helps for next time! Dan On Wed, Dec 19, 2018 at 11:39 PM Andras Pataki mailto:ap
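The flag sequence described in that reply, written out as plain commands (a sketch; the restart step in the middle is whatever orchestration the site uses):

    # ceph osd set noout
    # ceph osd set nodown
    # ceph osd set noin
    ... restart all OSDs ...
    # ceph osd unset nodown
    # ceph osd unset noin
    # ceph osd unset noout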

[ceph-users] cephfs kernel client instability

2018-12-26 Thread Andras Pataki
We've been using ceph-fuse with a pretty good stability record (against the Luminous 12.2.8 back end).  Unfortunately ceph-fuse has extremely poor small file performance (understandably), so we've been testing the kernel client.  The latest RedHat kernel 3.10.0-957.1.3.el7.x86_64 seems to work

Re: [ceph-users] cephfs kernel client instability

2019-01-03 Thread Andras Pataki
ne of these mon communication sessions only lasts half a second.  Then it reconnects to another mon, and gets the same result, etc.  Any way around this? Andras On 12/26/18 7:55 PM, Andras Pataki wrote: We've been using ceph-fuse with a pretty good stability record (against the Luminous 1

Re: [ceph-users] cephfs kernel client instability

2019-01-15 Thread Andras Pataki
maps need to be reencoded (to jewel), or how to improve this behavior would be much appreciated.  We would really be interested in using the kernel client instead of fuse, but this seems to be a stumbling block. Thanks, Andras On 1/3/19 6:49 AM, Andras Pataki wrote: I wonder if anyone could

Re: [ceph-users] cephfs kernel client instability

2019-01-16 Thread Andras Pataki
Hi Ilya/Kjetil, I've done some debugging and tcpdump-ing to see what the interaction between the kernel client and the mon looks like.  Indeed - CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon messages for our cluster (with osd_mon_messages_max at 100).  We have about 3500 os

Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Andras Pataki
at 7:12 PM Andras Pataki wrote: Hi Ilya/Kjetil, I've done some debugging and tcpdump-ing to see what the interaction between the kernel client and the mon looks like. Indeed - CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon messages for our cluster (with osd_mon_messages_max at

[ceph-users] Mimic and cephfs

2019-02-25 Thread Andras Pataki
Hi ceph users, As I understand, cephfs in Mimic had significant issues up to and including version 13.2.2.  With some critical patches in Mimic 13.2.4, is cephfs now production quality in Mimic?  Are there folks out there using it in a production setting?  If so, could you share your experien

[ceph-users] Poor cephfs (ceph_fuse) write performance in Mimic

2019-04-04 Thread Andras Pataki
Hi cephers, I'm working through our testing cycle to upgrade our main ceph cluster from Luminous to Mimic, and I ran into a problem with ceph_fuse.  With Luminous, a single client can pretty much max out a 10Gbps network connection writing sequentially on our cluster with Luminous ceph_fuse. 

[ceph-users] ceph-fuse segfaults in 14.2.2

2019-09-04 Thread Andras Pataki
Dear ceph users, After upgrading our ceph-fuse clients to 14.2.2, we've been seeing sporadic segfaults with not super revealing stack traces: in thread 7fff5a7fc700 thread_name:ceph-fuse  ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)  1: (()+0x

[ceph-users] Nautilus - inconsistent PGs - stat mismatch

2019-10-21 Thread Andras Pataki
We have a new ceph Nautilus setup (Nautilus from scratch - not upgraded): # ceph versions {     "mon": {     "ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)": 3     },     "mgr": {     "ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nau

Re: [ceph-users] clust recovery stuck

2019-10-22 Thread Andras Pataki
Hi Philipp, Given 256 PG's triple replicated onto 4 OSD's you might be encountering the "PG overdose protection" of OSDs.  Take a look at 'ceph osd df' and see the number of PG's that are mapped to each OSD (last column or near the last).  The default limit is 200, so if any OSD exceeds that,
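A sketch of checking the per-OSD PG count and, if appropriate, relaxing the limit on a Nautilus-era cluster; the value 400 is just an example:

    # ceph osd df                                    # PGS column = PGs mapped to each OSD
    # ceph config get mon mon_max_pg_per_osd
    # ceph config set global mon_max_pg_per_osd 400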

[ceph-users] Mimic - cephfs scrub errors

2019-11-15 Thread Andras Pataki
Dear cephers, We've had a few (dozen or so) rather odd scrub errors in our Mimic (13.2.6) cephfs: 2019-11-15 07:52:52.614 7fffcc41f700  0 log_channel(cluster) log [DBG] : 2.b5b scrub starts 2019-11-15 07:52:55.190 7fffcc41f700 -1 log_channel(cluster) log [ERR] : 2.b5b shard 599 soid 2:dad015