[ceph-users] Re: Ceph RBD - High IOWait during the Writes

2020-11-12 Thread athreyavc
From different search results I read, disabling cephx can help. Also https://static.linaro.org/connect/san19/presentations/san19-120.pdf recommended some settings changes for the bluestore cache. [osd] bluestore cache autotune = 0 bluestore_cache_kv_ratio = 0.2 bluestore_cache_meta_ratio = 0.8 b
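
As a reference for readers, a sketch of how the settings quoted above would look in ceph.conf (only the values visible in the message are shown; the last option in the preview is truncated and omitted here):

    [osd]
    # disable automatic cache sizing so the manual ratios below take effect
    bluestore_cache_autotune = 0
    # fraction of the cache reserved for RocksDB key/value data
    bluestore_cache_kv_ratio = 0.2
    # fraction of the cache reserved for BlueStore onode metadata
    bluestore_cache_meta_ratio = 0.8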

[ceph-users] Re: Nautilus - osdmap not trimming

2020-11-12 Thread m . sliwinski
Hi Thanks for the reply. Yeah, I restarted all of the mon servers, in sequence, and yesterday just the leader alone, without any success. Reports: root@monb01:~# ceph report | grep committed report 4002437698 "monmap_first_committed": 1, "monmap_last_committed": 6, "osdmap_first_commit
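
For anyone following the thread, the trimming check above is just a grep over the JSON cluster report; a minimal sketch:

    # a large gap between first and last committed osdmap epochs means the maps are not trimming
    ceph report | grep -E '"(monmap|osdmap)_(first|last)_committed"'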

[ceph-users] Re: Ceph RBD - High IOWait during the Writes

2020-11-12 Thread Edward kalk
For certain CPU architectures, disable the Spectre and Meltdown mitigations. (Be certain the network to the physical nodes is secured from internet access; use apt, http(s) and curl proxy servers.) Try toggling the physical on-disk cache on or off (RAID controller command). I had the same issue; doing both o
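
For completeness, on Linux the mitigations mentioned above are usually disabled with the mitigations=off kernel parameter; a sketch for a Debian/Ubuntu-style GRUB setup, only advisable when the hosts are isolated as the poster stresses:

    # /etc/default/grub
    GRUB_CMDLINE_LINUX="... mitigations=off"
    # regenerate the boot config and reboot for it to take effect
    update-grub && reboot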

[ceph-users] Re: Unable to clarify error using vfs_ceph (Samba gateway for CephFS)

2020-11-12 Thread Frank Schilder
You might face the same issue I had. vfs_ceph wants a key for the root of the CephFS; it is currently not possible to restrict access to a sub-directory mount. For this reason, I decided to go for a re-export of a kernel client mount. I consider this a serious security issue in vfs_ceph
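
To illustrate the limitation described here, a sketch of the two kinds of client keys (assuming the filesystem is named cephfs; the client names match the ones used later in this thread):

    # path-restricted key: works for kernel/FUSE mounts, but not for vfs_ceph
    ceph fs authorize cephfs client.samba.upload /upload rw
    # root-scoped key: what vfs_ceph effectively requires today
    ceph fs authorize cephfs client.samba / rw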

[ceph-users] Re: Nautilus - osdmap not trimming

2020-11-12 Thread Dan van der Ster
This is weird -- afaict it should be trimming. Can you revert your custom paxos and osdmap options to their defaults, then restart your mon leader, then wait 5 minutes, then finally generate some new osdmap churn (e.g. ceph osd pool set xx min_size 2, redundantly). Then please again share the relev
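
A sketch of that sequence, with placeholder option, daemon and pool names:

    # revert any custom paxos/osdmap trim options back to their defaults
    ceph config rm mon <custom_paxos_or_osdmap_option>
    # restart the leader mon (systemd deployments), then wait ~5 minutes
    systemctl restart ceph-mon@<mon_id>
    # generate some osdmap churn; a redundant set (re-applying the current value) is enough to bump the epoch
    ceph osd pool set <pool> min_size 2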

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-12 Thread Janek Bevendorff
Here is a bug report concerning (probably) this exact issue: https://tracker.ceph.com/issues/47866 I left a comment describing the situation and my (limited) experiences with it. On 11/11/2020 10:04, Janek Bevendorff wrote: Yeah, that seems to be it. There are 239 objects prefixed .8naRUH

[ceph-users] Re: Ceph RBD - High IOWait during the Writes

2020-11-12 Thread athreyavc
Hi, Thanks for the email, but we are not using RAID at all; we are using LSI HBA 9400-8e HBAs. Each HDD is configured as an OSD. On Thu, Nov 12, 2020 at 12:19 PM Edward kalk wrote: > for certain CPU architecture, disable spectre and meltdown mitigations. > (be certain network to physical nodes

[ceph-users] Re: Nautilus - osdmap not trimming

2020-11-12 Thread m . sliwinski
Hi I removed the related options excluding "mon_debug_block_osdmap_trim false". Logs below. I'm not sure how to extract the required information, so I just used grep. If it's not enough then please let me know. I can also upload the entire log somewhere if required. root@monb01:~# grep trim ceph-mo

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-12 Thread huxia...@horebdata.cn
Which Ceph versions are affected by this RGW bug/issue? Luminous, Mimic, Octopus, or the latest? Any idea? samuel huxia...@horebdata.cn From: EDH - Manuel Rios Date: 2020-11-12 14:27 To: Janek Bevendorff; Rafael Lopez CC: Robin H. Johnson; ceph-users Subject: [ceph-users] Re: NoSuchKey on

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-12 Thread Janek Bevendorff
I have never seen this on Luminous. I recently upgraded to Octopus and the issue started occurring only a few weeks later. On 12/11/2020 16:37, huxia...@horebdata.cn wrote: Which Ceph versions are affected by this RGW bug/issue? Luminous, Mimic, Octopus, or the latest? Any idea? samuel

[ceph-users] Re: Unable to clarify error using vfs_ceph (Samba gateway for CephFS)

2020-11-12 Thread Matt Larson
Thank you, Frank. That was a good suggestion to make sure the mount wasn't the issue. I tried changing the `client.samba.upload` to have read access directly to '/' rather than '/upload', and also changed smb.conf to use 'path = /' directly. Still getting the same issue (log level 10 content belo

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-12 Thread EDH - Manuel Rios
This same error caused us to wipe a full cluster of 300TB... it will be related to some rados index/database bug, not to S3 itself. As Janek pointed out, it is a major issue, because the error happens silently and you can only detect it through S3, when you go to delete/purge an S3 bucket. Dropping NoSuchKey. Err
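
One way to see whether an index entry still has backing data in rados, along the lines described above, is to compare the index view with the data pool. Bucket, key and marker below are placeholders, and the pool name assumes the default default.rgw.buckets.data:

    # what the bucket index believes about the object
    radosgw-admin object stat --bucket=<bucket> --object=<key>
    # the bucket marker/id used as the prefix of the backing rados objects
    radosgw-admin bucket stats --bucket=<bucket>
    # check whether objects with that prefix still exist (slow on large pools)
    rados -p default.rgw.buckets.data ls | grep '<marker>'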

[ceph-users] Re: Rados Crashing

2020-11-12 Thread Brent Kennedy
I didn't know there was a replacement for the radosgw role! I saw in the ceph-ansible project a mention of a radosgw load balancer, but since I use haproxy, I didn't dig into that. Is that what you are referring to? Otherwise, I can't seem to find any mention of civetweb being replaced. For the issue be

[ceph-users] Re: Unable to clarify error using vfs_ceph (Samba gateway for CephFS)

2020-11-12 Thread Frank Schilder
You might need to give read permissions to the ceph config and key file for the user that runs the SAMBA service (samba?). Either add the SAMBA user to the group ceph, or change the group of the file. The statement "/" file not found could just be an obfuscating message on an actual security/pe
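
A sketch of the permission fix described here, assuming the Samba service runs as a hypothetical user samba and the keyring sits at the usual path:

    # let the samba user read the ceph config and keyring via the ceph group
    usermod -a -G ceph samba
    chgrp ceph /etc/ceph/ceph.client.samba.keyring
    chmod 640 /etc/ceph/ceph.client.samba.keyring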

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-12 Thread huxia...@horebdata.cn
Looks like this is a very dangerous bug for data safety. Hopefully the bug will be quickly identified and fixed. Best regards, Samuel huxia...@horebdata.cn From: Janek Bevendorff Date: 2020-11-12 18:17 To: huxia...@horebdata.cn; EDH - Manuel Rios; Rafael Lopez CC: Robin H. Johnson; ceph-users S

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Reduncancy, all PGs degraded, undersized, not scrubbed in time

2020-11-12 Thread Phil Merricks
Thanks for the reply Robert. Could you briefly explain the issue with the current setup and "what good looks like" here, or point me to some documentation that would help me figure that out myself? I'm guessing here it has something to do with the different sizes and types of disk, and possibly

[ceph-users] Re: Unable to clarify error using vfs_ceph (Samba gateway for CephFS)

2020-11-12 Thread Brad Hubbard
I don't know much about the vfs plugin (nor cephfs for that matter) but I would suggest enabling client debug logging on the machine so you can see what the libcephfs code is doing since that's likely where the ENOENT is coming from. https://docs.ceph.com/en/latest/rados/troubleshooting/log-and-de
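
A sketch of that client-side debug logging, added to ceph.conf on the Samba gateway (the levels and log path are illustrative):

    [client]
    # verbose libcephfs/client logging, to see where the ENOENT comes from
    debug client = 20
    debug ms = 1
    log file = /var/log/ceph/$name.$pid.log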

[ceph-users] Autoscale - enable or not on main pool?

2020-11-12 Thread Brent Kennedy
I recently set up a new Octopus cluster and was testing the autoscale feature. Used ceph-ansible, so it's enabled by default. Anyhow, I have three other clusters that are on Nautilus, so I wanted to see if it made sense to enable it there on the main pool. Here is a print out of the autoscale st
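
For anyone checking the same thing on Nautilus, the status report and the per-pool toggle look roughly like this (pool name is a placeholder):

    # show current vs. suggested pg_num for every pool
    ceph osd pool autoscale-status
    # warn first, then enable once the suggestions look sane
    ceph osd pool set <pool> pg_autoscale_mode warn
    ceph osd pool set <pool> pg_autoscale_mode on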

[ceph-users] Re: question about rgw delete speed

2020-11-12 Thread Brent Kennedy
Ceph is definitely a good choice for storing millions of files. It sounds like you plan to use this like S3, so my first question would be: are the deletes done for a specific reason? (e.g. the files are used for a process and then discarded) If it's an age thing, you can set the files to expir
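
If it is indeed age-based, RGW supports S3 bucket lifecycle rules; a sketch using the AWS CLI against the RGW endpoint (endpoint, bucket name and retention period are placeholders):

    aws --endpoint-url http://rgw.example.com:8080 s3api put-bucket-lifecycle-configuration \
      --bucket mybucket \
      --lifecycle-configuration '{"Rules":[{"ID":"expire-30d","Status":"Enabled","Filter":{"Prefix":""},"Expiration":{"Days":30}}]}'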

[ceph-users] Re: Is there a way to make Cephfs kernel client to write data to ceph osd smoothly with buffer io

2020-11-12 Thread Frank Schilder
Yes, that's right. It would be nice if there were a mount option to adjust such parameters on a per-filesystem basis. I should mention that I observed a significant improvement in HDD throughput of the local disk as well when adjusting these parameters for ceph. This is larg

[ceph-users] Re: question about rgw delete speed

2020-11-12 Thread Nathan Fish
From what we have experienced, our delete speed scales with the CPU available to the MDS. And the MDS only seems to scale to 2-4 CPUs per daemon, so for our biggest filesystem, we have 5 active MDS daemons. Migrations reduced performance a lot, but pinning fixed that. Even better is just getting t

[ceph-users] Tracing in ceph

2020-11-12 Thread Seena Fallah
Hi all, Does this project work with the latest Zipkin APIs? https://github.com/ceph/babeltrace-zipkin Also, what do you prefer for tracing requests for RGW and RBD in Ceph? Thanks. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an em

[ceph-users] which of cpu frequency and number of threads servers osd better?

2020-11-12 Thread Tony Liu
Hi, For example, between 16 threads at 3.2GHz and 32 threads at 3.0GHz, which gives better performance for 11 OSDs (10x12TB HDD and 1x960GB SSD)? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le.

[ceph-users] Re: which of cpu frequency and number of threads servers osd better?

2020-11-12 Thread Nathan Fish
From what I've seen, OSD daemons tend to bottleneck on the first 2 threads, while getting some use out of another 2. So 32 threads at 3.0 would be a lot better. Note that you may get better performance splitting off some of that SSD for block.db partitions or at least block.wal for the HDDs. On T
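
A sketch of the SSD split Nathan mentions, using ceph-volume with placeholder devices:

    # HDD as the data device, an SSD partition as the RocksDB (block.db) device
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
    # or, if only a small SSD slice is available, just the WAL
    ceph-volume lvm create --bluestore --data /dev/sdc --block.wal /dev/nvme0n1p2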

[ceph-users] Re: which of cpu frequency and number of threads servers osd better?

2020-11-12 Thread Tony Liu
Thanks Nathan! Tony > -Original Message- > From: Nathan Fish > Sent: Thursday, November 12, 2020 7:43 PM > To: Tony Liu > Cc: ceph-users@ceph.io > Subject: Re: [ceph-users] which of cpu frequency and number of threads > servers osd better? > > From what I've seen, OSD daemons tend to bot

[ceph-users] Re: Cephfs Kernel client not working properly without ceph cluster IP

2020-11-12 Thread Amudhan P
Hi Eugen, The issue looks fixed now; my kernel client mount works fine without the cluster IP. I have re-run "ceph config set osd cluster_network 10.100.4.0/24" and restarted all services. Earlier it was run with "ceph config set global cluster_network 10.100.4.0/24". I have run the command output
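
For reference, the scoping difference discussed here, plus a quick way to verify what the daemons actually picked up:

    # set the cluster network only for OSDs (what ended up working here)
    ceph config set osd cluster_network 10.100.4.0/24
    # verify the stored value
    ceph config get osd.0 cluster_network
    ceph config dump | grep cluster_network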

[ceph-users] Re: Not able to read file from ceph kernel mount

2020-11-12 Thread Amudhan P
Hi, This issue is fixed now after setting the cluster IP only on the OSDs. Mount works perfectly fine. "ceph config set osd cluster_network 10.100.4.0/24" regards Amudhan On Sat, Nov 7, 2020 at 10:09 PM Amudhan P wrote: > Hi, > > At last, the problem fixed for now by adding cluster network IP to the

[ceph-users] Re: which of cpu frequency and number of threads servers osd better?

2020-11-12 Thread Martin Verges
Hello Tony, as these are HDDs, your CPU won't be the bottleneck at all. Both CPUs are overprovisioned. -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.ver...@croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE31

[ceph-users] Re: which of cpu frequency and number of threads servers osd better?

2020-11-12 Thread Tony Liu
You all mentioned a first 2T and another 2T. Could you give more details on how the OSD works with multiple threads, or share a link if it's already documented somewhere? Is it always 4T, or does it start with 1T and grow up to 4T? Is 4T the max? Does each T run a different job, or just multiple instances of the same jo