[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2023-01-16 Thread Christian Rohmann
Hello, On 04/05/2021 09:49, Frank Schilder wrote: I created a ticket: https://tracker.ceph.com/issues/50637 We just observed this very issue on Pacific (16.2.10), and I have also commented on the ticket. I wonder if this case is really so seldom, first having some issues causing slow ops and then a t

[ceph-users] Re: PG_BACKFILL_FULL

2023-01-16 Thread Iztok Gregori
Thanks for your response and advice. On 16/01/23 15:17, Boris Behrens wrote: Hmm.. I ran into a similar issue. IMHO there are two ways to work around the problem until the new disk is in place: 1. change the backfill full threshold (I use these commands: https://www.suse.com/support/kb/doc/?id

[ceph-users] Re: Mysterious HDD-Space Eating Issue

2023-01-16 Thread duluxoz
Hi All, Thanks to Eneko Lacunza, E Taka, and Anthony D'Atri for replying - all that advice was really helpful. So, we finally tracked down our "disk eating monster" (sort of). We've got a "runaway" ceph-guest-NN that is filling up its log file (/var/log/ceph/ceph-guest-NN.log) and eventually
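A minimal sketch of how one might spot a runaway log like this and confirm rotation (the paths are common defaults, not taken from the thread):

  # list the largest Ceph logs on the node
  du -h /var/log/ceph/*.log 2>/dev/null | sort -h | tail -5
  # check whether logrotate is configured for them (file name may differ per install)
  cat /etc/logrotate.d/ceph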

[ceph-users] Unable to subscribe

2023-01-16 Thread Abhinav Singh
I'm unable to subscribe even after sending mail to the ceph-leave-user email id. Is there any other way to unsubscribe? Thanks & Regards Abhinav Singh

[ceph-users] Re: MDS stuck in "up:replay"

2023-01-16 Thread Kotresh Hiremath Ravishankar
Hi Thomas, Sorry, I misread the mds state to be stuck in 'up:resolve'. The mds is stuck in 'up:replay', which means the MDS is taking over a failed rank. This state represents the MDS recovering its journal and other metadata. I notice that there are two filesystems 'cephfs' and 'cephfs

[ceph-users] Re: pg mapping verification

2023-01-16 Thread Christopher Durham
Yes, I know you can do that. I parsed this very output, then wrote a script to verify. I did not read in the crush map, but used my own knowledge of the crush rule to verify the placements. It would be nice to have a way to do this for an arbitrary crush rule and a pool using it. -Chris -
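For an arbitrary rule, a rough sketch of such a check with crushtool (rule id and replica count are placeholders):

  # dump the compiled CRUSH map and simulate mappings for a given rule
  ceph osd getcrushmap -o crushmap.bin
  crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-mappings | head
  # compare against the live PG-to-OSD mappings
  ceph pg dump pgs_brief | head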

[ceph-users] Re: osd_memory_target values

2023-01-16 Thread Ernesto Puerta
Hi Mevludin, Dashboard just exposes the Ceph configuration overrides: daemon > global, but since this setting is only consumed by OSD daemons, you could use either (global or osd). Kind Regards, Ernesto On Mon, Jan 16, 2023 at 5:40 PM Mevludin Blazevic wrote: > Hi all, > > for a ceph cluster
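As a hedged example of the non-dashboard route (the OSD id used for verification is a placeholder):

  # 12 GiB in bytes; the osd section is enough since only OSD daemons consume this option
  ceph config set osd osd_memory_target 12884901888
  # verify what a particular OSD actually sees
  ceph config get osd.0 osd_memory_target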

[ceph-users] Building Ceph containers

2023-01-16 Thread Matthew Vernon
Hi, Is it possible/supported to build Ceph containers on Debian? The build instructions[0] talk about building packages (incl. .debs), but not building containers. Cephadm only supports containerised deployments, but our local policy is that we should only deploy containers we've built ourse
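A very rough sketch of wrapping locally built .debs into an image (base image, paths, and package layout are assumptions, not the project's official container build):

  # Containerfile sketch for podman/docker build
  FROM debian:bullseye
  COPY ./debs/*.deb /tmp/ceph-debs/
  RUN apt-get update \
   && apt-get install -y /tmp/ceph-debs/*.deb \
   && rm -rf /var/lib/apt/lists/* /tmp/ceph-debs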

[ceph-users] Re: Useful MDS configuration for heavily used Cephfs

2023-01-16 Thread Anthony D'Atri
Visualization of the min_alloc_size / wasted space dynamic: https://docs.google.com/spreadsheets/d/1rpGfScgG-GLoIGMJWDixEkqs-On9w8nAUToPQjN8bDI/edit#gid=358760253 (Bluestore Space Amplification Cheat Sheet) > On Jan 16, 2023, at 3:49 AM, Frank Schilder wrote: > > If you have m
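The underlying arithmetic is just rounding each object up to min_alloc_size; a quick sketch (the 6 KiB object size is an arbitrary example):

  obj_bytes=6144                 # example small object
  for alloc in 65536 4096; do    # older 64K HDD default vs the newer 4K default
    used=$(( (obj_bytes + alloc - 1) / alloc * alloc ))
    echo "min_alloc_size=$alloc: allocated=$used bytes"
  done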

[ceph-users] Problem with IO after renaming File System .data pool

2023-01-16 Thread murilo
Good morning everyone. On Thursday night we had an incident where the .data pool of a File System was accidentally renamed, making it instantly inaccessible. After renaming it back to the correct name it was possible to mount and list the files, but not to read or write. When

[ceph-users] Re: Corrupt bluestore after sudden reboot (17.2.5)

2023-01-16 Thread dongdong . tao
Hi Peter, Could you add debug_bluestore = 20 to your ceph.conf and restart the OSD, then send the log after it crashes? And I believe this is worth opening a tracker ticket: https://tracker.ceph.com/projects/bluestore Thanks, Dongdong
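If the OSD is managed centrally, a sketch of the equivalent via the config database (osd id and deployment style are placeholders):

  ceph config set osd.12 debug_bluestore 20
  # restart so the next crash is captured at the higher log level
  systemctl restart ceph-osd@12          # package-based install
  # or, for a cephadm-managed daemon:
  ceph orch daemon restart osd.12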

[ceph-users] osd_memory_target values

2023-01-16 Thread Mevludin Blazevic
Hi all, for a ceph cluster with 256 GB of RAM per node, I would like to increase the osd_memory_target from the default 4 GB up to 12 GB. Through the ceph dashboard, different scopes are offered for setting the new value (global, mon, ..., osd). Is there any difference between them? From my point of view, I

[ceph-users] Re: rbd-mirror ceph quincy Not able to find rbd_mirror_journal_max_fetch_bytes config in rbd mirror

2023-01-16 Thread ankit raikwar
@Eugen Block Thanks for your response. I tried both options but I don't see any effect on the replication speed. Can you or anyone suggest any other way? It is so slow that we are not able to continue at this speed. Please help or suggest any configuration.

[ceph-users] Re: Useful MDS configuration for heavily used Cephfs

2023-01-16 Thread Darren Soothill
There are a few details missing to allow people to provide you with advice. How many files are you expecting to be in this 100TB of capacity? This really dictates what you are looking for. It could be full of 4K files which is a very different proposition to it being full of 100M files. What s

[ceph-users] Re: MDS error

2023-01-16 Thread afsmaira
Additional information: - We already tried to restart the services and the whole machine - Part of journalctl: jan 13 02:40:18 s1.ceph.infra.ufscar.br ceph-bab39b74-c93a-4e34-aae9-a44a5569d52c-mon-s1[6343]: debug 2023-01-13T05:40:18.653+ 7fc370b64700 0 log_channel(cluster) log [WRN] : Replacing daem

[ceph-users] Re: Telemetry service is temporarily down

2023-01-16 Thread Yaarit Hatuka
Hi everyone, Our telemetry service is up and running again. Thanks Adam Kraitman and Dan Mick for restoring the service. We thank you for your patience and appreciate your contribution to the project! Thanks, Yaarit On Tue, Jan 3, 2023 at 3:14 PM Yaarit Hatuka wrote: > Hi everyone, > > We are

[ceph-users] Problem with IO after renaming File System .data pool

2023-01-16 Thread Murilo Morais
Good morning everyone. That night we had an incident where the .data pool of a File System was accidentally renamed, making it instantly inaccessible. After renaming it back to the correct name it was possible to mount and list the files, but not to read or write. When trying to write

[ceph-users] Re: iscsi target lun error

2023-01-16 Thread Xiubo Li
On 12/01/2023 20:42, Frédéric Nass wrote: Hi Xiubo, Randy, This is due to 'host.containers.internal' being added to the container's /etc/hosts since Podman 4.1+. Okay. When creating the gateway it will try to get the hostname from the container as the final gateway name, not the gateway n

[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-16 Thread Igor Fedotov
Hi Benoit and Peter, looks like your findings are valid and the spillover alert is broken for now. I've just created https://tracker.ceph.com/issues/58440 to track this. Thanks, Igor On 1/13/2023 9:54 AM, Benoît Knecht wrote: Hi Peter, On Thursday, January 12th, 2023 at 15:12, Peter van Heus

[ceph-users] rbd-mirror | ceph quincy Not able to find rbd_mirror_journal_max_fetch_bytes config in rbd mirror

2023-01-16 Thread ankit raikwar
Hello All, In Ceph Quincy I am not able to find the rbd_mirror_journal_max_fetch_bytes config in rbd-mirror. I configured the ceph cluster (almost 400 TB) and enabled rbd-mirror; in the starting stage I was able to achieve almost 9 GB speed, but after the rebalance completed

[ceph-users] nfs RGW export puts nfs-ganesha server in a crash loop

2023-01-16 Thread Ben Gao
Hi, This is running Quincy 17.2.5 deployed by rook on k8s. An RGW nfs export crashes the Ganesha server pod, while a CephFS export works just fine. Here are the steps: 1, create export: bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path /bucketexport --bucket testbk { "bind":

[ceph-users] [rgw] Upload object with bad performance after the cluster running few months

2023-01-16 Thread can zhu
ceph version is: 16.2.10 Using the "rclone" tool to upload a big object: rclone copy ./zh-cn_windows_10_business_editions_version_21h1_updated_jul_2021_x64_dvd_f49026f5.iso smd:test --progres Transferred: 670 MiB / 5.293 GiB, 12%, 15.195 MiB/s, ETA 5m12s Transferred: 0 / 1, 0% Elapsed time: 46.0s Tra

[ceph-users] Move bucket between realms

2023-01-16 Thread mahnoosh shahidi
Hi all, Is there any way in rgw to move a bucket from one realm to another one in the same cluster? Best regards, Mahnoosh

[ceph-users] nfs RGW export puts nfs-ganesha server in a crash loop

2023-01-16 Thread Ben
Hi, This is running Quincy 17.2.5 deployed by rook on k8s. An RGW nfs export crashes the Ganesha server pod, while a CephFS export works just fine. Here are the steps: 1, create export: bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path /bucketexport --bucket testbk { "bind": "/

[ceph-users] Issues with cephadm adopt cluster with name

2023-01-16 Thread armsby
I am trying to adopt a cluster with cephadm, and everything was ok when it came to the mon and mgr servers. But when I try to run "cephadm adopt --name osd.340 --style legacy --cluster prod" it runs everything, but when the container starts, it says that it cannot open /etc/ceph/prod.conf as it bi

[ceph-users] Mysterious HDD-Space Eating Issue

2023-01-16 Thread matthew
Hi Guys, I've got a funny one I'm hoping someone can point me in the right direction with: We've got three identical(?) Ceph nodes each running 4 OSDs, Mon, Mgr, and iSCSI G/W (we're only a small shop) on Rocky Linux 8 / Ceph Quincy. Everything is running fine, no bottlenecks (as far as we ca

[ceph-users] Re: OSD crash on Onode::put

2023-01-16 Thread Dongdong Tao
Hi Frank, I don't have an operational workaround; the patch https://github.com/ceph/ceph/pull/46911/commits/f43f596aac97200a70db7a70a230eb9343018159 is simple and can be applied cleanly. Yes, restarting the OSD will clear the pool entries; you can restart it when the bluestore_onode items are very lo
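To watch the bluestore_onode items Dongdong mentions, a quick sketch via the admin socket (osd id is a placeholder):

  # items/bytes of the onode mempool on one OSD
  ceph daemon osd.0 dump_mempools | grep -A 3 bluestore_onode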

[ceph-users] Retrieve number of read/write operations for a particular file in Cephfs

2023-01-16 Thread thanh son le
Hi, I have been studying the documentation from Ceph and RADOS but I could not find any metrics to measure the number of read/write operations for each file. I understand that CephFS is the front end and files are stored as objects on the OSDs, and I have found that Ceph provides a Cache Tie

[ceph-users] Re: 2 pgs backfill_toofull but plenty of space

2023-01-16 Thread Torkil Svensgaard
On 10-01-2023 18:59, Fox, Kevin M wrote: What else is going on? (ceph -s). If there is a lot of data being shuffled around, it may just be because its waiting for some other actions to complete first. There's a bit going on but if it is waiting for something else it shouldn't be backfill_too

[ceph-users] NoSuchBucket when bucket exists ..

2023-01-16 Thread Shashi Dahal
Hi, In a working all-in-one test setup (where making the bucket public works from the browser): radosgw-admin bucket list [ "711138fc95764303b83002c567ce0972/demo" ] I have another cluster where openstack and ceph are separate. I have set the same config options in ceph.conf .. rgw_enable_apis

[ceph-users] ceph orch cannot refresh

2023-01-16 Thread Nicola Mori
Dear Ceph users, after a host failure in my cluster (quincy 17.2.3 managed by cephadm) it seems that ceph orch got stuck somehow and cannot operate. For example, it has not been able to refresh the status of several services for about 20 hours: # ceph orch ls NAME
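Not from the thread, but a common first step when the orchestrator stops refreshing is to fail over the active mgr and force a refresh; a sketch:

  ceph mgr fail                 # fail over to a standby mgr so the cephadm module starts fresh
  ceph orch ps --refresh
  ceph log last cephadm         # recent cephadm log entries, to see what it is stuck on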

[ceph-users] bidirectional rbd-mirroring

2023-01-16 Thread Aielli, Elia
Hi all, I have a working pair of clusters configured with rbd-mirror; the Master cluster is production, the Backup cluster is DR. Right now everything is working well, with Master configured as "tx-only" and Backup as "rx-tx". I'd like to change the Master direction to rx-tx so I'm already prepared for a failover afte

[ceph-users] Re: PG_BACKFILL_FULL

2023-01-16 Thread Boris Behrens
Hmm.. I ran into a similar issue. IMHO there are two ways to work around the problem until the new disk is in place: 1. change the backfill full threshold (I use these commands: https://www.suse.com/support/kb/doc/?id=19724) 2. reweight the backfill full OSDs just a little bit, so they move da
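A sketch of what those two options look like on the CLI (the ratio, OSD id, and weight are examples; the SUSE KB above has the full procedure):

  # 1. temporarily raise the backfillfull threshold (default 0.90)
  ceph osd set-backfillfull-ratio 0.92
  # 2. slightly lower the reweight of a backfillfull OSD so data moves off it
  ceph osd reweight 123 0.95     # <osd-id> <new-weight>, placeholders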

[ceph-users] PG_BACKFILL_FULL

2023-01-16 Thread Iztok Gregori
Hi to all! We are in a situation where we have 3 PGs in "active+remapped+backfill_toofull". It happened when we executed a "gentle-reweight" to zero of one OSD (osd.77) to swap it with a new one (the current one registered some read errors and is to be replaced just in case). # ceph healt

[ceph-users] RGW - large omaps even when buckets are sharded

2023-01-16 Thread Boris Behrens
Hi, since last week scrubbing results in a large omap warning. After some digging I got these results: # searching for indexes with large omaps: $ for i in `rados -p eu-central-1.rgw.buckets.index ls`; do rados -p eu-central-1.rgw.buckets.index listomapkeys $i | wc -l | tr -d '\n' >> omap
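A completed version of that loop, for reference (pool name as in the post; the output file name and sorting are additions):

  for i in $(rados -p eu-central-1.rgw.buckets.index ls); do
    keys=$(rados -p eu-central-1.rgw.buckets.index listomapkeys "$i" | wc -l)
    echo "$keys $i" >> omapkeys.txt
  done
  # largest omaps first
  sort -rn omapkeys.txt | head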

[ceph-users] Re: Useful MDS configuration for heavily used Cephfs

2023-01-16 Thread E Taka
Thanks, Frank, for these detailed insights! I really appreciate your help. On Mon, Jan 16, 2023 at 09:49, Frank Schilder wrote: > Hi, we are using ceph fs for data on an HPC cluster and, looking at your > file size distribution, I doubt that MDS performance is a bottleneck. Your > limiting

[ceph-users] Re: MDS stuck in "up:replay"

2023-01-16 Thread Thomas Widhalm
Hi Kotresh, Thanks for your reply! I only have one rank. Here's the output of all MDS I have: ### [ceph: root@ceph06 /]# ceph tell mds.mds01.ceph05.pqxmvt status 2023-01-16T08:55:26.055+ 7f3412ffd700 0 client.61249926 ms_handle_reset on v2:192.168.23.65:6800/2680651694 202

[ceph-users] Re: Useful MDS configuration for heavily used Cephfs

2023-01-16 Thread Frank Schilder
Hi, we are using ceph fs for data on an HPC cluster and, looking at your file size distribution, I doubt that MDS performance is a bottleneck. Your limiting factors are super-small files and IOP/s budget of the fs data pool. On our system, we moved these workloads to an all-flash beegfs. Ceph is