[ceph-users] Re: Slow recovery and inaccurate recovery figures since Quincy upgrade

2023-10-03 Thread Sridhar Seshasayee
Hello Iain, > Does anyone have any ideas of what could be the issue here or anywhere we can check what is going on? You could be hitting the slow backfill/recovery issue with mclock_scheduler. Could you please provide the output of the following commands? 1. ceph versions 2. ceph config get
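For context, a rough sketch of the kind of checks this usually involves on a Quincy cluster (osd.0 is just a placeholder daemon id):

    $ ceph versions                            # confirm all daemons run the same release
    $ ceph config get osd osd_op_queue         # expected to report "mclock_scheduler" on Quincy
    $ ceph config get osd osd_mclock_profile   # balanced / high_client_ops / high_recovery_ops
    $ ceph config show osd.0 | grep mclock     # effective mclock settings on one OSD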

[ceph-users] S3 user with more than 1000 buckets

2023-10-03 Thread Thomas Bennett
Hi, I'm running a Ceph 17.2.5 Rados Gateway and I have a user with more than 1000 buckets. When the client tries to list all their buckets using s3cmd, rclone and python boto3, all three only ever return the first 1000 bucket names. I can confirm the buckets are all there (and more than 1000

[ceph-users] Re: S3 user with more than 1000 buckets

2023-10-03 Thread Jonas Nemeiksis
Hi, You should increase these default settings: rgw_list_buckets_max_chunk // for buckets rgw_max_listing_results // for objects On Tue, Oct 3, 2023 at 12:59 PM Thomas Bennett wrote: > Hi, > > I'm running a Ceph 17.2.5 Rados Gateway and I have a user with more than > 1000 buckets. > > When the
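A hedged sketch of raising those limits via the config database; 5000 is an arbitrary value, and depending on how the RGWs are deployed the section may need to be a more specific client.rgw.<id>:

    $ ceph config set client.rgw rgw_list_buckets_max_chunk 5000   # buckets returned per ListBuckets chunk
    $ ceph config set client.rgw rgw_max_listing_results 5000      # objects returned per bucket listing
    # restart the radosgw daemons if the new values do not take effect at runtime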

[ceph-users] Performance drop and retransmits with CephFS

2023-10-03 Thread Tom Wezepoel
Hi all, Have a question regarding CephFS and write performance. Possibly I am overlooking a setting. We recently started using Ceph, where we want to use CephFS as a shared storage system for a Sync-and-Share solution. Now we are still in a testing phase, where we are also mainly looking at the p

[ceph-users] Ceph Quarterly (CQ) - Issue #2

2023-10-03 Thread Zac Dover
The second issue of "Ceph Quarterly" is attached to this email. Ceph Quarterly (or "CQ") is an overview of the past three months of upstream Ceph development. We provide CQ in three formats: A4, letter, and plain text wrapped at 80 columns. Two news items arrived after the deadline for typesett

[ceph-users] Re: S3 user with more than 1000 buckets

2023-10-03 Thread Thomas Bennett
Hi Jonas, Thanks :) that solved my issue. It would seem to me that this is heading towards something that the S3 clients should paginate, but I couldn't find any documentation on how to paginate bucket listings. All the information points to paginating object listing - which makes sense. Just fo

[ceph-users] Re: S3 user with more than 1000 buckets

2023-10-03 Thread Janne Johansson
On Tue, 3 Oct 2023 at 11:59, Thomas Bennett wrote: > Hi, > > I'm running a Ceph 17.2.5 Rados Gateway and I have a user with more than > 1000 buckets. > > When the client tries to list all their buckets using s3cmd, rclone and > python boto3, they all three only ever return the first 1000 bucket n

[ceph-users] Re: S3 user with more than 1000 buckets

2023-10-03 Thread Casey Bodley
On Tue, Oct 3, 2023 at 9:06 AM Thomas Bennett wrote: > > Hi Jonas, > > Thanks :) that solved my issue. > > It would seem to me that this is heading towards something that the S3 > clients should paginate, but I couldn't find any documentation on how to > paginate bucket listings. the s3 ListBucke

[ceph-users] Re: S3 user with more than 1000 buckets

2023-10-03 Thread Matt Benjamin
Hi Thomas, If I'm not mistaken, the RGW will paginate ListBuckets essentially like ListObjectsv1 if the S3 client provides the appropriate "marker" parameter values. COS does this too, I noticed. I'm not sure which S3 clients can be relied on to do this, though. Matt On Tue, Oct 3, 2023 at 9:0

[ceph-users] Re: Slow recovery and inaccurate recovery figures since Quincy upgrade

2023-10-03 Thread Iain Stott
Hi Sridhar, Thanks for the response, I have added the output you requested below, I have attached the output from the last command in a file as it was rather long. We did try to set high_recovery_ops but it didn't seem to have any visible effect. root@gb4-li-cephgw-001 ~ # ceph versions { "

[ceph-users] is the rbd mirror journal replayed on primary after a crash?

2023-10-03 Thread Scheurer François
Hello, a short question regarding journal-based rbd mirroring. IO path with journaling w/o cache: (a) create an event to describe the update, (b) asynchronously append the event to the journal object, (c) asynchronously update the image once the event is safe, (d) complete the IO to the client once the update is safe [cf. htt

[ceph-users] Re: Impacts on doubling the size of pgs in a rbd pool?

2023-10-03 Thread Hervé Ballans
Hi all, Sorry for the reminder, but does anyone have any advice on how to deal with this? Many thanks! Hervé On 29/09/2023 at 11:34, Hervé Ballans wrote: Hi all, I have a Ceph cluster on Quincy (17.2.6), with 3 pools (1 rbd and 1 CephFS volume), each configured with 3 replicas. $ sudo

[ceph-users] Re: Impacts on doubling the size of pgs in a rbd pool?

2023-10-03 Thread Michel Jouvin
Hi Hervé, Why don't you use the automatic adjustment of the number of PGs? This makes life much easier and works well. Cheers, Michel On 03/10/2023 at 17:06, Hervé Ballans wrote: Hi all, Sorry for the reminder, but does anyone have any advice on how to deal with this? Many thanks! Her

[ceph-users] Re: cephfs health warn

2023-10-03 Thread Ben
Yes, I am. 8 active + 2 standby, no subtree pinning. What if I restart the mds with trimming issues? Trying to figure out what happens with restarting. Venky Shankar wrote on Tue, 3 Oct 2023 at 12:39: > Hi Ben, > > Are you using multimds without subtree pinning? > > On Tue, Oct 3, 2023 at 10:00 AM Ben wrot

[ceph-users] Re: S3 user with more than 1000 buckets

2023-10-03 Thread Thomas Bennett
Thanks for all the responses, much appreciated. Upping the chunk size fixes my problem in the short term, but I'll upgrade to 17.2.6 :) Kind regards, Tom On Tue, 3 Oct 2023 at 15:28, Matt Benjamin wrote: > Hi Thomas, > > If I'm not mistaken, the RGW will paginate ListBuckets essentially like > Lis

[ceph-users] Re: rgw: disallowing bucket creation for specific users?

2023-10-03 Thread Matthias Ferdinand
On Sun, Oct 01, 2023 at 12:00:58PM +0200, Peter Goron wrote: > Hi Matthias, > > One possible way to achieve your need is to set a quota on number of > buckets at user level (see > https://docs.ceph.com/en/reef/radosgw/admin/#quota-management). Quotas are > under admin control. thanks a lot, rath
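For reference, a sketch of the per-user cap being referred to; 'someuser' is a placeholder uid, and the special values 0 and -1 for --max-buckets have meant "unlimited" vs. "bucket creation disabled" in different releases, so check the documentation for your version:

    $ radosgw-admin user modify --uid=someuser --max-buckets=1    # allow at most one bucket
    $ radosgw-admin user info --uid=someuser | grep max_buckets   # verify the setting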

[ceph-users] ceph osd down doesn't seem to work

2023-10-03 Thread Simon Oosthoek
Hi, I'm trying to mark one OSD as down, so we can clean it out and replace it. It keeps getting medium read errors, so it's bound to fail sooner rather than later. When I run the ceph command from the mon to mark the osd down, it doesn't actually do it. When the service on the osd stops, it is also ma

[ceph-users] Re: Impacts on doubling the size of pgs in a rbd pool?

2023-10-03 Thread David C.
Hi Michel, the pool already appears to have autoscaling enabled ("autoscale_mode on"). If you're worried (if, for example, the platform is having trouble handling a large data shift) then you can set the parameter to warn (like the rjenkis pool). If not, as Hervé says, the transition to 2048
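For reference, a rough sketch of the two options mentioned in this thread; the pool name 'rbd' is a placeholder:

    $ ceph osd pool set rbd pg_autoscale_mode warn   # autoscaler only warns instead of changing pg_num itself
    $ ceph osd pool set rbd pg_num 2048              # or raise pg_num manually; the change is applied gradually
    $ ceph osd pool autoscale-status                 # review the autoscaler's view of each pool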

[ceph-users] Re: ceph osd down doesn't seem to work

2023-10-03 Thread Josh Baergen
Hi Simon, If the OSD is actually up, using 'ceph osd down' will cause it to flap but come back immediately. To prevent this, you would want to 'ceph osd set noup'. However, I don't think this is what you actually want: > I'm thinking (but perhaps incorrectly?) that it would be good to keep the OS
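A minimal illustration of the behaviour Josh describes; osd.12 is a placeholder id:

    $ ceph osd down 12      # marks osd.12 down, but a healthy daemon re-asserts itself almost immediately
    $ ceph osd set noup     # cluster-wide flag: down OSDs are not allowed to come back up
    $ ceph osd unset noup   # clear the flag again when done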

[ceph-users] Re: ceph osd down doesn't seem to work

2023-10-03 Thread Anthony D'Atri
And unless you *need* a given ailing OSD to be up because it's the only copy of data, you may get better recovery/backfill results by stopping the service for that OSD entirely, so that the recovery reads all go to healthier OSDs. > On Oct 3, 2023, at 12:21, Josh Baergen wrote: > > Hi Simon, >

[ceph-users] Re: ceph osd down doesn't seem to work

2023-10-03 Thread Simon Oosthoek
Hi Josh, thanks for the explanation, I want to mark it out, not down :-) Most use of our cluster is in EC 8+3 or 5+4 pools, so one missing osd isn't bad, but if some of the blocks can still be read it may help to move them to safety. (This is how I imagine things anyway ;-) I'll have to loo
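Putting Josh's and Anthony's suggestions together, a sketch of draining the ailing OSD; osd.12 is a placeholder, and under cephadm the systemd unit name also includes the cluster fsid:

    $ ceph osd out 12                           # data migrates off while the OSD can still serve reads
    $ ceph -w                                   # watch recovery/backfill progress
    $ sudo systemctl stop ceph-osd@12.service   # optionally stop the daemon so reads go to healthier OSDs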

[ceph-users] VM hangs when overwriting a file on erasure coded RBD

2023-10-03 Thread Peter Linder
Dear all, I have a problem that after an OSD host lost connection to the sync/cluster rear network for many hours (the public network was online), a test VM using RBD can't overwrite its files. I can create a new file inside it just fine, but not overwrite it, the process just hangs. The VM's

[ceph-users] Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD

2023-10-03 Thread Patrick Bégou
Hi all, still stuck with this problem. I've deployed octopus and all my HDDs have been set up as OSDs. Fine. I've upgraded to pacific and 2 osd have failed. They have been automatically removed and the upgrade finished. Cluster health is finally OK, no data loss. But now I cannot re-add these osd wi

[ceph-users] ceph luminous client connect to ceph reef always permission denied

2023-10-03 Thread Pureewat Kaewpoi
Hi All! We have a newly installed cluster running Ceph Reef, but our old clients are still using Ceph Luminous. The problem is that any command against the ceph cluster hangs with no output. This is the output from the command ceph osd pool ls --debug-ms 1: 2023-10-02 23:35:22.727089 7fc93807c700 1 Pr

[ceph-users] ingress of haproxy is down after I specify the haproxy.cfg in quincy

2023-10-03 Thread wjsherry075
Hello, I have a haproxy problem in Ceph Quincy 17.2.6 on Ubuntu 22.04. The haproxy container won't come up after I specify the haproxy.cfg, and there is no error in the logs. I set the haproxy.cfg with: ceph config-key set mgr/cephadm/services/ingress/haproxy.cfg -i haproxy.cfg If I remove the haproxy, and le
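A few commands that may help surface why the haproxy daemon never starts; exact service and daemon names will differ per cluster:

    $ ceph orch ls | grep ingress   # is the ingress service scheduled at all?
    $ ceph orch ps | grep haproxy   # state of the haproxy daemon(s)
    $ ceph log last cephadm         # recent cephadm log messages, often more detail than 'ceph -s'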

[ceph-users] Re: radosgw-admin sync error trim seems to do nothing

2023-10-03 Thread Matthew Darwin
Hello all, Any solution to this? I want to trim the error log to get rid of the warnings: Large omap object found. Object: 13:a24ff46e:::sync.error-log.2:head PG: 13.762ff245 (13.5) Key count: 236174 Size (bytes): 58472797 Seems similar report: https://tracker.ceph.com/issues/62845 On 202

[ceph-users] Re: cephfs health warn

2023-10-03 Thread Venky Shankar
Hi Ben, On Tue, Oct 3, 2023 at 8:56 PM Ben wrote: > > Yes, I am. 8 active + 2 standby, no subtree pinning. What if I restart the > mds with trimming issues? Trying to figure out what happens with restarting. We have come across instances in the past where multimds without subtree pinning can le

[ceph-users] Re: cephfs health warn

2023-10-03 Thread Ben
Hi Venky, thanks for the help on this. Will change to multimds with subtree pinning. For the moment, the segment list items need to go through the loop of expiring -> expired -> trimmed. It is observed that each problematic mds has a few expiring segments stuck on the way to being trimmed. the segment lis
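For reference, a minimal sketch of the subtree pinning Venky suggests, assuming the filesystem is mounted at /mnt/cephfs and /mnt/cephfs/projecta is a directory to pin to mds rank 1 (both paths are placeholders):

    $ setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projecta   # pin this subtree to mds rank 1
    $ getfattr -n ceph.dir.pin /mnt/cephfs/projecta        # verify the pin
    $ ceph fs status                                       # check how load is spread across ranks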

[ceph-users] Re: Slow recovery and inaccurate recovery figures since Quincy upgrade

2023-10-03 Thread Sridhar Seshasayee
To help complete the recovery, you can temporarily try disabling scrub and deep scrub operations by running 'ceph osd set noscrub' and 'ceph osd set nodeep-scrub'. This should help speed up the recovery process. Once the recovery is done, you can unset the above scrub flags and revert the mClock profile
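A sketch of the full sequence described above, assuming the profile had been switched to high_recovery_ops and that 'balanced' is the value to revert to (check what your cluster previously used):

    $ ceph osd set noscrub
    $ ceph osd set nodeep-scrub
    # ... wait for recovery to finish (watch 'ceph -s') ...
    $ ceph osd unset noscrub
    $ ceph osd unset nodeep-scrub
    $ ceph config set osd osd_mclock_profile balanced   # revert the mClock profile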