[ceph-users] Re: RGW objects have same marker and bucket ID in different buckets.

2021-04-21 Thread Matt Benjamin
Hi Morphin, Yes, this is by design. When an RGW object has tail chunks and is copied so as to duplicate an entire tail chunk, RGW causes the coincident chunk(s) to be shared. Tail chunks are refcounted to avoid leaks. Matt On Wed, Apr 21, 2021 at 4:21 PM by morphin wrote: > > Hello. > > I hav
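
As a sketch of how to see this sharing in practice (bucket and object names below are placeholders, and the manifest field names are an assumption that may differ by release):

    # Compare the manifests of the original and the copy; with shared tail
    # chunks, both should point at the same (old) bucket id for the tail.
    radosgw-admin object stat --bucket=old.bucket --object=bigfile | jq '.manifest.tail_placement'
    radosgw-admin object stat --bucket=new.bucket --object=bigfile | jq '.manifest.tail_placement'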

[ceph-users] Re: EC Backfill Observations

2021-04-21 Thread Josh Baergen
> Yes, the reservation mechanism is rather complex and intertwined with > the recovery state machine. There was some discussion about this > (including the idea of backoffs) before: Thanks! Josh

[ceph-users] Re: EC Backfill Observations

2021-04-21 Thread Josh Baergen
Hey Josh, Thanks for the info! > With respect to reservations, it seems like an oversight that > we don't reserve other shards for backfilling. We reserve all > shards for recovery [0]. Very interesting that there is a reservation difference between backfill and recovery. > On the other hand, o

[ceph-users] RGW objects have same marker and bucket ID in different buckets.

2021-04-21 Thread by morphin
Hello. I have an RGW S3 user and the user has 2 buckets. I tried to copy objects from old.bucket to new.bucket with rclone (on the RGW client server). Afterwards I checked the objects with "radosgw-admin --bucket=new.bucket object stat $i" and I saw the old.bucket ID and marker ID, and also the old bucket name, in the

[ceph-users] Re: MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Flemming Frandsen
I'll be damned. I restarted the wedged MDS and after a reasonable amount of time the standby MDS finished replaying and became active. The cluster is now healthy and it seems the apps I have running on top of cephfs have sorted themselves out too; I guess all the MDS really needed was a stern bul

[ceph-users] Re: MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Dan van der Ster
No, no, pinning now won't help anything... I was asking to understand if it's likely there is balancing happening actively now. If you don't pin, then it's likely. Try the debug logs. And check the exports using something like: ceph daemon mds.b get subtrees | jq '.[] | [.dir.path, .auth_first, .e
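
The jq filter above is cut off in the archive; a complete invocation along the same lines might be (mds.b as in the snippet, and the .export_pin field name is an assumption):

    # Show each subtree's path, its authoritative MDS rank, and any export pin
    ceph daemon mds.b get subtrees | jq '.[] | [.dir.path, .auth_first, .export_pin]'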

[ceph-users] Re: MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Flemming Frandsen
No, I don't. I guess I could pin a large part of the tree, if that's something that's likely to help. On Wed, 21 Apr 2021 at 21:02, Dan van der Ster wrote: > You don't pin subtrees ? > I would guess that something in the workload changed and it's triggering a > particularly bad behavior in the

[ceph-users] Re: MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Dan van der Ster
You don't pin subtrees ? I would guess that something in the workload changed and it's triggering a particularly bad behavior in the md balancer. Increase debug_mds gradually on both mds's; hopefully that gives a hint as to what it's doing. .. dan On Wed, Apr 21, 2021, 8:48 PM Flemming Frandsen
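
A sketch of bumping the debug level centrally (10 is an arbitrary intermediate value; revert once enough log has been captured):

    # Raise MDS debug logging for all MDS daemons, then revert
    ceph config set mds debug_mds 10
    # ...reproduce and inspect /var/log/ceph/ceph-mds.*.log, then:
    ceph config rm mds debug_mds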

[ceph-users] Re: MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Flemming Frandsen
Not as of yet; it's steadily getting further behind. We're now up to 6797 segments and there are still the same 14 long-running operations that are all "cleaned up request". Something is blocking trimming; normally I'd follow the advice of restarting the mds: https://docs.ceph.com/en/latest/cephfs/
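
For reference, a way to watch the segment count and the stuck ops from the admin socket (mds.ardmore is a placeholder taken from later in this digest, and the mds_log.seg counter name is an assumption):

    # Journal segment count vs. the usual target (mds_log_max_segments)
    ceph daemon mds.ardmore perf dump | jq '.mds_log.seg'
    # What the 14 long-running operations are actually waiting on
    ceph daemon mds.ardmore dump_ops_in_flight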

[ceph-users] New Ceph cluster- having issue with one monitor

2021-04-21 Thread Robert W. Eckert
Hi, I have pieced together some PCs which I had been using to run a Windows DFS cluster. The 3 servers all have 3 4TB hard drives and 1 2TB SSD, but they have different CPUs. All of them are running RHEL8 and have 2.5 Gbps NICs in them. The install was with cephadm, and the ceph processes are

[ceph-users] Re: MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Dan van der Ster
Did this eventually clear? We had something like this happen once when we changed an md export pin for a very top level directory from mds.3 to mds.0. This triggered so much subtree export work that it took something like 30 minutes to complete. In our case the md segments kept growing into a few 1

[ceph-users] Re: osd nearfull is not detected

2021-04-21 Thread Dan van der Ster
Are you currently doing IO on the relevant pool? Maybe nearfull isn't reported until some pgstats are reported. Otherwise sorry I haven't seen this. Dan On Wed, Apr 21, 2021, 8:05 PM Konstantin Shalygin wrote: > Hi, > > On the adopted cluster Prometheus was triggered for "osd full > 90%" >
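
A few checks that might help here (just the standard commands, nothing specific to the adopted cluster):

    ceph osd dump | grep -i ratio   # full / backfillfull / nearfull thresholds
    ceph osd df                     # per-OSD %USE, to compare against nearfull_ratio
    ceph health detail              # lists nearfull OSDs once the flag is actually raised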

[ceph-users] Re: EC Backfill Observations

2021-04-21 Thread Josh Durgin
On 4/21/21 9:29 AM, Josh Baergen wrote: Hey Josh, Thanks for the info! With respect to reservations, it seems like an oversight that we don't reserve other shards for backfilling. We reserve all shards for recovery [0]. Very interesting that there is a reservation difference between backfill

[ceph-users] Re: MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Flemming Frandsen
I've gone through the clients mentioned by the ops in flight and none of them are connected any more. The number of segments that the MDS is behind on is rising steadily and the ops_in_flight remain; this feels a lot like a catastrophe brewing. The documentation suggests trying to restart the MDS

[ceph-users] MDS_TRIM 1 MDSs behind on trimming and

2021-04-21 Thread Flemming Frandsen
I've just spent a couple of hours waiting for an MDS server to replay a journal that it was behind on and it seems to be getting worse. The system is not terribly busy, but there are 14 ops in flight that are very old and do not seem to go away on their own. Is there anything I can do to unwedge

[ceph-users] Re: MDS replay takes forever and cephfs is down

2021-04-21 Thread Flemming Frandsen
On Wed, 21 Apr 2021 at 16:57, Patrick Donnelly wrote: > It's probably that you have a very large journal (behind on trimming). > Hmm, yes, that might be it. Is that related to the "MDSs behind on trimming" warning? According to the documentation that has to do with trimming the cache. Is there any wa

[ceph-users] Re: Getting `InvalidInput` when trying to create a notification topic with Kafka endpoint

2021-04-21 Thread Yuval Lifshitz
Hi Istvan, Can you please share the relevant part for the radosgw log, indicating which input was invalid? The only way I managed to reproduce that error is by sending the request to a non-HTTPS radosgw (which does not seem to be your case). In such a case it replies with "InvalidInput" because we

[ceph-users] Re: Swift Stat Timeout

2021-04-21 Thread Dylan Griff
Just to close the loop on this one in case someone reads this in the future. We were encountering this bug: https://tracker.ceph.com/issues/44671 And updating to 14.2.20 solved it. Cheers, Dylan On Thu, 2021-04-15 at 18:48 +, Dylan Griff wrote: > > Just some more info on this, it started

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Dan van der Ster
hdd only. ~160k objects per PG. The flapping is pretty rare -- we've moved hundreds of PGs today and only one flap. (This is with osd_heartbeat_grace = 45; with the default 20s we had one flap per ~hour.) -- dan On Wed, Apr 21, 2021 at 5:20 PM Konstantin Shalygin wrote: > > This is hdd or hybrids
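
As a sketch, the grace bump mentioned above can be set cluster-wide from the config database (45 is the value quoted in the thread; drop it again once the PG removals have finished):

    ceph config set osd osd_heartbeat_grace 45
    # later, back to the default:
    ceph config rm osd osd_heartbeat_grace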

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Konstantin Shalygin
Are these HDD or hybrid OSDs? How many objects per PG on average? k Sent from my iPhone > On 21 Apr 2021, at 17:44, Dan van der Ster wrote: > > Here's a tracker: https://tracker.ceph.com/issues/50466 > > bluefs_buffered_io is indeed enabled on this cluster, but I suspect it > doesn't help for this prec

[ceph-users] Re: MDS replay takes forever and cephfs is down

2021-04-21 Thread Patrick Donnelly
On Wed, Apr 21, 2021 at 7:39 AM Flemming Frandsen wrote: > > I tried restarting an MDS server using: systemctl restart > ceph-mds@ardmore.service > > This caused the standby server to enter replay state and the fs started > hanging for several minutes. > > In a slight panic I restarted the other m

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Dan van der Ster
Here's a tracker: https://tracker.ceph.com/issues/50466 bluefs_buffered_io is indeed enabled on this cluster, but I suspect it doesn't help for this precise issue because the collection isn't repeatedly fully listed any more. -- dan On Wed, Apr 21, 2021 at 4:22 PM Igor Fedotov wrote: > > Hi Dan,

[ceph-users] MDS replay takes forever and cephfs is down

2021-04-21 Thread Flemming Frandsen
I tried restarting an MDS server using: systemctl restart ceph-mds@ardmore.service This caused the standby server to enter replay state and the fs started hanging for several minutes. In a slight panic I restarted the other mds server, which was replaced by the standby server and it almost immedi

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Konstantin Shalygin
Just for the record - I enabled this for all OSDs on these clusters. k > On 21 Apr 2021, at 17:22, Igor Fedotov wrote: > > Curious if you had bluefs_buffered_io set to true when faced that?
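
For illustration, enabling it cluster-wide and checking it on one OSD might look like this (osd.0 is a placeholder; depending on the release an OSD restart may be needed for it to take effect):

    ceph config set osd bluefs_buffered_io true
    ceph config get osd.0 bluefs_buffered_io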

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Konstantin Shalygin
Nope, upmap is currently impossible on these clusters 😬 due to the client lib (the guys are working on an update now). ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME -166 10.94385 - 11 TiB 382 GiB 317 GiB 64 KiB 66 GiB 11 TiB 3.42 1.00 -

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Igor Fedotov
Hi Dan, I recall no relevant tracker, feel free to create. Curious if you had bluefs_buffered_io set to true when faced that? Thanks, Igor On 4/21/2021 4:37 PM, Dan van der Ster wrote: Do we have a tracker for this? We should ideally be able to remove that final collection_list from the op

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Dan van der Ster
Yes, with the fixes in 14.2.19 PG removal is really much much much better than before. But on some clusters (in particular with rocksdb on the hdd) there is still a rare osd flap at the end of the PG removal -- indicated by the logs I shared earlier. Our workaround to prevent that new flap is to i

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Konstantin Shalygin
Dan, are you talking about this issue [1]? I started to backfill new OSDs on 14.2.19: the pool has 2048 PGs with 7.14G objects... the avg number of objects per PG is 3486328. [1] https://tracker.ceph.com/issues/47044 k > On 21 Apr 2021, at 16:37, Dan van der Ster wrote:

[ceph-users] Metrics for object sizes

2021-04-21 Thread Szabo, Istvan (Agoda)
Hi, Is there any cluster-wide metric regarding object sizes? I'd like to collect some information about what object sizes the users have in their buckets.
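
There is no single built-in cluster-wide object-size metric that I know of, but an average per bucket can be derived from radosgw-admin bucket stats; a sketch, assuming the usual usage."rgw.main" fields (names may differ by release):

    # Rough average object size per bucket, in bytes
    radosgw-admin bucket stats | jq -r '.[]
      | select(.usage."rgw.main".num_objects > 0)
      | [.bucket, (.usage."rgw.main".size_actual / .usage."rgw.main".num_objects)]
      | @tsv'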

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-21 Thread Dan van der Ster
Do we have a tracker for this? We should ideally be able to remove that final collection_list from the optimized pg removal. It can take a really long time and lead to osd flapping: 2021-04-21 15:23:37.003 7f51c273c700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f51a3e81700' had timed o

[ceph-users] Re: HBA vs caching Raid controller

2021-04-21 Thread Marc
> > This is what I have when I query prometheus, most hdd's are still sata > 5400rpm, there are also some ssd's. I also did not optimize cpu > frequency settings. (forget about the instance=c03, that is just because > the data comes from mgr c03, these drives are on different hosts) > > > > ceph_os

[ceph-users] Re: ceph orch upgrade fails when pulling container image

2021-04-21 Thread Julian Fölsch
Hello, We have circumvented the need to create an account by using Sonatype Nexus to proxy Docker Hub. This also allowed us to keep our Ceph hosts disconnected from the internet. Kind regards, Julian Fölsch Am 21.04.21 um 10:35 schrieb Robert Sander: Hi, Am 21.04.21 um 10:14 schrieb Robert

[ceph-users] Re: ceph orch upgrade fails when pulling container image

2021-04-21 Thread Robert Sander
Hi, Am 21.04.21 um 10:14 schrieb Robert Sander: > How do I update a Ceph cluster in this situation? I learned that I need to create an account on the website hub.docker.com to be able to download Ceph container images in the future. With the credentials I need to run "docker login" on each node
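
For completeness, a sketch of what that can look like with cephadm (registry URL, user and password are placeholders; cephadm can also store the credentials so hosts added later are logged in automatically):

    # Manual login on each host
    docker login -u <dockerhub-user> docker.io
    # Or record the registry credentials once via the orchestrator
    ceph cephadm registry-login docker.io <dockerhub-user> <password>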

[ceph-users] ceph orch upgrade fails when pulling container image

2021-04-21 Thread Robert Sander
Hi, # docker pull ceph/ceph:v16.2.1 Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit How do I update a Ceph cluster in this situation? Regards -- Robert Sand

[ceph-users] Getting `InvalidInput` when trying to create a notification topic with Kafka endpoint

2021-04-21 Thread Szabo, Istvan (Agoda)
Hi Ceph Users, Here is the latest request I tried, but it's still not working: curl -v -H 'Date: Tue, 20 Apr 2021 16:05:47 +' -H 'Authorization: AWS :' -L -H 'content-type: application/x-www-form-urlencoded' -k -X POST https://servername -d Action=CreateTopic&Name=test-ceph-event-replication&Attri
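
For comparison, a hedged sketch of the same CreateTopic request with the attribute encoding RGW's SNS-compatible API expects (broker address, credentials and signature are placeholders; the original request above is truncated in the archive):

    curl -v -k -X POST 'https://servername' \
      -H 'content-type: application/x-www-form-urlencoded' \
      -H 'Authorization: AWS <access_key>:<signature>' \
      --data-urlencode 'Action=CreateTopic' \
      --data-urlencode 'Name=test-ceph-event-replication' \
      --data-urlencode 'Attributes.entry.1.key=push-endpoint' \
      --data-urlencode 'Attributes.entry.1.value=kafka://<broker-host>:9092'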