[ceph-users] Re: Reef: RGW logrotate "breaks" log_to_file

2025-06-25 Thread Reid Guyett
Is there any negative to adding notif-worker0 to the logrotate pkill list? On Wed, Jun 25, 2025 at 8:32 AM Casey Bodley wrote: > On Wed, Jun 25, 2025 at 8:29 AM Casey Bodley wrote: > > > > hi Eugen, > > > > this is tracked in https://tracker.ceph.com/issues/71156, and a fix > > has merged for t

[ceph-users] Re: radosgw daemons with "stuck ops"

2025-01-27 Thread Reid Guyett
mal. Anybody shed some light on what the logs mean? Also is there a way to link the req 3636357367581543624 to logs at lower levels? I thought that I would just need to convert to hex but it doesn't line up with the started/completed requests. Thanks, Reid On Mon, Jan 27, 2025 at 11:36 AM Joshua
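
For reference, the decimal-to-hex conversion mentioned in the message can be sketched as below. This is only an illustration of the conversion itself; the helper name `req_id_to_hex` is hypothetical, and the message reports that the hex form did not line up with the started/completed request lines, so this should not be read as the way RGW correlates log entries.

```python
def req_id_to_hex(req_id: int) -> str:
    """Return the lowercase hex form (no 0x prefix) of a decimal request id."""
    return format(req_id, "x")


def hex_to_req_id(hex_id: str) -> int:
    """Inverse conversion, for checking a hex id seen elsewhere against the decimal one."""
    return int(hex_id, 16)


if __name__ == "__main__":
    # Request id quoted in the message above.
    print(req_id_to_hex(3636357367581543624))
```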

[ceph-users] radosgw daemons with "stuck ops"

2025-01-27 Thread Reid Guyett
Hello, We are experiencing slowdowns on one of our radosgw clusters. We restart the radosgw daemons every 2 hours and things start getting slow after an hour and a half. The avg get/put latencies go from 20ms/400ms to 1s/5s+ according to the metrics. When I stop traffic to one of the radosgw daemo

[ceph-users] Re: Watcher Issue

2025-01-23 Thread Reid Guyett
Hi, I've had a similar issue but outside of ceph-csi. Running a CRUD test to (create, map, write, read, unmap, and delete) an RBD in a short amount of time can result in it having a stuck watcher. I assume it is from mapping and unmapping very quickly (under 30 sec). What I have found is if you re

[ceph-users] Lifecycle Stuck PROCESSING and UNINITIAL

2024-10-17 Thread Reid Guyett
Hello, I am experiencing an issue where it seems all lifecycles are showing either PROCESSING or UNINITIAL. > # radosgw-admin lc list > [ > { > "bucket": > ":tesra:5e9bc383-f7bd-4fd1-b607-1e563bfe0011.833499554.20", > "shard": "lc.0", > "started": "Thu, 17 Oct 2024 00:

[ceph-users] Re: Radosgw bucket check fix doesn't do anything

2024-09-20 Thread Reid Guyett
"${key}" ; done < > UploadId-to-shard.txt > rmomapkey.log > > The difference with your case is that we could list them with 'aws s3api > list-multipart-uploads', but maybe you can identify the omapkeys to remove > based on the 'invalid_multipart_entries

[ceph-users] Re: Radosgw bucket check fix doesn't do anything

2024-09-19 Thread Reid Guyett
at 5:50 AM Frédéric Nass < frederic.n...@univ-lorraine.fr> wrote: > Oh, by the way, since 35470 is near two times 18k, couldn't it be that the > source bucket is versioned and the destination bucket only got the most > recent copy of each object? > > Regards, > Frédéric

[ceph-users] Re: Radosgw bucket check fix doesn't do anything

2024-09-18 Thread Reid Guyett
--bucket mimir-prod | jq -r '.Uploads[] | > "--key \"\(.Key)\" --upload-id \(.UploadId)"' > abort-multipart-upload.txt > > ~/ max=$(cat abort-multipart-upload.txt | wc -l); i=1; while read -r line; > do echo -n "$i/$max"; ((i=i+1)); eval "
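
The jq expression quoted in the snippet above turns each entry of the `Uploads` array into one line of `--key ... --upload-id ...` arguments. A minimal Python sketch of the same transformation follows, assuming the input is the JSON printed by `aws s3api list-multipart-uploads`; the function name `abort_args` is hypothetical, and unlike the jq filter this version returns an empty list when `Uploads` is absent.

```python
import json


def abort_args(list_uploads_json: str) -> list:
    """Build one '--key "<Key>" --upload-id <UploadId>' line per multipart upload."""
    data = json.loads(list_uploads_json)
    # "Uploads" is missing when the bucket has no in-progress multipart uploads.
    uploads = data.get("Uploads") or []
    return [f'--key "{u["Key"]}" --upload-id {u["UploadId"]}' for u in uploads]


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    sample = json.dumps({"Uploads": [{"Key": "dir/object", "UploadId": "example-id"}]})
    for line in abort_args(sample):
        print(line)
```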

[ceph-users] Radosgw bucket check fix doesn't do anything

2024-09-17 Thread Reid Guyett
Hello, I recently moved a bucket from 1 cluster to another cluster using rclone. I noticed that the source bucket had around 35k objects and the destination bucket only had around 18k objects after the sync was completed. Source bucket stats showed: > radosgw-admin bucket stats --bucket mimir-pr

[ceph-users] Re: RBD Stuck Watcher

2024-07-30 Thread Reid Guyett
Hi, It sounds similar. How would I best be able to confirm it? Logs? Which log/message if so? Thanks On Thu, Jul 25, 2024 at 6:11 AM Ilya Dryomov wrote: > On Wed, Jul 3, 2024 at 5:45 PM Reid Guyett wrote: > > > > Hi, > > > > I have a small script in a Docker containe

[ceph-users] RBD Stuck Watcher

2024-07-03 Thread Reid Guyett
Hi, I have a small script in a Docker container we use for a type of CRUD test to monitor availability. The script uses Python librbd/librados and is launched by Telegraf input.exec. It does the following: 1. Creates an rbd image 2. Writes a small amount of data to the rbd 3. Reads the d

[ceph-users] Re: CORS Problems

2024-06-05 Thread Reid Guyett
Hi, There is a bug with preflight on PUT requests: https://tracker.ceph.com/issues/64308. We have worked around it by stripping the query parameters of OPTIONS requests to the RGWs. Nginx proxy config: if ($request_method = OPTIONS) { rewrite ^\/(.+)$ /$1? break; } Regards, Reid On Wed, Jun
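
The effect of the Nginx rewrite quoted above (drop the query string from OPTIONS requests before they reach the RGWs) can be sketched in Python as follows. This is an approximation for illustration, not a replacement for the proxy rule: the function name `strip_query_for_options` is hypothetical, and the sketch does not reproduce the detail that the regex `^\/(.+)$` requires at least one character after the leading slash.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_for_options(method: str, url: str) -> str:
    """Return the URL without its query string for OPTIONS requests; others pass through."""
    if method != "OPTIONS":
        return url
    parts = urlsplit(url)
    # Rebuild the URL with an empty query component.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


if __name__ == "__main__":
    print(strip_query_for_options("OPTIONS", "/bucket/key?uploadId=abc&partNumber=1"))
    print(strip_query_for_options("PUT", "/bucket/key?uploadId=abc&partNumber=1"))
```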

[ceph-users] Traefik front end with RGW

2024-05-23 Thread Reid Guyett
Hello, We are considering moving from Nginx to Traefik as the frontend for our RGW services. Prior to putting into production I ran it through s3-tests and noticed that all of the tests involving metadata (x-amz-meta-*) are failing because they are expected to be lowercase (test_s3.py::test_object

[ceph-users] RGW services crashing randomly with same message

2024-04-03 Thread Reid Guyett
Hello, We are currently experiencing a lot of rgw service crashes that all seem to terminate with the same message. We have kept our RGW services at 17.2.5 but the rest of the cluster is 17.2.7 due to a bug introduced in 17.2.7. terminate called after throwing an instance of > 'ceph::buffer::v15_