[ceph-users] Re: Storing 20 billion immutable objects in Ceph, 75% <16KB

2021-02-22 Thread Benoît Knecht
Hi, On Sunday, February 21st, 2021 at 12:39, Loïc Dachary wrote: > For the record, here is a summary of the key takeaways from this conversation > (so far): > > - Ambry[0] is a perfect match and I'll keep exploring it[1]. > - To keep billions of small objects manageable, they must be packed

[ceph-users] Re: Resolving LARGE_OMAP_OBJECTS

2021-03-04 Thread Benoît Knecht
Hi Drew, On Thursday, March 4th, 2021 at 15:18, Drew Weaver wrote: > Howdy, the dashboard on our cluster keeps showing LARGE_OMAP_OBJECTS. > > I went through this document > > https://www.suse.com/support/kb/doc/?id=19698 > > I've found that we have a total of 5 buckets, each one is owned by
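
One way to map that warning back to specific buckets (a sketch, assuming a reasonably recent radosgw-admin) is to let RGW report how full each bucket's index shards are:

    # per-bucket fill status of the index shards vs. the resharding threshold
    radosgw-admin bucket limit check

    # shard count and object count for a single bucket
    radosgw-admin bucket stats --bucket=<bucket-name>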

[ceph-users] Re: Resolving LARGE_OMAP_OBJECTS

2021-03-05 Thread Benoît Knecht
On Friday, March 5th, 2021 at 15:20, Drew Weaver wrote: > Sorry to sound clueless but no matter what I search for on El Goog I can't > figure out how to answer the question as to whether dynamic sharding is > enabled in our environment. > > It's not configured as true in the config files, but it
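
For what it's worth, the value that matters is the one the running daemons actually use, not just what is in ceph.conf; a quick way to check it (a sketch, the admin socket path varies by deployment) is:

    # effective value from the config database (Mimic and later)
    ceph config get client.rgw rgw_dynamic_resharding

    # or ask a running radosgw through its admin socket
    ceph daemon /var/run/ceph/ceph-client.rgw.<name>.asok config get rgw_dynamic_resharding

rgw_dynamic_resharding defaults to true on Luminous and later, so it is enabled unless it was explicitly turned off.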

[ceph-users] Re: Resolving LARGE_OMAP_OBJECTS

2021-03-08 Thread Benoît Knecht
Hi Drew, On Friday, March 5th, 2021 at 20:22, Drew Weaver wrote: > Sorry for multi-reply, I got that command to run: > > for obj in $(rados -p default.rgw.buckets.index ls | grep > 2b67ef7c-2015-4ca0-bf50-b7595d01e46e.74194.637); do printf "%-60s %7d\n" $obj > $(rados -p default.rgw.buckets.ind
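
For reference, the full form of that loop (assuming the goal is to count omap keys in each index shard of the bucket with that marker) would be something like:

    for obj in $(rados -p default.rgw.buckets.index ls | grep 2b67ef7c-2015-4ca0-bf50-b7595d01e46e.74194.637); do
      # print the shard object name and its omap key count
      printf "%-60s %7d\n" $obj $(rados -p default.rgw.buckets.index listomapkeys $obj | wc -l)
    done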

[ceph-users] Re: Unable to delete bucket - endless multipart uploads?

2021-03-08 Thread Benoît Knecht
Hi Dave, On Tuesday, February 23rd, 2021 at 21:56, David Monschein wrote: > We're running 14.2.16 on our RGWs and OSD nodes. Anyone have any ideas? Is > it possible to target this bucket via rados directly to try and delete it? > > I'm wary of doing stuff like that though. Thanks in advance. I
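
One thing worth trying before touching rados directly (a sketch; endpoint and bucket name are placeholders): abort the incomplete multipart uploads through the S3 API, then let radosgw-admin purge whatever is left:

    # list and abort incomplete multipart uploads
    aws --endpoint-url https://rgw.example.com s3api list-multipart-uploads --bucket mybucket
    aws --endpoint-url https://rgw.example.com s3api abort-multipart-upload --bucket mybucket --key <Key> --upload-id <UploadId>

    # then remove the bucket together with any remaining objects
    radosgw-admin bucket rm --bucket=mybucket --purge-objects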

[ceph-users] Re: Resolving LARGE_OMAP_OBJECTS

2021-03-30 Thread Benoît Knecht
Hi David, On Tuesday, March 30th, 2021 at 00:50, David Orman wrote: > Sure enough, it is more than 200,000, just as the alert indicates. > However, why did it not reshard further? Here's the kicker - we only > see this with versioned buckets/objects. I don't see anything in the > documentation th
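
As a sanity check (a sketch; bucket name and shard count are examples), you can compare the bucket's current shard count against its object count and queue a manual reshard if dynamic resharding didn't keep up, keeping in mind that resharding versioned buckets has had issues on some releases:

    radosgw-admin bucket stats --bucket=mybucket | grep -E 'num_shards|num_objects'

    # queue and run a manual reshard
    radosgw-admin reshard add --bucket=mybucket --num-shards=101
    radosgw-admin reshard process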

[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Benoît Knecht
Hi Peter, On Thursday, January 12th, 2023 at 15:12, Peter van Heusden wrote: > I have a Ceph installation where some of the OSDs were misconfigured to use > 1GB SSD partitions for rocksdb. This caused a spillover ("BlueFS spillover > detected"). I recently upgraded to quincy using cephadm (17.2.
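
To see whether the spillover itself is still there even though the warning is gone (a sketch; the OSD id is a placeholder, and with cephadm the admin socket lives inside the container), the bluefs counters show how much metadata sits on the slow device:

    # non-zero slow_used_bytes means RocksDB is spilling onto the main device
    ceph tell osd.<id> perf dump bluefs

    # or, on the OSD host, via the admin socket
    ceph daemon osd.<id> perf dump bluefs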

[ceph-users] High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-24 Thread Benoît Knecht
Hi, We have a Nautilus (14.2.9) Ceph cluster with two types of HDDs: - TOSHIBA MG07ACA14TE [1] - HGST HUH721212ALE604 [2] They're all bluestore OSDs with no separate DB+WAL and part of the same pool. We noticed that while the HGST OSDs have a commit latency of about 15ms, the Toshiba OSDs h
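
For anyone who wants to reproduce the comparison, the latency numbers come straight from the cluster (the Prometheus metric in the subject is derived from the same counters):

    # per-OSD commit and apply latency in milliseconds
    ceph osd perf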

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-24 Thread Benoît Knecht
Thank you all for your answers, this was really helpful! Stefan Priebe wrote: > yes we have the same issues and switched to seagate for those reasons. > you can fix at least a big part of it by disabling the write cache of > those drives - generally speaking it seems the toshiba firmware is > brok
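
For reference, disabling the volatile write cache on a drive (device name is a placeholder; worth testing on a single OSD first) can be done with hdparm for SATA drives or sdparm for SAS drives:

    # SATA
    hdparm -W 0 /dev/sdX

    # SAS, clears the WCE bit in the caching mode page
    sdparm --clear WCE --save /dev/sdX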

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-24 Thread Benoît Knecht
Hi Igor, Igor Fedotov wrote: > for the sake of completeness one more experiment please if possible: > > turn off write cache for HGST drives and measure commit latency once again. I just did the same experiment with HGST drives, and disabling the write cache on those drives brought the latency do
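
Note that on many drives the hdparm setting does not survive a power cycle; one common way to make it persistent (a sketch, the rule file name is arbitrary and the hdparm path is distro-dependent) is a udev rule:

    # /etc/udev/rules.d/99-disable-write-cache.rules
    ACTION=="add|change", KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="1", RUN+="/usr/sbin/hdparm -W 0 /dev/%k"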

[ceph-users] Re: Problem with OSD::osd_op_tp thread had timed out and other connected issues

2020-07-20 Thread Benoît Knecht
Hi Jan, Jan Pekař wrote: > Also I'm concerned that this OSD restart caused data degradation and > recovery - the cluster should be clean immediately after the OSD comes up when no > client was uploading/modifying data during my tests. We're experiencing the same thing on our 14.2.10 cluster. After marking a
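
This doesn't explain the degraded objects, but for planned restarts the usual way to limit data movement (a sketch) is to set the relevant flags beforehand and clear them once the OSD is back up:

    ceph osd set noout
    ceph osd set norebalance
    # ... restart the OSD ...
    ceph osd unset norebalance
    ceph osd unset noout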

[ceph-users] Re: Usable space vs. Overhead

2020-07-29 Thread Benoît Knecht
Aren't you just looking at the same thing from two different perspectives? In one case you say: I have 100% of useful data, and I need to add 50% of parity for a total of 150% raw data. In the other, you say: Out of 100% of raw data, 2/3 is useful data, 1/3 is parity, which gives you your 33.3%
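
As a concrete (hypothetical) example, an erasure-coded pool with k=2, m=1 stores two data chunks plus one coding chunk per object: raw/usable = 3/2 = 150%, and the coding chunk is 1/3 ≈ 33.3% of the raw capacity, so both descriptions refer to the same layout.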