Dear fellow cephers,
we have a problem with ceph df: it reports an incorrect USED value. It would be
great if someone could look at this; if a ceph operator doesn't discover the
issue, they might run out of space without noticing.
This has been reported before but didn't get much attention:
https:
Hi community,
I have multiple buckets that were deleted, but the lifecycle of each bucket
still exists. How can I delete it with radosgw-admin? The user can't access the
bucket to delete the lifecycle; the user for this bucket no longer exists.
root@ceph:~# radosgw-admin lc list
[
{
"bucket": ":r30203:f3fe
Hi,
I think running "lc reshard fix" will fix this.
Matt
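For reference, on recent releases that would look roughly like this (the bucket
name is only an example; as far as I know the command can also be run without
--bucket to walk all buckets):

radosgw-admin lc list                                       # see which stale entries remain
radosgw-admin lc reshard fix --bucket my-deleted-bucket     # fix a single bucket
radosgw-admin lc reshard fix                                # or let it walk all buckets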
On Wed, Dec 6, 2023 at 5:48 AM VÔ VI wrote:
> Hi community,
>
> I have multiple buckets that were deleted, but the lifecycle of each bucket
> still exists. How can I delete it with radosgw-admin? The user can't access
> the bucket to delete the lifecyc
Hey Frank,
+1 to this, we've seen it a few times now. I've attached an output of ceph
df from an internal cluster we have with the same issue.
[root@Cluster1 ~]# ceph df
--- RAW STORAGE ---
CLASS      SIZE     AVAIL    USED    RAW USED  %RAW USED
fast_nvme 596 GiB 595 GiB 50 MiB 1.0 GiB
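Not a fix, but for anyone cross-checking the numbers, these are the standard
views worth putting side by side when USED looks off (assuming a reasonably
recent release):

[root@Cluster1 ~]# ceph df detail     # per-pool STORED vs USED, incl. compression savings
[root@Cluster1 ~]# ceph osd df tree   # raw per-OSD utilisation, rolled up by CRUSH bucket
[root@Cluster1 ~]# rados df           # per-pool objects and space as seen by RADOS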
On 06/12/2023 at 00:11, Rich Freeman wrote:
On Tue, Dec 5, 2023 at 6:35 AM Patrick Begou
wrote:
OK, so I've misunderstood the meaning of failure domain. If there is no
way to request using 2 OSDs per node with node as the failure domain, then with
5 nodes k=3+m=1 is not secure enough and I will have to use
Hi Patrick,
Yes, K and M are chunks, but the default crush map places one chunk per host,
which is probably the best way to do it, though I'm no expert. I'm not sure
why you would want a crush map with 2 chunks per host and min_size 4,
as it's just asking for trouble at some point, in my opinion. Any
Hi,
the post linked in the previous message is a good source for different
approaches.
To provide some first-hand experience, I was operating a pool with a 6+2 EC
profile on 4 hosts for a while (until we got more hosts) and the "subdivide a
physical host into 2 crush-buckets" approach is actua
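For anyone who wants to try that layout, a rough sketch of the CRUSH side,
assuming 4 hosts, a 6+2 profile and two chunks per host (the rule name and id
are made up, adapt them to your own map):

ceph osd getcrushmap -o crushmap.bin        # export the current map
crushtool -d crushmap.bin -o crushmap.txt   # decompile it to text

# add a rule along these lines to crushmap.txt
rule ec62_two_chunks_per_host {
    id 99
    type erasure
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step choose indep 4 type host       # pick 4 distinct hosts
    step chooseleaf indep 2 type osd    # then 2 OSDs within each host
    step emit
}

crushtool -c crushmap.txt -o crushmap.new   # recompile
ceph osd setcrushmap -i crushmap.new        # inject the new map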
Closing the loop (blocked waiting for Neha's input): how are we using Gibba
on a day-to-day basis? Is it only used for checking reef point releases?
- To be discussed again next week, as Neha had a conflict
[Nizam] http://old.ceph.com/pgcalc is not working anymore; is there any
replacement for
On Wed, Dec 6, 2023 at 8:26 AM Laura Flores wrote:
> Closing the loop (blocked waiting for Neha's input): how are we using
> Gibba on a day-to-day basis? Is it only used for checking reef point
> releases?
>
>- To be discussed again next week, as Neha had a conflict
>
> [Nizam] http://old.cep
On Wed, Dec 6, 2023 at 9:25 AM Patrick Begou
wrote:
>
> My understanding was that k and m were for EC chunks, not hosts. 🙁 Of
> course, if k and m are hosts, the best choice would be k=2 and m=2.
A few others have already replied - as they said if the failure domain
is set to host then it will put only
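For completeness, the profile side of the host failure domain case would look
something like this (profile and pool names are only examples):

ceph osd erasure-code-profile set ec22-host k=2 m=2 crush-failure-domain=host
ceph osd erasure-code-profile get ec22-host               # verify the settings
ceph osd pool create ecpool-example 64 erasure ec22-host  # pool using the profile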
On 06/12/2023 at 16:21, Frank Schilder wrote:
Hi,
the post linked in the previous message is a good source for different
approaches.
To provide some first-hand experience, I was operating a pool with a 6+2 EC profile on 4
hosts for a while (until we got more hosts) and the "subdivide a phys
Dear all,
I am reaching out regarding an issue with our Ceph cluster that has been
recurring every six hours. Upon investigating the problem using the "ceph
daemon dump_historic_slow_ops" command, I observed that the issue appears to be
related to slow operations, specifically getting stuck at
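(The command is run against a specific daemon's admin socket; the OSD id here is
only an example.)

ceph daemon osd.12 dump_historic_slow_ops   # recent slow operations
ceph daemon osd.12 dump_historic_ops        # recent completed operations
ceph daemon osd.12 ops                      # operations currently in flight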
Hi Peter,
try to set the cluster to nosnaptrim.
If this helps, you might need to upgrade to Pacific, because you are hit by the
pg dups bug.
See: https://www.clyso.com/blog/how-to-identify-osds-affected-by-pg-dup-bug/
Best regards
- Boris Behrens
> On 06.12.2023 at 19:01, ... wrote:
Thank you for pointing this out. I checked my cluster using the command given
in the article, and there are over 17 million PG dups on each OSD.
May I know if the snaptrim activity takes place every six hours? If I disable
snaptrim, will it stop the slow ops temporarily before I perform the
version u
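For reference, the check from the article is roughly along these lines (the OSD
has to be stopped first; the data path and pgid are just examples):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op log --pgid 1.0 > pglog.json
jq '(.pg_log_t.log | length), (.pg_log_t.dups | length)' pglog.json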
Snaptrim is the process of removing a snapshot and reclaiming disk space after
it got deleted. I don't know how the Ceph internals work, but it helped for us.
You can try to move the snaptrim into specific timeframes and limit it to one
per OSD. Also, sleeping (3s worked for us) between the delet
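In terms of knobs, a minimal sketch of what that looks like on a release with
centralized config (the 3s sleep is the value mentioned above; adjust to taste):

ceph osd set nosnaptrim                        # pause snaptrim cluster-wide
ceph config set osd osd_max_trimming_pgs 1     # at most one PG trimming per OSD
ceph config set osd osd_snap_trim_sleep 3      # sleep 3s between trim operations
ceph osd unset nosnaptrim                      # resume when ready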
Hi Matt,
Thanks Matt, but it's not working for me; running radosgw-admin lc reshard
fix doesn't change anything.
On Wed, Dec 6, 2023 at 20:18, Matt Benjamin wrote:
> Hi,
>
> I think running "lc reshard fix" will fix this.
>
> Matt
>
>
> On Wed, Dec 6, 2023 at 5:48 AM VÔ VI wrote:
>
Hi, did you unmount your clients after the cluster poweroff? You could
also enable debug logs in mds to see more information. Are there any
blocked requests? You can query the mds daemon via cephadm shell or
with an admin keyring like this:
# ceph tell mds.cephfs.storage.lgmyqv dump_blocked
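A sketch of the debug-logging part, assuming centralized config (the levels are
only examples; remember to turn them back down afterwards):

# ceph config set mds debug_mds 10
# ceph config set mds debug_ms 1
# ... reproduce the issue, then restore the defaults ...
# ceph config set mds debug_mds 1/5
# ceph config set mds debug_ms 0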
On Thu, Dec 7, 2023 at 12:49 PM Eugen Block wrote:
>
> Hi, did you unmount your clients after the cluster poweroff?
If this is the case, then a remount should get things working again.
> You could
> also enable debug logs in mds to see more information. Are there any
> blocked requests? You can q