[ceph-users] Refill snaptrim queue after triggering bug #54396

2022-06-28 Thread Kári Bertilsson
Hello all, I triggered the bug at https://tracker.ceph.com/issues/54396 some months ago and have been looking ever since for a way to reclaim the space that is stuck in use. I have since updated to, and am currently running, ceph 17.2.0. I have already tried adding/removing snapshots to trigger a refresh, and repeering all PGs. Re
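
A minimal sketch for checking whether any snaptrim work is still queued at all, assuming the SNAPTRIMQ_LEN column that recent releases show in the per-PG dump; the PG ID in the repeer example is a placeholder.

    # The SNAPTRIMQ_LEN column shows how much snaptrim work each PG still has queued
    ceph pg dump pgs | less -S

    # Re-peer a single PG, one of the steps mentioned above (placeholder PG ID)
    ceph pg repeer 20.1f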

[ceph-users] Re: CephFS snaptrim bug?

2022-06-25 Thread Kári Bertilsson
Hello, I am also having this issue after previously setting osd_pg_max_concurrent_snap_trims = 0 to pause snaptrim. I upgraded to ceph 17.2.0. I have tried restarting, repeering, and deep-scrubbing all OSDs; so far nothing works. For one of the affected pools, `cephfs_10k`, I have tested removing
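
A minimal sketch of resuming snaptrim by undoing that override, assuming it was set through the central config database; an explicit non-zero value works as well.

    # Drop the override so the default (non-zero) value applies again
    ceph config rm osd osd_pg_max_concurrent_snap_trims

    # Or set an explicit value cluster-wide
    ceph config set osd osd_pg_max_concurrent_snap_trims 2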

[ceph-users] Multiple OSD crashing within short timeframe in production cluster running pacific

2021-09-13 Thread Kári Bertilsson
Hello everyone, I have been running ceph for the last 2 years with a great experience so far. Yesterday I started encountering some strange issues. All OSDs are part of an erasure-coded pool with k=8, m=2 and a host failure domain. Neither yesterday's nor today's crashing OSDs show any symptoms of
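
A hedged sketch of the crash-triage commands available on Pacific; the crash ID is a placeholder taken from the ls-new output.

    # Crashes recorded by the crash module since the last archive
    ceph crash ls-new

    # Backtrace and metadata for one crash (ID comes from the listing above)
    ceph crash info <crash-id>

    # Acknowledge the crashes once reviewed
    ceph crash archive-all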

[ceph-users] Unfound objects after upgrading from octopus to pacific

2021-07-18 Thread Kári Bertilsson
Hello, I am running an EC 8+2 setup on Proxmox. I upgraded from octopus 15.2.11 to pacific 16.2.4. Before upgrading, the cluster was in a healthy state and all 2,449 PGs were active+clean. After restarting all OSDs on pacific, I have 23 PGs each showing 1 unfound object and 3 degraded objects. I have
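
A minimal sketch for locating and inspecting the unfound objects; the PG ID is a placeholder, and mark_unfound_lost is destructive, so it is shown commented out as a last resort only.

    # Which PGs report unfound objects
    ceph health detail | grep -i unfound

    # The unfound objects themselves and the OSDs still being queried (placeholder PG ID)
    ceph pg 7.2a list_unfound
    ceph pg 7.2a query | grep -A6 might_have_unfound

    # Last resort, destructive: give up on the unfound objects
    # ceph pg 7.2a mark_unfound_lost revert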

[ceph-users] How to deal with "inconsistent+failed_repair" pgs on cephfs pool ?

2020-09-07 Thread Kári Bertilsson
Hello, I have a couple of PGs stuck inconsistent+failed_repair after trying "ceph pg repair" on them. All the OSDs are using bluestore. These PGs belong to an erasure-coded pool that backs cephfs. I did manage to find details about which objects are inconsistent by using the "rados list-inconsisten
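
A hedged sketch of the usual inspection loop for such a PG, assuming `jq` is available; the PG ID is a placeholder.

    # Which shards disagree and what kind of error each object has (placeholder PG ID)
    rados list-inconsistent-obj 21.33 --format=json-pretty | \
        jq '.inconsistents[] | {object: .object.name, errors: .errors}'

    # Deep-scrub again and retry the repair afterwards
    ceph pg deep-scrub 21.33
    ceph pg repair 21.33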

[ceph-users] How to force backfill on undersized pgs ?

2020-06-17 Thread Kári Bertilsson
Hello, I'm running ceph 14.2.9. During heavy backfilling due to rebalancing, one OSD crashed. I want to recover the data from the lost OSD before continuing the backfill, so I marked the lost OSD out and ran "ceph osd set norebalance". But I'm noticing that with the norebalance flag set, the system does
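
A minimal sketch of prioritising recovery of the undersized PGs while rebalancing stays paused, assuming the pg force-recovery / force-backfill commands available since Luminous; the PG IDs are placeholders.

    # Rebalancing stays paused, recovery of degraded/undersized PGs still runs
    ceph osd set norebalance

    # Push specific undersized PGs to the front of the recovery queue (placeholder IDs)
    ceph pg force-recovery 11.4 11.1f
    ceph pg force-backfill 11.4 11.1f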

[ceph-users] Re: OSD corruption and down PGs

2020-05-12 Thread Kári Bertilsson
se failing disks into? That's what I'm doing > right now with some failing disks. I've recovered 2 out of 6 osds that > failed in this way. I would recommend against using the same cluster for > this, but a stage cluster or something would be great. > > On Tue, May 12, 2

[ceph-users] Re: OSD corruption and down PGs

2020-05-12 Thread Kári Bertilsson
Paul > > -- > Paul Emmerich > > Looking for help with your Ceph cluster? Contact us at https://croit.io > > croit GmbH > Freseniusstr. 31h > 81247 München > www.croit.io > Tel: +49 89 1896585 90 > > > On Tue, May 12, 2020 at 2:07 PM Kári Bertilsson > wrote: >

[ceph-users] Re: OSD corruption and down PGs

2020-05-12 Thread Kári Bertilsson
Yes, the output of `ceph osd df tree` and `ceph -s` is at https://pastebin.com/By6b1ps1 On Tue, May 12, 2020 at 10:39 AM Eugen Block wrote: > Can you share your osd tree and the current ceph status? > > > Quoting Kári Bertilsson: > > > Hello > > > > I had an incident where 3

[ceph-users] OSD corruption and down PGs

2020-05-12 Thread Kári Bertilsson
a minor corruption but at important locations. Any ideas on how to recover from this kind of scenario? Any tips would be highly appreciated. Best regards, Kári Bertilsson
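
One approach often used for down PGs on corrupted OSDs, sketched on the assumption that the failed OSDs' data directories are still readable enough to export from: copy the affected PGs out with ceph-objectstore-tool and import them into a healthy OSD. OSD numbers, the PG ID, and file paths are placeholders.

    # Stop the source OSD first, then export the PG from its store (placeholders throughout)
    systemctl stop ceph-osd@12
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --pgid 5.1c --op export --file /root/5.1c.export

    # Import into another stopped OSD, then bring it back up
    systemctl stop ceph-osd@7
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
        --op import --file /root/5.1c.export
    systemctl start ceph-osd@7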

[ceph-users] Re: Ghost usage on pool and unable to reclaim free space.

2020-04-18 Thread Kári Bertilsson
2.6M objects with your purge command, so I guess that > 5.25M objects would be something else like logs. > Have you checked the osd df detail and osd metadata? > I have a case in which OSDs' bluestore logs are eating up my whole space, > maybe you are facing a similar one. > > Regar
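
A hedged sketch of the checks suggested in this reply, plus the pool-level view; the OSD ID is a placeholder.

    # Per-OSD usage including omap and metadata overhead
    ceph osd df detail

    # BlueStore device layout and DB/WAL devices for one OSD (placeholder ID)
    ceph osd metadata 3

    # Pool-level stored vs. raw usage
    ceph df detail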

[ceph-users] Ghost usage on pool and unable to reclaim free space.

2020-04-18 Thread Kári Bertilsson
when I don't have that privilege. Best regards, Kári Bertilsson

[ceph-users] Re: multiple pgs down with all disks online

2019-11-05 Thread Kári Bertilsson
ube: https://goo.gl/PGE1Bx > > > On Sun, 3 Nov 2019 at 20:13, Kári Bertilsson < > karibert...@gmail.com> wrote: > >> pgs: 14.377% pgs not active >> 3749681/537818808 objects misplaced (0.697%) >> 810 active+clean >

[ceph-users] multiple pgs down with all disks online

2019-11-03 Thread Kári Bertilsson
pgs: 14.377% pgs not active
     3749681/537818808 objects misplaced (0.697%)
     810 active+clean
     156 down
     124 active+remapped+backfilling
       1 active+remapped+backfill_toofull
       1 down+inconsistent
when looking at the do
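
A minimal sketch for finding out why a PG stays down even though all disks are online; the PG ID is a placeholder.

    # Stuck or inactive PGs and how long they have been stuck
    ceph pg dump_stuck inactive

    # The recovery_state section of a PG query names the blocker, e.g.
    # "peering_blocked_by" lists the OSDs the PG is waiting for (placeholder ID)
    ceph pg 14.3b query | grep -A8 peering_blocked_by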

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-30 Thread Kári Bertilsson
wrote: > Hi Kári, > > what about this: > > health: HEALTH_WARN > 854 pgs not deep-scrubbed in time > > > maybe you should > $ ceph --cluster first pg scrub XX.YY > or > $ ceph --cluster first pg deep-scrub XX.YY > all the PGs. > > > T
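
A hedged sketch for issuing those scrubs across every PG named in the health warning, assuming `ceph health detail` lists them in its usual "pg X.Y not deep-scrubbed since ..." form.

    # Queue a deep scrub for each PG flagged as overdue
    ceph health detail | awk '/not deep-scrubbed since/ {print $2}' | \
        while read pg; do ceph pg deep-scrub "$pg"; done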

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-29 Thread Kári Bertilsson
I am encountering the dirlist hanging issue on multiple clients and none of them are Ubuntu. Debian buster running kernel 4.19.0-2-amd64: this one was working fine until after ceph was upgraded to nautilus. Proxmox running kernels 5.0.21-1-pve and 5.0.18-1-pve. On Tue, Oct 29, 2019 at 9:04 PM Nath

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-29 Thread Kári Bertilsson
PM Kári Bertilsson wrote: > I am noticing i have many entries in `ceph osd blacklist ls` and > dirlisting works again after i removed all entries. > What can cause this and is there any way to disable blacklisting ? > > On Tue, Oct 29, 2019 at 11:56 AM Kári Bertilsson > wr

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-29 Thread Kári Bertilsson
I am noticing I have many entries in `ceph osd blacklist ls` and dirlisting works again after I removed all entries. What can cause this, and is there any way to disable blacklisting? On Tue, Oct 29, 2019 at 11:56 AM Kári Bertilsson wrote: > The file system was created on luminous and
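
A minimal sketch of inspecting and clearing those entries on Nautilus (newer releases rename blacklist to blocklist); the client address and nonce are placeholders. Blacklisting is how the MDS fences clients it considers unresponsive, so clearing the list works around the symptom rather than the cause.

    # Show current entries and when they expire
    ceph osd blacklist ls

    # Remove one entry (address and nonce are placeholders)
    ceph osd blacklist rm 10.0.0.21:0/3271543210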

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-29 Thread Kári Bertilsson
The file system was created on luminous and the problems started after upgrading from luminous to nautilus. All CephFS configuration should be pretty much default, except that I enabled snapshots, which were disabled by default on luminous. On Tue, Oct 29, 2019 at 11:48 AM Kári Bertilsson wrote: >

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-29 Thread Kári Bertilsson
clients with older kernels (e.g. 4.15.0-47-generic) work without > interruption on the same CephFS. > > > Lars > > > Mon, 28 Oct 2019 22:10:25 + > Kári Bertilsson ==> Patrick Donnelly < > pdonn...@redhat.com> : > > Any ideas or tips on how to de

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-28 Thread Kári Bertilsson
Any ideas or tips on how to debug further? On Mon, Oct 28, 2019 at 7:17 PM Kári Bertilsson wrote: > Hello Patrick, > > Here is output from those commands > https://pastebin.com/yUmuQuYj > > 5 clients have the file system mounted, but only 2 of them have most of > the act

[ceph-users] Re: Dirlisting hangs with cephfs

2019-10-28 Thread Kári Bertilsson
Hello Patrick, here is the output from those commands (https://pastebin.com/yUmuQuYj). 5 clients have the file system mounted, but only 2 of them have most of the activity. On Mon, Oct 28, 2019 at 6:54 PM Patrick Donnelly wrote: > Hello Kári, > > On Mon, Oct 28, 2019 at 11:14 AM Kári B

[ceph-users] Dirlisting hangs with cephfs

2019-10-28 Thread Kári Bertilsson
This seems to happen mostly when listing folders containing 10k+ subfolders. The dirlisting hangs indefinitely, or until I restart the active MDS, at which point the hanging "ls" command finishes running. Every time, restarting the active MDS fixes the problem for a while.
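
A hedged sketch of the checks usually done before bouncing the MDS, plus the failover itself; the daemon name and rank are placeholders, and the daemon command must run on the MDS host.

    # Which MDS is active and how loaded it is
    ceph fs status

    # Requests currently stuck on the active MDS (run on its host; name is a placeholder)
    ceph daemon mds.ceph01 dump_ops_in_flight

    # Fail the active rank so a standby takes over (same effect as the restart described above)
    ceph mds fail 0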