[ceph-users] Re: Ceph FS not releasing space after file deletion

2019-09-03 Thread Guilherme
Dear CEPHers, adding some comments to my colleague's post: we are running Mimic 13.2.6 and struggling with two issues (which might be related): 1) After a "lack of space" event we tried to remove a 40 TB file. The file is not there anymore, but no space was released. No process is using the file
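For context (not part of the original message; "mds.a" is a placeholder for the local MDS daemon name): deleted CephFS files are moved into stray directories and purged asynchronously, so checking the stray counters on the MDS node shows whether deletion work is still pending:

    # ceph daemon mds.a perf dump mds_cache | grep stray

num_strays should fall toward zero as the purge queue drains; if it stays flat, the space is genuinely not being freed.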

[ceph-users] Docker & CEPH-CRASH

2021-09-14 Thread Guilherme Geronimo
Is there a special way to configure it? Should I create an external volume and run a single instance of it? Thanks! Guilherme Geronimo (aKa Arthur) docker-compose example:

services:
  osd.106:
    container_name: osd106
    image: ceph/daemon:latest-nautilus
    command: osd_dir
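A hedged sketch of what a dedicated ceph-crash service could look like in the same compose file (an assumption, not from the thread; the entrypoint override and volume paths would need checking against the ceph/daemon image in use):

services:
  ceph-crash:
    container_name: ceph-crash
    image: ceph/daemon:latest-nautilus
    entrypoint: ceph-crash                       # run the crash agent instead of a daemon
    volumes:
      - /etc/ceph:/etc/ceph                      # cluster config and keyring
      - /var/lib/ceph/crash:/var/lib/ceph/crash  # crash dumps written by local daemons

One instance per host should be enough, since ceph-crash simply watches /var/lib/ceph/crash and posts new dumps to the cluster.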

[ceph-users] Re: Docker & CEPH-CRASH

2021-09-15 Thread Guilherme Geronimo
with 'ceph orch apply -i crash-service.yml', where the yml file could look like this:

service_type: crash
service_name: crash
placement:
  host_pattern: '*'

Quoting Guilherme Geronimo: Hey Guys! I'm running my entire cluster (12 hosts / 89 OSDs - v15.2.22) on Docker and ev
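Not stated in the thread, but assuming a cephadm-managed cluster, the result can be checked after applying the spec:

    # ceph orch ls crash
    # ceph orch ps | grep crash

which should show one crash daemon per matched host.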

[ceph-users] Re: ceph cluster warning after adding disk to cluster

2019-09-04 Thread Guilherme Geronimo
Hey hey, first of all: a 10 Gbps connection. Then, some magic commands:

# ceph tell 'osd.*' injectargs '--osd-max-backfills 32'
# ceph tell 'osd.*' injectargs '--osd-recovery-max-active 12'
# ceph tell 'osd.*' injectargs '--osd-recovery-o
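Those values trade client I/O for recovery speed, so here is a sketch of reverting them once the backfill is done (assuming the stock defaults of 1 and 3; injected values do not survive an OSD restart in any case):

    # ceph tell 'osd.*' injectargs '--osd-max-backfills 1'
    # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 3'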

[ceph-users] Re: Slow peering caused by "wait for new map"

2019-09-04 Thread Guilherme Geronimo
Hey Bryan, I suppose all nodes are using jumbo frames (MTU 9000), right? I would suggest checking the OSD->MON communication. Can you send us the output of these commands?

* ceph -s
* ceph versions

[]'s Arthur (aKa Guilherme Geronimo) On 04/09/2019 14:18, Bryan Stillwell wrote:
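Not in the original message, but a quick way to confirm jumbo frames actually pass end to end (8972 = 9000-byte MTU minus 28 bytes of IP/ICMP headers; the MON address is a placeholder):

    # ping -M do -s 8972 <mon-host>

If this fails while a plain ping works, a switch or NIC along the path is dropping the large frames.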

[ceph-users] Re: Slow peering caused by "wait for new map"

2019-09-04 Thread Guilherme Geronimo
* Setting the NOIN and NOUP flags
* Taking the fragile OSDs out
* Restarting the "fragile" OSDs
* Checking their logs to confirm everything is OK
* Removing the NOUP flag
* Taking a coffee and waiting until all the data has drained

(the matching commands are sketched below) []'s Arthur (aKa Guilherme Geronimo) On 04/09/2019 15:32, Bryan Stil
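A sketch of the commands behind those steps (osd.42 and a systemd deployment are placeholder assumptions):

    # ceph osd set noin
    # ceph osd set noup
    # ceph osd out 42
    # systemctl restart ceph-osd@42
    # ceph osd unset noup
    # ceph osd unset noin

Then watch 'ceph -s' until the cluster reports HEALTH_OK again.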

[ceph-users] Re: Slow peering caused by "wait for new map"

2019-09-04 Thread Guilherme Geronimo
Btw: after the storm, I highly suggest you consider using jumbo frames. It works like a charm. []'s Arthur (aKa Guilherme Geronimo) On 04/09/2019 15:50, Guilherme Geronimo wrote: I see that you have many inactive PGs, probably because of the 6 OSDs that are OUT+DOWN. Problems with "flapp
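For reference (not from the thread; eth0 is a placeholder, and the MTU must be raised consistently on every node and switch port or connectivity will break):

    # ip link set dev eth0 mtu 9000

plus the distribution's persistent network configuration so the setting survives a reboot.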

[ceph-users] Re: ceph cluster warning after adding disk to cluster

2019-09-04 Thread Guilherme Geronimo
I don't think Mimic supports this attribute: https://docs.ceph.com/docs/mimic/rados/configuration/osd-config-ref/ []'s Arthur (aKa Guilherme Geronimo) On 04/09/2019 17:09, solarflow99 wrote: how about also increasing osd_recovery_threads? On Wed, Sep 4, 2019 at 10:47 AM Guilherm

[ceph-users] Re: Ceph FS not releasing space after file deletion

2019-09-16 Thread Guilherme Geronimo
Thank you, Yan. It took about 10 minutes to execute scan_links. I believe the number of lost+found entries decreased by 60%, but the rest of them are still causing the MDS crash. Any other suggestion? =D []'s Arthur (aKa Guilherme Geronimo) On 10/09/2019 23:51, Yan, Zheng wrote: On Wed,
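For context (an assumption about how the tool was invoked; the exact commands are not shown in the thread): scan_links must run against an offline filesystem, so the usual pattern is roughly:

    # ceph fs set <fs_name> down true     (on Mimic: 'cluster_down true' plus failing the MDS)
    # cephfs-data-scan scan_links
    # ceph fs set <fs_name> down false

with <fs_name> as reported by 'ceph fs ls'.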

[ceph-users] Re: Ceph FS not releasing space after file deletion

2019-09-19 Thread Guilherme Geronimo
Here it is: https://pastebin.com/SAsqnWDi The command:

timeout 10 rm /mnt/ceph/lost+found/12430c8 ; umount -f /mnt/ceph

On 17/09/2019 00:51, Yan, Zheng wrote: please send me the crash log On Tue, Sep 17, 2019 at 12:56 AM Guilherme Geronimo wrote: Thank you, Yan. It took about 10 minutes

[ceph-users] Re: Ceph FS not releasing space after file deletion

2019-09-19 Thread Guilherme Geronimo
Btw:

root@deployer:~# cephfs-data-scan -v
ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)

On 19/09/2019 13:38, Guilherme Geronimo wrote: Here it is: https://pastebin.com/SAsqnWDi The command: timeout 10 rm /mnt/ceph/lost+found/12430c8 ; umount -f /mnt

[ceph-users] Re: Ceph FS not releasing space after file deletion

2019-09-22 Thread Guilherme Geronimo
Wow, something is seriously wrong: if I turn on "debug mds = 10" (in the [mds] section) and restart the MDS, it instantly crashes! If I comment it out, everything is fine. https://pastebin.com/GPcKRmR9 On 19/09/2019 22:34, Yan, Zheng wrote: On Fri, Sep 20, 2019 at 12:38 AM
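The ceph.conf fragment in question (reconstructed; the thread only quotes the option itself):

    [mds]
    debug mds = 10

Raising the debug level should only change log verbosity, so a crash that toggles with this line points at a bug in a logging path rather than at the setting itself.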

[ceph-users] MDS Crashes on “ceph fs volume v011”

2019-10-15 Thread Guilherme Geronimo
Dear ceph users, we're experiencing a segfault during MDS startup (the replay process), which is making our FS inaccessible. MDS log message:

Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 192.168.8.209:6821/2419345
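Not suggested in the thread itself, but since the crash happens during journal replay, a common first diagnostic is to inspect the MDS journal (back it up first; on Nautilus and later the rank must be given explicitly):

    # cephfs-journal-tool journal inspect
    # cephfs-journal-tool --rank=<fs_name>:0 journal inspect   (Nautilus and later)

which reports whether the journal contains corrupt or missing ranges.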

[ceph-users] scan_links crashing

2020-03-01 Thread Guilherme Geronimo
Hey guys, I'm trying to solve some lost+found errors, but when I try to run the "scan_links" command, it crashes. Any tip? Cheers!

ceph cluster version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
cephfs-data-scan version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8
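Worth noting (an observation, not a fix from the thread): the tool here is a Nautilus 14.2.7 binary run against a Mimic 13.2.6 cluster. Comparing every version in play is a cheap first check:

    # ceph versions
    # cephfs-data-scan -v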

[ceph-users] Docs@RSS

2020-03-20 Thread Guilherme Geronimo
Hey guys, do you know who takes care of docs.ceph.com? I mean the tool, not the content. I would suggest enabling its feed (RSS/Atom). That would make a lot of things easier... -- []'s Arthur (aKa Guilherme Geronimo)