[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-16 Thread Janne Johansson
On Mon, May 17, 2021 at 08:15 Szabo, Istvan (Agoda) < istvan.sz...@agoda.com> wrote: > What happens if we are using buffered_io and the machine restarted due to > some power failure? Everything that was in the cache will be lost, or how > does Ceph handle this? > Not to be picky, but between any client w
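
For reference, a minimal sketch of inspecting and changing the option with the ceph CLI (osd.0 is just an example daemon; whether a restart is needed to pick up the change depends on the release):

    # Current value, both in the config database and on a live OSD
    ceph config get osd bluefs_buffered_io
    ceph tell osd.0 config get bluefs_buffered_io

    # Turn it on for all OSDs
    ceph config set osd bluefs_buffered_io true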

[ceph-users] Re: RBD as a boot image [was: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping]

2021-05-16 Thread Kees Meijs | Nefos
Hi, This is a chicken-and-egg problem, I guess. The boot process (whether UEFI or BIOS; given x86) should be able to load boot loader code, a Linux kernel and an initial RAM disk (although in some cases a kernel alone could be enough). So yes: use PXE to load a Linux kernel and RAM disk. The RAM
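
A hedged sketch of the PXE side with iPXE; the URLs and kernel arguments are placeholders, and the actual RBD/CephFS root handling is left to the scripts inside the initramfs:

    #!ipxe
    # Fetch a kernel plus an initramfs that bundles the ceph/rbd modules and mount scripts
    kernel http://boot.example.com/vmlinuz ip=dhcp
    initrd http://boot.example.com/initrd.img
    boot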

[ceph-users] Octopus MDS hang under heavy setfattr load

2021-05-16 Thread Nigel Williams
One of my colleagues attempted to set quotas on a large number (some dozens) of users with the session below, but it caused the MDS to hang and reject client requests. Offending command was: cat recent-users | xargs -P16 -I% setfattr -n ceph.quota.max_bytes -v 8796093022208 /scratch/% Result was
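
For comparison, setting and reading back a single quota looks like the first two lines below (/scratch/someuser is a placeholder); the serial loop is just a more conservative way to run the same bulk job, since whether the -P16 parallelism itself triggered the hang is not established here:

    # 8796093022208 bytes = 8 TiB, as in the original command
    setfattr -n ceph.quota.max_bytes -v 8796093022208 /scratch/someuser
    getfattr -n ceph.quota.max_bytes /scratch/someuser

    # Gentler bulk variant: one directory at a time, with a short pause
    while read -r user; do
        setfattr -n ceph.quota.max_bytes -v 8796093022208 "/scratch/$user"
        sleep 0.2
    done < recent-users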

[ceph-users] Re: RBD as a boot image [was: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping]

2021-05-16 Thread Markus Kienast
Hi Nico, we are already doing exactly that: loading an initrd via iPXE which contains the necessary modules and scripts to boot from an RBD boot device. Works just fine. And Ilya just helped to work out the last showstopper, thanks again for that! We are using a modified LTSP system for this. We have pr
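
Roughly what the mapping step inside such an initramfs can look like; the pool, image, client name and keyring path are assumptions, and it presumes the rbd CLI is bundled into the initrd:

    # Bring up networking first, then:
    modprobe rbd
    rbd map ltsp/client-root --id ltsp --keyring /etc/ceph/ceph.client.ltsp.keyring
    mount -o ro /dev/rbd0 /root    # or whichever device "rbd map" printed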

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-16 Thread Dan van der Ster
Hi Bryan, I had to do something similar, and never found a rule to place "up to" 2 chunks per host, so I stayed with the placement of *exactly* 2 chunks per host. But I did this slightly differently from what you wrote earlier: my rule chooses exactly 4 hosts, then chooses exactly 2 OSDs on each:
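
A sketch of that "exactly 4 hosts, 2 OSDs each" approach in crushmap syntax (rule name, id and the "default" root are placeholders; compile and inject with crushtool / ceph osd setcrushmap as usual):

    rule ec62_4x2 {
        id 99
        type erasure
        min_size 8
        max_size 8
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 4 type host
        step chooseleaf indep 2 type osd
        step emit
    }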

[ceph-users] RBD as a boot image [was: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping]

2021-05-16 Thread Nico Schottelius
Hey Markus, Ilya, you don't know with how much interest I am following this thread, because ... >> Generally it would be great if you could include the proper initrd code for >> RBD and CephFS root filesystems in the Ceph project. You can happily use my >> code as a starting point. >> >> http

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Markus Kienast
On Sun., May 16, 2021 at 9:36 PM Ilya Dryomov wrote: > On Sun, May 16, 2021 at 8:06 PM Markus Kienast wrote: > > > > On Sun., May 16, 2021 at 7:38 PM Ilya Dryomov < > idryo...@gmail.com> wrote: > >> > >> On Sun, May 16, 2021 at 4:18 PM Markus Kienast > wrote: > >> > > >> > On Sun., May 16

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Ilya Dryomov
On Sun, May 16, 2021 at 8:06 PM Markus Kienast wrote: > > On Sun., May 16, 2021 at 7:38 PM Ilya Dryomov wrote: >> >> On Sun, May 16, 2021 at 4:18 PM Markus Kienast wrote: >> > >> > On Sun., May 16, 2021 at 3:36 PM Ilya Dryomov >> > wrote: >> >> >> >> On Sun, May 16, 2021 at 12:54 PM Mark

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Markus Kienast
On Sun., May 16, 2021 at 7:38 PM Ilya Dryomov wrote: > On Sun, May 16, 2021 at 4:18 PM Markus Kienast wrote: > > > > On Sun., May 16, 2021 at 3:36 PM Ilya Dryomov < > idryo...@gmail.com> wrote: > >> > >> On Sun, May 16, 2021 at 12:54 PM Markus Kienast > wrote: > >> > > >> > Hi Ilya, > >>

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Ilya Dryomov
On Sun, May 16, 2021 at 4:18 PM Markus Kienast wrote: > > On Sun., May 16, 2021 at 3:36 PM Ilya Dryomov wrote: >> >> On Sun, May 16, 2021 at 12:54 PM Markus Kienast wrote: >> > >> > Hi Ilya, >> > >> > unfortunately I cannot find any "missing primary copy of ..." error in >> > the logs of my

[ceph-users] Re: v16.2.4 Pacific released

2021-05-16 Thread Wladimir Mutel
Apparently this release has fixed a race condition where an OSD daemon was started before the corresponding OSD directory had been created under /var/lib/ceph/osd/. At least my experimental setup of 1 host with 8 SATA HDDs and 1 NVMe is showing signs of life aga
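
For anyone checking the same thing after upgrading, a quick sanity check could look like this (nothing here is specific to 16.2.4):

    ceph versions          # do all daemons report 16.2.4?
    ceph osd stat          # are all OSDs up and in?
    ls /var/lib/ceph/osd/  # the per-OSD directories the race was about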

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Markus Kienast
On Sun., May 16, 2021 at 3:36 PM Ilya Dryomov wrote: > On Sun, May 16, 2021 at 12:54 PM Markus Kienast > wrote: > > > > Hi Ilya, > > > > unfortunately I cannot find any "missing primary copy of ..." error in > the logs of my 3 OSDs. > > The NVMe disks are also brand new and there is not much

[ceph-users] CephFS Snaptrim stuck?

2021-05-16 Thread Andras Sali
Dear Ceph Users, We are experiencing a strange behaviour on Ceph v15.2.9 where a set of PGs seems to be stuck in the active+clean+snaptrim state (for almost a day now). Usually snaptrim is quite fast (done in a few minutes); however, now in the OSD logs we see slowly increasing trimq numbers, with s
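
A few read-only commands that may help narrow this down; the two config options are the usual knobs that pace snap trimming (option names are standard, defaults vary per release):

    # Which PGs are currently in a snaptrim / snaptrim_wait state?
    ceph pg dump pgs_brief 2>/dev/null | awk '$2 ~ /snaptrim/'

    # Current trim pacing settings
    ceph config get osd osd_snap_trim_sleep
    ceph config get osd osd_pg_max_concurrent_snap_trims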

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Ilya Dryomov
On Sun, May 16, 2021 at 12:54 PM Markus Kienast wrote: > > Hi Ilya, > > unfortunately I cannot find any "missing primary copy of ..." error in the > logs of my 3 OSDs. > The NVMe disks are also brand new and there is not much traffic on them. > > The only error keywords I find are those two messa

[ceph-users] dedicated metadata servers

2021-05-16 Thread mabi
Hello, On my small six-node Octopus cluster I have only 8 GB per node available, and I am co-locating the active and standby MDS on two out of the three OSD nodes. But because memory is tight (8 GB per node max, limited by hardware constraints) I was thinking of adding two new 8 GB nodes dedicated onl
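
Two knobs that usually matter in this situation, sketched with placeholder values and hostnames (the orch command assumes a cephadm-managed cluster and a filesystem named cephfs):

    # Cap the MDS cache so it fits comfortably inside an 8 GB node (2 GiB here is illustrative)
    ceph config set mds mds_cache_memory_limit 2147483648

    # Pin the two MDS daemons to the dedicated hosts
    ceph orch apply mds cephfs --placement="2 mdshost1 mdshost2"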

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Markus Kienast
Hi Ilya, unfortunately I cannot find any "missing primary copy of ..." error in the logs of my 3 OSDs. The NVMe disks are also brand new and there is not much traffic on them. The only error keywords I find are the two messages in the osd.0 and osd.1 logs shown below. BTW the error posted before a
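
In case it helps others searching for the same thing, this is roughly how the OSD and kernel logs can be grepped (paths assume the default log location and typical logrotate naming on the OSD hosts):

    # Current and rotated OSD logs
    grep  -i "missing primary copy" /var/log/ceph/ceph-osd.*.log
    zgrep -i "missing primary copy" /var/log/ceph/ceph-osd.*.log.*.gz

    # The client-side "get_reply ... skipping" message lands in the kernel log
    dmesg | grep libceph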