[ceph-users] Re: Ceph bucket notification events stop working
Hello Yuval,

Thanks for your reply! We kept digging into the problem and found that it was caused by a recent change in our infrastructure: load-balancer pods had been added in front of the rgw ones, and those were logging an SSL error. As we weren't immediately aware of that change, we hadn't been checking the logs of those pods. We have fixed it and notifications are working again.

Thanks,
Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: ceph-volume lvm new-db fails
On 11/05/2022 23:21, Joost Nieuwenhuijse wrote:
> After a reboot the OSD turned out to be corrupt. Not sure if ceph-volume
> lvm new-db caused the problem, or failed because of another problem.

I just ran into the same issue trying to add a DB to an existing OSD. Apparently this is a known bug: https://tracker.ceph.com/issues/55260

It's already fixed in master, but the backports are all still pending ...

Regards,
Christian
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
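For readers landing here later: the intended invocation, per the ceph-volume documentation, looks roughly like the sketch below. The OSD id, the $OSD_FSID variable and the db-vg/db-lv volume group/logical volume names are placeholders, the OSD has to be stopped first, and given the tracker issue above it may be safer to wait for the backports or rehearse on a disposable OSD.

# stop the OSD before modifying its devices (plain systemd host; cephadm-managed OSDs differ)
systemctl stop ceph-osd@5
# attach a new, empty DB logical volume to the existing OSD
ceph-volume lvm new-db --osd-id 5 --osd-fsid "$OSD_FSID" --target db-vg/db-lv
# bring the OSD back up
systemctl start ceph-osd@5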
[ceph-users] librbd 4k read/write?
Good afternoon everybody!

I have the following scenario:
Pool RBD replication x3
5 hosts with 12 SAS spinning disks each

I'm using exactly the following line with FIO to test:
fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G -iodepth=16 -rw=write -filename=./test.img

If I increase the blocksize I can easily reach 1.5 GBps or more.

But when I use blocksize in 4K I get a measly 12 Megabytes per second, which is quite annoying. I achieve the same rate if rw=read.

If I use librbd's cache I get a considerable improvement in writing, but reading remains the same.

I already tested with rbd_read_from_replica_policy=balance but I didn't notice any difference. I tried to leave readahead enabled by setting rbd_readahead_disable_after_bytes=0 but I didn't see any difference in sequential reading either.

Note: I tested it on another smaller cluster, with 36 SAS disks and got the same result.

I don't know exactly what to look for or configure to have any improvement.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
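To take the client filesystem and any VM layer out of the picture, the same 4K test can be pointed straight at an RBD image with fio's rbd engine. A rough sketch, assuming fio was built with rbd support, a client.admin keyring on the test host, and placeholder pool/image names:

# create a scratch image and fill it so reads hit real data rather than sparse zeroes
rbd create rbd/fio-test --size 10G
fio --name=prefill --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio-test --rw=write --bs=4M --iodepth=16
# 4K random write and read, straight through librbd
fio --name=rbd-4k-write --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio-test --rw=randwrite --bs=4k --iodepth=16 --runtime=60 --time_based
fio --name=rbd-4k-read --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio-test --rw=randread --bs=4k --iodepth=16 --runtime=60 --time_based
# clean up the scratch image afterwards
rbd rm rbd/fio-test

If these numbers match what the client sees, the bottleneck is the cluster itself rather than the client stack.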
[ceph-users] Re: librbd 4k read/write?
On Thu, Aug 10, 2023, 17:36 Murilo Morais wrote:

> Good afternoon everybody!
>
> I have the following scenario:
> Pool RBD replication x3
> 5 hosts with 12 SAS spinning disks each
>
> I'm using exactly the following line with FIO to test:
> fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> -iodepth=16 -rw=write -filename=./test.img
>
> If I increase the blocksize I can easily reach 1.5 GBps or more.
>
> But when I use blocksize in 4K I get a measly 12 Megabytes per second,

This is 3000iops. I would call that bad for 60 drives and a replication of 3. Which amount of iops did you expect?

> which is quite annoying. I achieve the same rate if rw=read.
>
> If I use librbd's cache I get a considerable improvement in writing, but
> reading remains the same.
>
> I already tested with rbd_read_from_replica_policy=balance but I didn't
> notice any difference. I tried to leave readahead enabled by setting
> rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> sequential reading either.
>
> Note: I tested it on another smaller cluster, with 36 SAS disks and got the
> same result.

This I concur is a weird result compared to 60 disks. Are you using the same disks and all other parameters the same, like the replication factor? Is the performance really the same? Maybe the 5 host cluster is not saturated by your current fio test. Try running 2 or 4 in parallel.

> I don't know exactly what to look for or configure to have any improvement.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
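Hans's "run 2 or 4 in parallel" suggestion can be done in a single fio invocation with numjobs. A sketch with assumed sizes, run from the same client as the original test; each job writes its own scratch file in the current directory and the results are aggregated:

# four concurrent 4K jobs, aggregate IOPS reported once
fio --name=par4k --ioengine=libaio --direct=1 --invalidate=1 --bs=4k --size=2G --iodepth=16 --rw=randwrite --numjobs=4 --group_reporting

If the aggregate scales well beyond the single-job number, the limit was the single client queue rather than the cluster.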
[ceph-users] Re: librbd 4k read/write?
> I have the following scenario:
> Pool RBD replication x3
> 5 hosts with 12 SAS spinning disks each
>
> I'm using exactly the following line with FIO to test:
> fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> -iodepth=16 -rw=write -filename=./test.img
>
> If I increase the blocksize I can easily reach 1.5 GBps or more.
>
> But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> which is quite annoying. I achieve the same rate if rw=read.
>
> If I use librbd's cache I get a considerable improvement in writing, but
> reading remains the same.
>
> I already tested with rbd_read_from_replica_policy=balance but I didn't
> notice any difference. I tried to leave readahead enabled by setting
> rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> sequential reading either.
>
> Note: I tested it on another smaller cluster, with 36 SAS disks and got the
> same result.
>
> I don't know exactly what to look for or configure to have any improvement.

What are you expecting?

This is what I have on a vm with an rbd from a hdd pool

[@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -size=1G -iodepth=16 -rw=write -filename=./test.img
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=57.5MiB/s][r=0,w=14.7k IOPS][eta 00m:00s]

[@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -size=1G -iodepth=1 -rw=write -filename=./test.img
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=19.9MiB/s][r=0,w=5090 IOPS][eta 00m:00s]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: librbd 4k read/write?
> > Good afternoon everybody!
> >
> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
> >
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
> >
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
>
> This is 3000iops. I would call that bad for 60 drives and a replication of
> 3. Which amount of iops did you expect?

How is this related to 60 drives? His test is only on 3 drives at a time not?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: librbd 4k read/write?
>
> Good afternoon everybody!
>
> I have the following scenario:
> Pool RBD replication x3
> 5 hosts with 12 SAS spinning disks each

Old hardware? SAS is mostly dead.

> I'm using exactly the following line with FIO to test:
> fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> -iodepth=16 -rw=write -filename=./test.img

On what kind of client?

> If I increase the blocksize I can easily reach 1.5 GBps or more.
>
> But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> which is quite annoying. I achieve the same rate if rw=read.

If your client is VM especially, check if you have IOPS throttling. With small block sizes you'll throttle IOPS long before bandwidth.

> Note: I tested it on another smaller cluster, with 36 SAS disks and got the
> same result.

SAS has a price premium over SATA, and still requires an HBA. Many chassis vendors really want you to buy an anachronistic RoC HBA. Eschewing SAS and the HBA helps close the gap to justify SSDs, the TCO just doesn't favor spinners.

> Maybe the 5 host cluster is not
> saturated by your current fio test. Try running 2 or 4 in parallel.

Agreed that Ceph is a scale out solution, not DAS, but note the difference reported with a larger block size.

> How is this related to 60 drives? His test is only on 3 drives at a time not?

RBD volumes by and large will live on most or all OSDs in the pool.

> I don't know exactly what to look for or configure to have any improvement.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
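If the fio client is a libvirt/QEMU guest, the IOPS throttling Anthony mentions is quick to check; the domain and disk target names below are placeholders:

# print any per-device limits on the guest disk (all zeroes means no throttle is set)
virsh blkdeviotune myguest vda
# or look for <iotune> blocks in the domain definition
virsh dumpxml myguest | grep -A10 '<iotune>'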
[ceph-users] Re: librbd 4k read/write?
Hi,

You can use the following formula to roughly calculate the IOPS you can get from a cluster: (Drive_IOPS * Number_of_Drives * 0.75) / Cluster_Size.

For example, for 60 10K rpm SAS drives each capable of 200 4K IOPS and a replicated pool with size 3: (~200 * 60 * 0.75) / 3 = ~3000 IOPS with block size = 4K.

That's what the OP is getting, give or take.

/Z

On Thu, 10 Aug 2023 at 20:20, Anthony D'Atri wrote:

> > Good afternoon everybody!
> >
> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
>
> Old hardware? SAS is mostly dead.
>
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
>
> On what kind of client?
>
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> > which is quite annoying. I achieve the same rate if rw=read.
>
> If your client is VM especially, check if you have IOPS throttling. With
> small block sizes you'll throttle IOPS long before bandwidth.
>
> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
> > the same result.
>
> SAS has a price premium over SATA, and still requires an HBA. Many
> chassis vendors really want you to buy an anachronistic RoC HBA.
>
> Eschewing SAS and the HBA helps close the gap to justify SSDs, the TCO
> just doesn't favor spinners.
>
> > Maybe the 5 host cluster is not
> > saturated by your current fio test. Try running 2 or 4 in parallel.
>
> Agreed that Ceph is a scale out solution, not DAS, but note the difference
> reported with a larger block size.
>
> > How is this related to 60 drives? His test is only on 3 drives at a time
> > not?
>
> RBD volumes by and large will live on most or all OSDs in the pool.
>
> > I don't know exactly what to look for or configure to have any
> > improvement.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
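Rather than assuming ~200 IOPS per drive, the per-drive figure in Zakhar's formula can be measured directly with fio against a spare disk. A read-only sketch; /dev/sdX is a placeholder and should not be a disk backing a live OSD:

# 4K random reads against the raw device; --readonly guards against accidental writes
fio --name=drive4k --filename=/dev/sdX --ioengine=libaio --direct=1 --bs=4k --rw=randread --iodepth=16 --runtime=30 --time_based --readonly

Multiply the measured IOPS by the drive count and the 0.75 factor, then divide by the replica size, as in the formula above.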
[ceph-users] Re: librbd 4k read/write?
On Thu, 10 Aug 2023 at 12:47, Hans van den Bogert <hansbog...@gmail.com> wrote:

> On Thu, Aug 10, 2023, 17:36 Murilo Morais wrote:
>
> > Good afternoon everybody!
> >
> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
> >
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
> >
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
>
> This is 3000iops. I would call that bad for 60 drives and a replication of
> 3. Which amount of iops did you expect?
>
> > which is quite annoying. I achieve the same rate if rw=read.
> >
> > If I use librbd's cache I get a considerable improvement in writing, but
> > reading remains the same.
> >
> > I already tested with rbd_read_from_replica_policy=balance but I didn't
> > notice any difference. I tried to leave readahead enabled by setting
> > rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> > sequential reading either.
> >
> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
> > the same result.
>
> This I concur is a weird result compared to 60 disks. Are you using the
> same disks and all other parameters the same, like the replication factor?
> Is the performance really the same? Maybe the 5 host cluster is not
> saturated by your current fio test. Try running 2 or 4 in parallel.

Yes and yes. I will try running a few in parallel and compare the results.

> > I don't know exactly what to look for or configure to have any
> > improvement.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: librbd 4k read/write?
On Thu, 10 Aug 2023 at 13:01, Marc wrote:

> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
> >
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
> >
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> > which is quite annoying. I achieve the same rate if rw=read.
> >
> > If I use librbd's cache I get a considerable improvement in writing, but
> > reading remains the same.
> >
> > I already tested with rbd_read_from_replica_policy=balance but I didn't
> > notice any difference. I tried to leave readahead enabled by setting
> > rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> > sequential reading either.
> >
> > Note: I tested it on another smaller cluster, with 36 SAS disks and got the
> > same result.
> >
> > I don't know exactly what to look for or configure to have any improvement.
>
> What are you expecting?

I expected something a little better (at least in reading), since the other cluster, with fewer disks, is showing the same rates. :(

> This is what I have on a vm with an rbd from a hdd pool

I'm using exactly this in libvirt.

> [@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k
> -size=1G -iodepth=16 -rw=write -filename=./test.img
> test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=libaio, iodepth=16
> fio-3.7
> Starting 1 process
> Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=57.5MiB/s][r=0,w=14.7k IOPS][eta
> 00m:00s]

With writeback I get a constant 100 MB/s, which is pretty good. I can live with writeback.

> [@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k
> -size=1G -iodepth=1 -rw=write -filename=./test.img
> test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.7
> Starting 1 process
> Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=19.9MiB/s][r=0,w=5090 IOPS][eta
> 00m:00s]

Thanks for showing your results, it's something I can compare to.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
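For reference, the writeback behaviour discussed here is controlled on the client side by the librbd cache options together with the cache mode of the guest disk. A sketch only, with the option scope as an assumption; writeback is safe as long as the guest issues flushes, which modern virtio drivers do:

# librbd client cache settings (equivalent options can live in the [client] section of ceph.conf)
ceph config set client rbd_cache true
ceph config set client rbd_cache_policy writeback
# in the libvirt guest definition, set cache='writeback' on the RBD disk's <driver> element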
[ceph-users] Re: librbd 4k read/write?
It makes sense.

On Thu, 10 Aug 2023 at 16:04, Zakhar Kirpichenko wrote:

> Hi,
>
> You can use the following formula to roughly calculate the IOPS you can
> get from a cluster: (Drive_IOPS * Number_of_Drives * 0.75) / Cluster_Size.
>
> For example, for 60 10K rpm SAS drives each capable of 200 4K IOPS and a
> replicated pool with size 3: (~200 * 60 * 0.75) / 3 = ~3000 IOPS with block
> size = 4K.
>
> That's what the OP is getting, give or take.
>
> /Z
>
> On Thu, 10 Aug 2023 at 20:20, Anthony D'Atri wrote:
>
>> > Good afternoon everybody!
>> >
>> > I have the following scenario:
>> > Pool RBD replication x3
>> > 5 hosts with 12 SAS spinning disks each
>>
>> Old hardware? SAS is mostly dead.
>>
>> > I'm using exactly the following line with FIO to test:
>> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
>> > -iodepth=16 -rw=write -filename=./test.img
>>
>> On what kind of client?
>>
>> > If I increase the blocksize I can easily reach 1.5 GBps or more.
>> >
>> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
>> > which is quite annoying. I achieve the same rate if rw=read.
>>
>> If your client is VM especially, check if you have IOPS throttling. With
>> small block sizes you'll throttle IOPS long before bandwidth.
>>
>> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
>> > the same result.
>>
>> SAS has a price premium over SATA, and still requires an HBA. Many
>> chassis vendors really want you to buy an anachronistic RoC HBA.
>>
>> Eschewing SAS and the HBA helps close the gap to justify SSDs, the TCO
>> just doesn't favor spinners.
>>
>> > Maybe the 5 host cluster is not
>> > saturated by your current fio test. Try running 2 or 4 in parallel.
>>
>> Agreed that Ceph is a scale out solution, not DAS, but note the
>> difference reported with a larger block size.
>>
>> > How is this related to 60 drives? His test is only on 3 drives at a time
>> > not?
>>
>> RBD volumes by and large will live on most or all OSDs in the pool.
>>
>> > I don't know exactly what to look for or configure to have any
>> > improvement.
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io