[ceph-users] Re: Ceph bucket notification events stop working

2023-08-10 Thread daniel . yordanov1
Hello Yuval, 

Thanks for your reply!
We continued digging into the problem and found out that it was caused by a 
recent change in our infrastructure. 
Load balancer pods had been added in front of the RGW ones, and those were 
logging an SSL error. 
As we weren't immediately aware of that change, we hadn't been checking the 
logs of those pods. 
We have fixed it and it works now. 

Thanks, 
Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-volume lvm new-db fails

2023-08-10 Thread Christian Rohmann



On 11/05/2022 23:21, Joost Nieuwenhuijse wrote:
After a reboot the OSD turned out to be corrupt. Not sure if 
ceph-volume lvm new-db caused the problem, or failed because of 
another problem.



I just ran into the same issue trying to add a db to an existing OSD.
Apparently this is a known bug: https://tracker.ceph.com/issues/55260

It's already fixed in master, but the backports are all still pending ...
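
For anyone finding this thread later, the command in question was along these 
lines (a sketch; the target VG/LV is a placeholder for wherever your new DB 
volume lives):

ceph-volume lvm new-db --osd-id <osd-id> --osd-fsid <osd-fsid> --target <vg-name>/<new-db-lv>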



Regards

Christian
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] librbd 4k read/write?

2023-08-10 Thread Murilo Morais
Good afternoon everybody!

I have the following scenario:
Pool RBD replication x3
5 hosts with 12 SAS spinning disks each

I'm using exactly the following line with FIO to test:
fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
-iodepth=16 -rw=write -filename=./test.img

If I increase the block size I can easily reach 1.5 GB/s or more.

But with a 4K block size I get a measly 12 megabytes per second, which is
quite annoying. I get the same rate with rw=read.

If I use librbd's cache I get a considerable improvement in writing, but
reading remains the same.

I have already tested with rbd_read_from_replica_policy=balance but didn't
notice any difference. I also tried keeping readahead enabled by setting
rbd_readahead_disable_after_bytes=0, but didn't see any difference in
sequential reads either.
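
For completeness, this is roughly what I have on the client side, as a sketch
of the [client] section in ceph.conf (the cache options correspond to the
librbd cache test mentioned above; the values are just what I tried):

[client]
rbd_cache = true
rbd_cache_policy = writeback
rbd_read_from_replica_policy = balance
rbd_readahead_disable_after_bytes = 0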

Note: I tested on another, smaller cluster with 36 SAS disks and got the
same result.

I don't know exactly what to look for or what to configure to get any improvement.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Hans van den Bogert
On Thu, Aug 10, 2023, 17:36 Murilo Morais  wrote:

> Good afternoon everybody!
>
> I have the following scenario:
> Pool RBD replication x3
> 5 hosts with 12 SAS spinning disks each
>
> I'm using exactly the following line with FIO to test:
> fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> -iodepth=16 -rw=write -filename=./test.img
>
> If I increase the blocksize I can easily reach 1.5 GBps or more.
>
> But when I use blocksize in 4K I get a measly 12 Megabytes per second,
>
That is about 3,000 IOPS (12 MiB/s divided by 4 KiB per I/O). I wouldn't call
that bad for 60 drives and a replication factor of 3. How many IOPS did you
expect?

which is quite annoying. I achieve the same rate if rw=read.
>
> If I use librbd's cache I get a considerable improvement in writing, but
> reading remains the same.
>
> I already tested with rbd_read_from_replica_policy=balance but I didn't
> notice any difference. I tried to leave readahead enabled by setting
> rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> sequential reading either.
>
> Note: I tested it on another smaller cluster, with 36 SAS disks and got the
> same result.
>
I agree this is a strange result compared to the 60-disk cluster. Are you using
the same disks, and are all the other parameters the same, such as the
replication factor? Is the performance really identical? Maybe the 5-host
cluster just isn't saturated by your current fio test. Try running 2 or 4 jobs
in parallel, for example as sketched below.
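
For example, something along these lines (the file name is just a placeholder)
would run four writers at once; -numjobs=4 starts four copies of the job and
-group_reporting aggregates their results:

fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -size=10G -iodepth=16 -rw=write -numjobs=4 -group_reporting -filename=./test.img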

>
> I don't know exactly what to look for or configure to have any improvement.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Marc
> I have the following scenario:
> Pool RBD replication x3
> 5 hosts with 12 SAS spinning disks each
> 
> I'm using exactly the following line with FIO to test:
> fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> -iodepth=16 -rw=write -filename=./test.img
> 
> If I increase the blocksize I can easily reach 1.5 GBps or more.
> 
> But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> which is quite annoying. I achieve the same rate if rw=read.
> 
> If I use librbd's cache I get a considerable improvement in writing, but
> reading remains the same.
> 
> I already tested with rbd_read_from_replica_policy=balance but I didn't
> notice any difference. I tried to leave readahead enabled by setting
> rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> sequential reading either.
> 
> Note: I tested it on another smaller cluster, with 36 SAS disks and got the
> same result.
> 
> I don't know exactly what to look for or configure to have any improvement.

What are you expecting?

This is what I get on a VM with an RBD image from an HDD pool:




[@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -size=1G 
-iodepth=16 -rw=write -filename=./test.img
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, 
ioengine=libaio, iodepth=16
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=57.5MiB/s][r=0,w=14.7k IOPS][eta 
00m:00s]


[@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -size=1G 
-iodepth=1 -rw=write -filename=./test.img
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, 
ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=19.9MiB/s][r=0,w=5090 IOPS][eta 
00m:00s]



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Marc
> > Good afternoon everybody!
> >
> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
> >
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
> >
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> >
> This is 3000iops. I would call that bad for 60 drives and a replication of
> 3. Which amount of iops did you expect?
> 

How is this related to 60 drives? His test only touches 3 drives at a time, doesn't it?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Anthony D'Atri



> 
> Good afternoon everybody!
> 
> I have the following scenario:
> Pool RBD replication x3
> 5 hosts with 12 SAS spinning disks each

Old hardware?  SAS is mostly dead.

> I'm using exactly the following line with FIO to test:
> fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> -iodepth=16 -rw=write -filename=./test.img

On what kind of client?  

> If I increase the blocksize I can easily reach 1.5 GBps or more.
> 
> But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> which is quite annoying. I achieve the same rate if rw=read.

If your client is a VM especially, check whether you have IOPS throttling. With
small block sizes you'll hit the IOPS limit long before the bandwidth limit.
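
For a libvirt/QEMU guest, for example, something like this will show any
per-device limits (domain and device names are placeholders):

virsh blkdeviotune <domain> vda
virsh dumpxml <domain> | grep -A 5 '<iotune>'

Non-zero total_iops_sec / read_iops_sec / write_iops_sec values there mean the
hypervisor is capping the guest's IOPS no matter what the cluster can deliver.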

> Note: I tested it on another smaller cluster, with 36 SAS disks and got the
> same result.

SAS has a price premium over SATA, and still requires an HBA.  Many chassis 
vendors really want you to buy an anachronistic RoC HBA.

Eschewing SAS and the HBA helps close the price gap enough to justify SSDs; the 
TCO just doesn't favor spinners.

> Maybe the 5 host cluster is not
> saturated by your current fio test. Try running 2 or 4 in parallel.


Agreed that Ceph is a scale-out solution, not DAS, but note the difference 
reported with a larger block size.

>How is this related to 60 drives? His test is only on 3 drives at a time not? 

RBD volumes by and large will live on most or all OSDs in the pool.




> 
> I don't know exactly what to look for or configure to have any improvement.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Zakhar Kirpichenko
Hi,

You can use the following formula to roughly calculate the IOPS you can get
from a cluster: (Drive_IOPS * Number_of_Drives * 0.75) / Cluster_Size.

For example, for 60 10K rpm SAS drives each capable of 200 4K IOPS and a
replicated pool with size 3: (~200 * 60 * 0.75) / 3 = ~3000 IOPS with block
size = 4K.
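
(As a quick sanity check in the shell: echo $(( 200 * 60 * 75 / 100 / 3 ))
prints 3000.)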

That's what the OP is getting, give or take.

/Z

On Thu, 10 Aug 2023 at 20:20, Anthony D'Atri  wrote:

>
>
> >
> > Good afternoon everybody!
> >
> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
>
> Old hardware?  SAS is mostly dead.
>
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
>
> On what kind of client?
>
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> > which is quite annoying. I achieve the same rate if rw=read.
>
> If your client is VM especially, check if you have IOPS throttling. With
> small block sizes you'll throttle IOPS long before bandwidth.
>
> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
> the
> > same result.
>
> SAS has a price premium over SATA, and still requires an HBA.  Many
> chassis vendors really want you to buy an anachronistic RoC HBA.
>
> Eschewing SAS and the HBA helps close the gap to justify SSDs, the TCO
> just doesn't favor spinners.
>
> > Maybe the 5 host cluster is not
> > saturated by your current fio test. Try running 2 or 4 in parallel.
>
>
> Agreed that Ceph is a scale out solution, not DAS, but note the difference
> reported with a larger block size.
>
> >How is this related to 60 drives? His test is only on 3 drives at a time
> not?
>
> RBD volumes by and large will live on most or all OSDs in the pool.
>
>
>
>
> >
> > I don't know exactly what to look for or configure to have any
> improvement.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Murilo Morais
On Thu, Aug 10, 2023 at 12:47, Hans van den Bogert <hansbog...@gmail.com> wrote:

> On Thu, Aug 10, 2023, 17:36 Murilo Morais  wrote:
>
> > Good afternoon everybody!
> >
> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
> >
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
> >
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> >
> This is 3000iops. I would call that bad for 60 drives and a replication of
> 3. Which amount of iops did you expect?
>
> which is quite annoying. I achieve the same rate if rw=read.
> >
> > If I use librbd's cache I get a considerable improvement in writing, but
> > reading remains the same.
> >
> > I already tested with rbd_read_from_replica_policy=balance but I didn't
> > notice any difference. I tried to leave readahead enabled by setting
> > rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> > sequential reading either.
> >
> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
> the
> > same result.
> >
> This I concur is a weird result compared to 60 disks. Are you using the
> same disks and all other parameters the same, like the replication factor?
> Is the performance really the same? Maybe the 5 host cluster is not
> saturated by your current fio test. Try running 2 or 4 in parallel.
>
Yes to all of those. I will try running more jobs in parallel and compare the results.

>
> >
> > I don't know exactly what to look for or configure to have any
> improvement.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Murilo Morais
On Thu, Aug 10, 2023 at 13:01, Marc  wrote:

> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
> >
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
> >
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> > which is quite annoying. I achieve the same rate if rw=read.
> >
> > If I use librbd's cache I get a considerable improvement in writing, but
> > reading remains the same.
> >
> > I already tested with rbd_read_from_replica_policy=balance but I didn't
> > notice any difference. I tried to leave readahead enabled by setting
> > rbd_readahead_disable_after_bytes=0 but I didn't see any difference in
> > sequential reading either.
> >
> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
> the
> > same result.
> >
> > I don't know exactly what to look for or configure to have any
> improvement.
>
> What are you expecting?
>
I expected something a little better (at least for reads), since the other
cluster, with fewer disks, shows the same rates. :(

>
> This is what I have on a vm with an rbd from a hdd pool
>
>
> 
>
I'm using exactly this in libvirt.

>
> [@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k
> -size=1G -iodepth=16 -rw=write -filename=./test.img
> test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=libaio, iodepth=16
> fio-3.7
> Starting 1 process
> Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=57.5MiB/s][r=0,w=14.7k IOPS][eta
> 00m:00s]
>
With writeback I get a constant 100 MB/s, which is pretty good. I can live
with writeback.
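
For reference, writeback here just means the cache attribute on the disk's
driver element in the libvirt domain XML, roughly like this (a sketch; driver
type and device depend on your setup), together with the rbd_cache /
rbd_cache_policy = writeback settings on the librbd side:

<driver name='qemu' type='raw' cache='writeback'/>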

>
>
> [@~]# fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k
> -size=1G -iodepth=1 -rw=write -filename=./test.img
> test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.7
> Starting 1 process
> Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=19.9MiB/s][r=0,w=5090 IOPS][eta
> 00m:00s]
>
>
Thanks for sharing your results; it's something I can compare against.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd 4k read/write?

2023-08-10 Thread Murilo Morais
It makes sense.

On Thu, Aug 10, 2023 at 16:04, Zakhar Kirpichenko  wrote:

> Hi,
>
> You can use the following formula to roughly calculate the IOPS you can
> get from a cluster: (Drive_IOPS * Number_of_Drives * 0.75) / Cluster_Size.
>
> For example, for 60 10K rpm SAS drives each capable of 200 4K IOPS and a
> replicated pool with size 3: (~200 * 60 * 0.75) / 3 = ~3000 IOPS with block
> size = 4K.
>
> That's what the OP is getting, give or take.
>
> /Z
>
> On Thu, 10 Aug 2023 at 20:20, Anthony D'Atri  wrote:
>
>>
>>
>> >
>> > Good afternoon everybody!
>> >
>> > I have the following scenario:
>> > Pool RBD replication x3
>> > 5 hosts with 12 SAS spinning disks each
>>
>> Old hardware?  SAS is mostly dead.
>>
>> > I'm using exactly the following line with FIO to test:
>> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
>> > -iodepth=16 -rw=write -filename=./test.img
>>
>> On what kind of client?
>>
>> > If I increase the blocksize I can easily reach 1.5 GBps or more.
>> >
>> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
>> > which is quite annoying. I achieve the same rate if rw=read.
>>
>> If your client is VM especially, check if you have IOPS throttling. With
>> small block sizes you'll throttle IOPS long before bandwidth.
>>
>> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
>> the
>> > same result.
>>
>> SAS has a price premium over SATA, and still requires an HBA.  Many
>> chassis vendors really want you to buy an anachronistic RoC HBA.
>>
>> Eschewing SAS and the HBA helps close the gap to justify SSDs, the TCO
>> just doesn't favor spinners.
>>
>> > Maybe the 5 host cluster is not
>> > saturated by your current fio test. Try running 2 or 4 in parallel.
>>
>>
>> Agreed that Ceph is a scale out solution, not DAS, but note the
>> difference reported with a larger block size.
>>
>> >How is this related to 60 drives? His test is only on 3 drives at a time
>> not?
>>
>> RBD volumes by and large will live on most or all OSDs in the pool.
>>
>>
>>
>>
>> >
>> > I don't know exactly what to look for or configure to have any
>> improvement.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io