[ceph-users] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Amudhan P
No, ping with MTU size 9000 didn't work.

On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
wrote:

> Does your ping work or not?
>
>
> On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:
>
>> Yes, I have applied the setting on the switch side also.
>>
>> On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
>> wrote:
>>
>>> The problem should be with the network. When you change the MTU it should be
>>> changed all over the network; every single hop on your network should speak
>>> and accept 9000 MTU packets. You can check it on your hosts with the
>>> "ifconfig" command, and there are also equivalent commands for other
>>> network/security devices.
>>>
>>> If you have just one node which is not correctly configured for MTU 9000,
>>> it won't work.
>>>
>>> On Sat, May 23, 2020 at 2:30 PM si...@turka.nl  wrote:
>>>
 Can the servers/nodes ping eachother using large packet sizes? I guess
 not.

 Sinan Polat

 > On 23 May 2020 at 14:21, Amudhan P  wrote the following:
 >
 > In OSD logs "heartbeat_check: no reply from OSD"
 >
 >> On Sat, May 23, 2020 at 5:44 PM Amudhan P 
 wrote:
 >>
 >> Hi,
 >>
 >> I have set Network switch with MTU size 9000 and also in my netplan
 >> configuration.
 >>
 >> What else needs to be checked?
 >>
 >>
 >>> On Sat, May 23, 2020 at 3:39 PM Wido den Hollander 
 wrote:
 >>>
 >>>
 >>>
  On 5/23/20 12:02 PM, Amudhan P wrote:
  Hi,
 
  I am using Ceph Nautilus on Ubuntu 18.04, working fine with MTU size 1500
  (default). Recently I tried to update the MTU size to 9000.
  After setting jumbo frames, running ceph -s is timing out.
 >>>
 >>> Ceph can run just fine with an MTU of 9000. But there is probably
 >>> something else wrong on the network which is causing this.
 >>>
 >>> Check the Jumbo Frames settings on all the switches as well to make
 sure
 >>> they forward all the packets.
 >>>
 >>> This is definitely not a Ceph issue.
 >>>
 >>> Wido
 >>>
 
  regards
  Amudhan P
  ___
  ceph-users mailing list -- ceph-users@ceph.io
  To unsubscribe send an email to ceph-users-le...@ceph.io
 
 >>> ___
 >>> ceph-users mailing list -- ceph-users@ceph.io
 >>> To unsubscribe send an email to ceph-users-le...@ceph.io
 >>>
 >>
 > ___
 > ceph-users mailing list -- ceph-users@ceph.io
 > To unsubscribe send an email to ceph-users-le...@ceph.io

 ___
 ceph-users mailing list -- ceph-users@ceph.io
 To unsubscribe send an email to ceph-users-le...@ceph.io

>>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Amudhan P
It's a Dell S4048T-ON switch using 10G Ethernet.

On Sat, May 23, 2020 at 11:05 PM apely agamakou  wrote:

> Hi,
>
> Please check your MTU limit at the switch level, and check other
> resources with an ICMP ping.
> Try adding 14 bytes for the Ethernet header at your switch level, meaning an
> MTU of 9014. Are you using Juniper?
>
> Example: ping -D -s 9 other_ip
>
>
>
> On Sat, 23 May 2020 at 15:18, Khodayar Doustar  wrote:
>
>> The problem should be with the network. When you change the MTU it should be
>> changed all over the network; every single hop on your network should speak
>> and accept 9000 MTU packets. You can check it on your hosts with the
>> "ifconfig" command, and there are also equivalent commands for other
>> network/security devices.
>>
>> If you have just one node which is not correctly configured for MTU 9000,
>> it won't work.
>>
>> On Sat, May 23, 2020 at 2:30 PM si...@turka.nl  wrote:
>>
>> > Can the servers/nodes ping eachother using large packet sizes? I guess
>> not.
>> >
>> > Sinan Polat
>> >
>> > > Op 23 mei 2020 om 14:21 heeft Amudhan P  het
>> > volgende geschreven:
>> > >
>> > > In OSD logs "heartbeat_check: no reply from OSD"
>> > >
>> > >> On Sat, May 23, 2020 at 5:44 PM Amudhan P 
>> wrote:
>> > >>
>> > >> Hi,
>> > >>
>> > >> I have set Network switch with MTU size 9000 and also in my netplan
>> > >> configuration.
>> > >>
>> > >> What else needs to be checked?
>> > >>
>> > >>
>> > >>> On Sat, May 23, 2020 at 3:39 PM Wido den Hollander 
>> > wrote:
>> > >>>
>> > >>>
>> > >>>
>> >  On 5/23/20 12:02 PM, Amudhan P wrote:
>> >  Hi,
>> > 
>> >  I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU size
>> > 1500
>> >  (default) recently i tried to update MTU size to 9000.
>> >  After setting Jumbo frame running ceph -s is timing out.
>> > >>>
>> > >>> Ceph can run just fine with an MTU of 9000. But there is probably
>> > >>> something else wrong on the network which is causing this.
>> > >>>
>> > >>> Check the Jumbo Frames settings on all the switches as well to make
>> > sure
>> > >>> they forward all the packets.
>> > >>>
>> > >>> This is definitely not a Ceph issue.
>> > >>>
>> > >>> Wido
>> > >>>
>> > 
>> >  regards
>> >  Amudhan P
>> >  ___
>> >  ceph-users mailing list -- ceph-users@ceph.io
>> >  To unsubscribe send an email to ceph-users-le...@ceph.io
>> > 
>> > >>> ___
>> > >>> ceph-users mailing list -- ceph-users@ceph.io
>> > >>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> > >>>
>> > >>
>> > > ___
>> > > ceph-users mailing list -- ceph-users@ceph.io
>> > > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephfs IO halt on Node failure

2020-05-24 Thread Amudhan P
Sorry for the late reply.
I have pasted the crush map at the URL below: https://pastebin.com/ASPpY2VB
My osd tree output follows; this issue occurs only when I use it with
file layouts.

ID CLASS WEIGHT    TYPE NAME      STATUS REWEIGHT PRI-AFF
-1   327.48047 root default
-3   109.16016 host strgsrv01
 0   hdd   5.45799 osd.0  up  1.0 1.0
 2   hdd   5.45799 osd.2  up  1.0 1.0
 3   hdd   5.45799 osd.3  up  1.0 1.0
 4   hdd   5.45799 osd.4  up  1.0 1.0
 5   hdd   5.45799 osd.5  up  1.0 1.0
 6   hdd   5.45799 osd.6  up  1.0 1.0
 7   hdd   5.45799 osd.7  up  1.0 1.0
19   hdd   5.45799 osd.19 up  1.0 1.0
20   hdd   5.45799 osd.20 up  1.0 1.0
21   hdd   5.45799 osd.21 up  1.0 1.0
22   hdd   5.45799 osd.22 up  1.0 1.0
23   hdd   5.45799 osd.23 up  1.0 1.0
-5   109.16016 host strgsrv02
 1   hdd   5.45799 osd.1  up  1.0 1.0
 8   hdd   5.45799 osd.8  up  1.0 1.0
 9   hdd   5.45799 osd.9  up  1.0 1.0
10   hdd   5.45799 osd.10 up  1.0 1.0
11   hdd   5.45799 osd.11 up  1.0 1.0
12   hdd   5.45799 osd.12 up  1.0 1.0
24   hdd   5.45799 osd.24 up  1.0 1.0
25   hdd   5.45799 osd.25 up  1.0 1.0
26   hdd   5.45799 osd.26 up  1.0 1.0
27   hdd   5.45799 osd.27 up  1.0 1.0
28   hdd   5.45799 osd.28 up  1.0 1.0
29   hdd   5.45799 osd.29 up  1.0 1.0
-7   109.16016 host strgsrv03
13   hdd   5.45799 osd.13 up  1.0 1.0
14   hdd   5.45799 osd.14 up  1.0 1.0
15   hdd   5.45799 osd.15 up  1.0 1.0
16   hdd   5.45799 osd.16 up  1.0 1.0
17   hdd   5.45799 osd.17 up  1.0 1.0
18   hdd   5.45799 osd.18 up  1.0 1.0
30   hdd   5.45799 osd.30 up  1.0 1.0
31   hdd   5.45799 osd.31 up  1.0 1.0
32   hdd   5.45799 osd.32 up  1.0 1.0
33   hdd   5.45799 osd.33 up  1.0 1.0
34   hdd   5.45799 osd.34 up  1.0 1.0
35   hdd   5.45799 osd.35 up  1.0 1.0
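
For completeness, the pool and layout settings being discussed can be inspected
with something like the following (the pool and directory names are placeholders,
so adjust them to your setup):

  ceph osd pool ls detail                        # size, min_size and crush_rule per pool
  ceph osd crush rule dump                       # full definitions of the crush rules
  ceph osd pool get <data-pool> min_size         # e.g. for the replica-2 or EC data pools
  getfattr -n ceph.dir.layout /mnt/cephfs/<dir>  # which data pool a folder's file layout points at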

On Tue, May 19, 2020 at 12:16 PM Eugen Block  wrote:

> Was that a typo and did you mean you changed min_size to 1? An I/O pause with
> min_size 1 and size 2 is unexpected; can you share more details like
> your crush map and your osd tree?
>
>
> Quoting Amudhan P :
>
> > Behaviour is same even after setting min_size 2.
> >
> > On Mon 18 May, 2020, 12:34 PM Eugen Block,  wrote:
> >
> >> If your pool has a min_size 2 and size 2 (always a bad idea) it will
> >> pause IO in case of a failure until the recovery has finished. So the
> >> described behaviour is expected.
> >>
> >>
> >> Quoting Amudhan P :
> >>
> >> > Hi,
> >> >
> >> > Crush rule is "replicated" and min_size 2 actually. I am trying to
> test
> >> > multiple volume configs in a single filesystem
> >> > using file layout.
> >> >
> >> > I have created metadata pool with rep 3 (min_size2 and replicated
> crush
> >> > rule) and data pool with rep 3  (min_size2 and replicated crush rule).
> >> and
> >> > also  I have created multiple (replica 2, ec2-1 & ec4-2) pools and
> added
> >> to
> >> > the filesystem.
> >> >
> >> > Using file layout I have set different data pool to a different
> folders.
> >> so
> >> > I can test different configs in the same filesystem. all data pools
> >> > min_size set to handle single node failure.
> >> >
> >> > Single node failure is handled properly when only having metadata pool
> >> and
> >> > one data pool (rep3).
> >> >
> >> > After adding additional data pool to fs, single node failure scenario
> is
> >> > not working.
> >> >
> >> > regards
> >> > Amudhan P
> >> >
> >> > On Sun, May 17, 2020 at 1:29 AM Eugen Block  wrote:
> >> >
> >> >> What’s your pool configuration wrt min_size and crush rules?
> >> >>
> >> >>
> >> >> Quoting Amudhan P :
> >> >>
> >> >> > Hi,
> >> >> >
> >> >> > I am using ceph Nautilus cluster with below configuration.
> >> >> >
> >> >> > 3 node's (Ubuntu 18.04) each has 12 OSD's, and mds, mon and mgr are
> >> >> running
> >> >> > in shared mode.
> >> >> >
> >> >> > The client mounted through ceph kernel client.
> >> >> >
> >> >> > I was trying to emulate a node failure when a write and read were
> >> going
> >> >> on
> >> >> > (replica2) pool.
> >> >> >
> >> >> > I was expecting read and write continue after a small pause due to
> a
> >> Node
> >> >> > failure but it halts and never resumes until the failed node is up.
> >> >> >
>

[ceph-users] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Khodayar Doustar
So this is your problem; it has nothing to do with Ceph. Just fix the
network or roll back all the changes.
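
If you do roll back while debugging, a quick sketch of reverting a host (the
interface name here is just an example; make the same change persistent in
netplan afterwards) would be:

  ip link set dev eth0 mtu 1500        # put the NIC back on the default MTU
  ping -M do -s 8972 <other_node_ip>   # later, verify jumbo frames end to end before re-enabling 9000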

On Sun, May 24, 2020 at 9:05 AM Amudhan P  wrote:

> No, ping with MTU size 9000 didn't work.
>
> On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
> wrote:
>
> > Does your ping work or not?
> >
> >
> > On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:
> >
> >> Yes, I have set setting on the switch side also.
> >>
> >> On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
> >> wrote:
> >>
> >>> Problem should be with network. When you change MTU it should be
> changed
> >>> all over the network, any single hup on your network should speak and
> >>> accept 9000 MTU packets. you can check it on your hosts with "ifconfig"
> >>> command and there is also equivalent commands for other
> network/security
> >>> devices.
> >>>
> >>> If you have just one node which it not correctly configured for MTU
> 9000
> >>> it wouldn't work.
> >>>
> >>> On Sat, May 23, 2020 at 2:30 PM si...@turka.nl  wrote:
> >>>
>  Can the servers/nodes ping eachother using large packet sizes? I guess
>  not.
> 
>  Sinan Polat
> 
>  > Op 23 mei 2020 om 14:21 heeft Amudhan P  het
>  volgende geschreven:
>  >
>  > In OSD logs "heartbeat_check: no reply from OSD"
>  >
>  >> On Sat, May 23, 2020 at 5:44 PM Amudhan P 
>  wrote:
>  >>
>  >> Hi,
>  >>
>  >> I have set Network switch with MTU size 9000 and also in my netplan
>  >> configuration.
>  >>
>  >> What else needs to be checked?
>  >>
>  >>
>  >>> On Sat, May 23, 2020 at 3:39 PM Wido den Hollander  >
>  wrote:
>  >>>
>  >>>
>  >>>
>   On 5/23/20 12:02 PM, Amudhan P wrote:
>   Hi,
>  
>   I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU
> size
>  1500
>   (default) recently i tried to update MTU size to 9000.
>   After setting Jumbo frame running ceph -s is timing out.
>  >>>
>  >>> Ceph can run just fine with an MTU of 9000. But there is probably
>  >>> something else wrong on the network which is causing this.
>  >>>
>  >>> Check the Jumbo Frames settings on all the switches as well to
> make
>  sure
>  >>> they forward all the packets.
>  >>>
>  >>> This is definitely not a Ceph issue.
>  >>>
>  >>> Wido
>  >>>
>  
>   regards
>   Amudhan P
>   ___
>   ceph-users mailing list -- ceph-users@ceph.io
>   To unsubscribe send an email to ceph-users-le...@ceph.io
>  
>  >>> ___
>  >>> ceph-users mailing list -- ceph-users@ceph.io
>  >>> To unsubscribe send an email to ceph-users-le...@ceph.io
>  >>>
>  >>
>  > ___
>  > ceph-users mailing list -- ceph-users@ceph.io
>  > To unsubscribe send an email to ceph-users-le...@ceph.io
> 
>  ___
>  ceph-users mailing list -- ceph-users@ceph.io
>  To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> >>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Amudhan P
I didn't make any changes, but it has started working now with jumbo frames.

On Sun, May 24, 2020 at 1:04 PM Khodayar Doustar 
wrote:

> So this is your problem, it has nothing to do with Ceph. Just fix the
> network or rollback all changes.
>
> On Sun, May 24, 2020 at 9:05 AM Amudhan P  wrote:
>
>> No, ping with MTU size 9000 didn't work.
>>
>> On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
>> wrote:
>>
>> > Does your ping work or not?
>> >
>> >
>> > On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:
>> >
>> >> Yes, I have set setting on the switch side also.
>> >>
>> >> On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
>> >> wrote:
>> >>
>> >>> Problem should be with network. When you change MTU it should be
>> changed
>> >>> all over the network, any single hup on your network should speak and
>> >>> accept 9000 MTU packets. you can check it on your hosts with
>> "ifconfig"
>> >>> command and there is also equivalent commands for other
>> network/security
>> >>> devices.
>> >>>
>> >>> If you have just one node which it not correctly configured for MTU
>> 9000
>> >>> it wouldn't work.
>> >>>
>> >>> On Sat, May 23, 2020 at 2:30 PM si...@turka.nl 
>> wrote:
>> >>>
>>  Can the servers/nodes ping eachother using large packet sizes? I
>> guess
>>  not.
>> 
>>  Sinan Polat
>> 
>>  > Op 23 mei 2020 om 14:21 heeft Amudhan P  het
>>  volgende geschreven:
>>  >
>>  > In OSD logs "heartbeat_check: no reply from OSD"
>>  >
>>  >> On Sat, May 23, 2020 at 5:44 PM Amudhan P 
>>  wrote:
>>  >>
>>  >> Hi,
>>  >>
>>  >> I have set Network switch with MTU size 9000 and also in my
>> netplan
>>  >> configuration.
>>  >>
>>  >> What else needs to be checked?
>>  >>
>>  >>
>>  >>> On Sat, May 23, 2020 at 3:39 PM Wido den Hollander <
>> w...@42on.com>
>>  wrote:
>>  >>>
>>  >>>
>>  >>>
>>   On 5/23/20 12:02 PM, Amudhan P wrote:
>>   Hi,
>>  
>>   I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU
>> size
>>  1500
>>   (default) recently i tried to update MTU size to 9000.
>>   After setting Jumbo frame running ceph -s is timing out.
>>  >>>
>>  >>> Ceph can run just fine with an MTU of 9000. But there is probably
>>  >>> something else wrong on the network which is causing this.
>>  >>>
>>  >>> Check the Jumbo Frames settings on all the switches as well to
>> make
>>  sure
>>  >>> they forward all the packets.
>>  >>>
>>  >>> This is definitely not a Ceph issue.
>>  >>>
>>  >>> Wido
>>  >>>
>>  
>>   regards
>>   Amudhan P
>>   ___
>>   ceph-users mailing list -- ceph-users@ceph.io
>>   To unsubscribe send an email to ceph-users-le...@ceph.io
>>  
>>  >>> ___
>>  >>> ceph-users mailing list -- ceph-users@ceph.io
>>  >>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>  >>>
>>  >>
>>  > ___
>>  > ceph-users mailing list -- ceph-users@ceph.io
>>  > To unsubscribe send an email to ceph-users-le...@ceph.io
>> 
>>  ___
>>  ceph-users mailing list -- ceph-users@ceph.io
>>  To unsubscribe send an email to ceph-users-le...@ceph.io
>> 
>> >>>
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Suresh Rama
As I said, a ping with a 9000-byte payload won't get a response; it should be
8972. Glad it is working, but you should understand what happened so you can
avoid this issue later.
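
For example, with a 9000-byte MTU the largest ICMP payload that fits is
9000 - 20 (IP header) - 8 (ICMP header) = 8972 bytes, so a non-fragmenting
test looks something like this (the hostname is a placeholder):

  ping -M do -s 8972 <other_node>   # must succeed on every jumbo-enabled link
  ping -M do -s 8973 <other_node>   # should fail with "message too long" if the path MTU is exactly 9000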

On Sun, May 24, 2020, 3:04 AM Amudhan P  wrote:

> No, ping with MTU size 9000 didn't work.
>
> On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
> wrote:
>
> > Does your ping work or not?
> >
> >
> > On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:
> >
> >> Yes, I have set setting on the switch side also.
> >>
> >> On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
> >> wrote:
> >>
> >>> Problem should be with network. When you change MTU it should be
> changed
> >>> all over the network, any single hup on your network should speak and
> >>> accept 9000 MTU packets. you can check it on your hosts with "ifconfig"
> >>> command and there is also equivalent commands for other
> network/security
> >>> devices.
> >>>
> >>> If you have just one node which it not correctly configured for MTU
> 9000
> >>> it wouldn't work.
> >>>
> >>> On Sat, May 23, 2020 at 2:30 PM si...@turka.nl  wrote:
> >>>
>  Can the servers/nodes ping eachother using large packet sizes? I guess
>  not.
> 
>  Sinan Polat
> 
>  > Op 23 mei 2020 om 14:21 heeft Amudhan P  het
>  volgende geschreven:
>  >
>  > In OSD logs "heartbeat_check: no reply from OSD"
>  >
>  >> On Sat, May 23, 2020 at 5:44 PM Amudhan P 
>  wrote:
>  >>
>  >> Hi,
>  >>
>  >> I have set Network switch with MTU size 9000 and also in my netplan
>  >> configuration.
>  >>
>  >> What else needs to be checked?
>  >>
>  >>
>  >>> On Sat, May 23, 2020 at 3:39 PM Wido den Hollander  >
>  wrote:
>  >>>
>  >>>
>  >>>
>   On 5/23/20 12:02 PM, Amudhan P wrote:
>   Hi,
>  
>   I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU
> size
>  1500
>   (default) recently i tried to update MTU size to 9000.
>   After setting Jumbo frame running ceph -s is timing out.
>  >>>
>  >>> Ceph can run just fine with an MTU of 9000. But there is probably
>  >>> something else wrong on the network which is causing this.
>  >>>
>  >>> Check the Jumbo Frames settings on all the switches as well to
> make
>  sure
>  >>> they forward all the packets.
>  >>>
>  >>> This is definitely not a Ceph issue.
>  >>>
>  >>> Wido
>  >>>
>  
>   regards
>   Amudhan P
>   ___
>   ceph-users mailing list -- ceph-users@ceph.io
>   To unsubscribe send an email to ceph-users-le...@ceph.io
>  
>  >>> ___
>  >>> ceph-users mailing list -- ceph-users@ceph.io
>  >>> To unsubscribe send an email to ceph-users-le...@ceph.io
>  >>>
>  >>
>  > ___
>  > ceph-users mailing list -- ceph-users@ceph.io
>  > To unsubscribe send an email to ceph-users-le...@ceph.io
> 
>  ___
>  ceph-users mailing list -- ceph-users@ceph.io
>  To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> >>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: question on ceph node count

2020-05-24 Thread tim taler
Yep, my fault, I meant replication = 3.

> > but aren't PGs checksummed so from the remaining PG (given the
> > checksum would be right) two new copies could be created?
>
> Assuming again 3R on 5 nodes, failure domain of host, if 2 nodes go down, 
> there will be 1/3 copies available.  Normally a 3R pool has min_size set to 2.
>
> You can set min_size to 1 temporarily, then those PGs will become active and 
> copies will be created to restore redundancy, but if that remaining OSD is 
> damaged, if there’s a DIMM flake, a cosmic ray, if the wrong OSD crashes or 
> restarts at the wrong time, you can find yourself without the most recent 
> copy of data and be unable to recover.  It’s Russian Roulette.

I see, but wouldn't Ceph try to recreate redundancy on its own
(unless I explicitly tell it not to do so)?
And if the I/O and load on the cluster aren't too high, and disk speed and
network connectivity are good, wouldn't it recover fairly quickly into a
healthy, redundant state?

Anyhow, I'm not planning on crashing two nodes ;-) I just wanted to get
a feeling for how much more secure/robust a setup with five nodes is
compared to one with four nodes.
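
For reference, the min_size knob mentioned in the quoted advice above can be
checked and (temporarily) changed with something like this; the pool name is a
placeholder and dropping to 1 is an emergency measure only:

  ceph osd pool get <pool> min_size      # check the current value
  ceph osd pool set <pool> min_size 1    # temporary, lets PGs go active while redundancy is rebuilt
  ceph -w                                # watch recovery progress
  ceph osd pool set <pool> min_size 2    # restore the safe value once recovery finishes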
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Martin Verges
Just save yourself the trouble. You won't see any real benefit from MTU
9000. It brings some small gains, but for most environments it is not worth
the effort, the problems, and the loss of reliability.
Try it yourself and do some benchmarks, especially with your regular
workload on the cluster (not the maximum peak performance), then drop the
MTU back to the default ;).

If anyone has other real-world benchmarks showing huge differences
in regular Ceph clusters, please feel free to post them here.
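
A simple before/after comparison, run once with MTU 9000 and once with the
default, could be something along these lines (the pool name is a placeholder,
and rados bench only measures RADOS, not your real client workload):

  rados bench -p <testpool> 60 write --no-cleanup   # 60-second write benchmark
  rados bench -p <testpool> 60 seq                  # sequential reads of the objects just written
  rados bench -p <testpool> 60 rand                 # random reads
  rados -p <testpool> cleanup                       # remove the benchmark objects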

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Sun, 24 May 2020 at 15:54, Suresh Rama  wrote:

> Ping with 9000 MTU won't get response as I said and it should be 8972. Glad
> it is working but you should know what happened to avoid this issue later.
>
> On Sun, May 24, 2020, 3:04 AM Amudhan P  wrote:
>
> > No, ping with MTU size 9000 didn't work.
> >
> > On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
> > wrote:
> >
> > > Does your ping work or not?
> > >
> > >
> > > On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:
> > >
> > >> Yes, I have set setting on the switch side also.
> > >>
> > >> On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
> > >> wrote:
> > >>
> > >>> Problem should be with network. When you change MTU it should be
> > changed
> > >>> all over the network, any single hup on your network should speak and
> > >>> accept 9000 MTU packets. you can check it on your hosts with
> "ifconfig"
> > >>> command and there is also equivalent commands for other
> > network/security
> > >>> devices.
> > >>>
> > >>> If you have just one node which it not correctly configured for MTU
> > 9000
> > >>> it wouldn't work.
> > >>>
> > >>> On Sat, May 23, 2020 at 2:30 PM si...@turka.nl 
> wrote:
> > >>>
> >  Can the servers/nodes ping eachother using large packet sizes? I
> guess
> >  not.
> > 
> >  Sinan Polat
> > 
> >  > Op 23 mei 2020 om 14:21 heeft Amudhan P  het
> >  volgende geschreven:
> >  >
> >  > In OSD logs "heartbeat_check: no reply from OSD"
> >  >
> >  >> On Sat, May 23, 2020 at 5:44 PM Amudhan P 
> >  wrote:
> >  >>
> >  >> Hi,
> >  >>
> >  >> I have set Network switch with MTU size 9000 and also in my
> netplan
> >  >> configuration.
> >  >>
> >  >> What else needs to be checked?
> >  >>
> >  >>
> >  >>> On Sat, May 23, 2020 at 3:39 PM Wido den Hollander <
> w...@42on.com
> > >
> >  wrote:
> >  >>>
> >  >>>
> >  >>>
> >   On 5/23/20 12:02 PM, Amudhan P wrote:
> >   Hi,
> >  
> >   I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU
> > size
> >  1500
> >   (default) recently i tried to update MTU size to 9000.
> >   After setting Jumbo frame running ceph -s is timing out.
> >  >>>
> >  >>> Ceph can run just fine with an MTU of 9000. But there is
> probably
> >  >>> something else wrong on the network which is causing this.
> >  >>>
> >  >>> Check the Jumbo Frames settings on all the switches as well to
> > make
> >  sure
> >  >>> they forward all the packets.
> >  >>>
> >  >>> This is definitely not a Ceph issue.
> >  >>>
> >  >>> Wido
> >  >>>
> >  
> >   regards
> >   Amudhan P
> >   ___
> >   ceph-users mailing list -- ceph-users@ceph.io
> >   To unsubscribe send an email to ceph-users-le...@ceph.io
> >  
> >  >>> ___
> >  >>> ceph-users mailing list -- ceph-users@ceph.io
> >  >>> To unsubscribe send an email to ceph-users-le...@ceph.io
> >  >>>
> >  >>
> >  > ___
> >  > ceph-users mailing list -- ceph-users@ceph.io
> >  > To unsubscribe send an email to ceph-users-le...@ceph.io
> > 
> >  ___
> >  ceph-users mailing list -- ceph-users@ceph.io
> >  To unsubscribe send an email to ceph-users-le...@ceph.io
> > 
> > >>>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW REST API failed request with status code 403

2020-05-24 Thread apely agamakou
Hi,
Since my upgrade from 15.2.1 to 15.2.2 I've been getting this error message in
the "Object Gateway" section of the dashboard.

RGW REST API failed request with status code 403
(b'{"Code":"InvalidAccessKeyId","RequestId":"tx00017-005ecac06c'
b'-e349-eu-west-1","HostId":"e349-eu-west-1-default"}')

I did try to change my secret key and access key, without success.
I took a tcpdump and didn't see anything special like JSON escape characters,
etc.
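
For reference, what I tried was roughly the following (key values omitted, and
the exact syntax may differ slightly between releases), since the dashboard
keeps its own copy of the RGW credentials:

  radosgw-admin user info --uid=<dashboard-user>   # confirm the keys RGW itself has
  ceph dashboard set-rgw-api-access-key <access-key>
  ceph dashboard set-rgw-api-secret-key <secret-key>
  ceph mgr module disable dashboard
  ceph mgr module enable dashboard                 # restart the dashboard module to pick up the change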

Has anybody had the same issue?

Regards.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW Garbage Collector

2020-05-24 Thread EDH - Manuel Rios
Hi,

I'm looking for any experience optimizing the garbage collector with the
following configs:

global  advanced rgw_gc_obj_min_wait
global  advanced rgw_gc_processor_max_time
global  advanced rgw_gc_processor_period

By default the GC expires objects after 2 hours; we're looking to set the expiry
to 10 minutes, as our S3 cluster gets heavy uploads and deletes.

Are those params usable? For us it doesn't make sense to keep deleted objects in
the GC for 2 hours.

Regards
Manuel

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW Garbage Collector

2020-05-24 Thread Matt Benjamin
Hi Manuel,

rgw_gc_obj_min_wait -- yes, this is how you control how long rgw waits
before removing the stripes of deleted objects

The following are more about GC performance and the proportion of available IOPS:
rgw_gc_processor_max_time -- controls how long GC runs once scheduled;
 a large value might be 3600
rgw_gc_processor_period -- sets the GC cycle;  smaller is more frequent

If you want to make GC more aggressive when it is running, set the
following (they can be increased further), which more than doubles the defaults:

rgw_gc_max_concurrent_io = 20
rgw_gc_max_trim_chunk = 32

If you want to increase GC's fraction of total RGW I/O, increase these
(mostly concurrent_io).
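
A sketch of what that could look like for the 10-minute target mentioned
earlier (the values are illustrations, not recommendations; restart or reload
the RGW daemons if the values don't take effect at runtime):

  ceph config set client.rgw rgw_gc_obj_min_wait 600          # 10 minutes instead of 2 hours
  ceph config set client.rgw rgw_gc_processor_max_time 3600
  ceph config set client.rgw rgw_gc_processor_period 3600
  ceph config set client.rgw rgw_gc_max_concurrent_io 20
  ceph config set client.rgw rgw_gc_max_trim_chunk 32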

regards,

Matt

On Sun, May 24, 2020 at 4:02 PM EDH - Manuel Rios
 wrote:
>
> Hi,
>
> Im looking for any experience optimizing garbage collector with the next 
> configs:
>
> global  advanced rgw_gc_obj_min_wait
> global  advanced rgw_gc_processor_max_time
> global  advanced rgw_gc_processor_period
>
> By default gc expire objects within 2 hours, we're looking to define expire 
> in 10 minutes as our S3 cluster got heavy uploads and deletes.
>
> Are those params usable? For us doesn't have sense store delete objects 2 
> hours in a gc.
>
> Regards
> Manuel
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW Garbage Collector

2020-05-24 Thread EDH - Manuel Rios
Thanks Matt for the fast response; tonight at the datacenter we are adding more
OSDs for S3.

Will change the params and come back to share the experience.

Regards
Manuel


-Original Message-
From: Matt Benjamin
Sent: Sunday, 24 May 2020 22:47
To: EDH - Manuel Rios
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] RGW Garbage Collector

Hi Manuel,

rgw_gc_obj_min_wait -- yes, this is how you control how long rgw waits before 
removing the stripes of deleted objects

the following are more gc performance and proportion of available iops:
rgw_gc_processor_max_time -- controls how long gc runs once scheduled;  a large 
value might be 3600 rgw_gc_processor_period -- sets the gc cycle;  smaller is 
more frequent

If you want to make gc more aggressive when it is running, set the following 
(can be increased), which more than doubles the :

rgw_gc_max_concurrent_io = 20
rgw_gc_max_trim_chunk = 32

If you want to increase gc fraction of total rgw i/o, increase these (mostly, 
concurrent_io).

regards,

Matt

On Sun, May 24, 2020 at 4:02 PM EDH - Manuel Rios  
wrote:
>
> Hi,
>
> Im looking for any experience optimizing garbage collector with the next 
> configs:
>
> global  advanced rgw_gc_obj_min_wait
> global  advanced rgw_gc_processor_max_time
> global  advanced rgw_gc_processor_period
>
> By default gc expire objects within 2 hours, we're looking to define expire 
> in 10 minutes as our S3 cluster got heavy uploads and deletes.
>
> Are those params usable? For us doesn't have sense store delete objects 2 
> hours in a gc.
>
> Regards
> Manuel
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [External Email] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Dave Hall

Amudhan,

Here is a trick I've used to test and evaluate Jumbo Frames without 
breaking production traffic:


 * Open a couple of root ssh sessions on each of the two systems you want
   to test with.
     - In one window, start a continuous ping to the other system.
 * On both test systems:
     - Look at the output of 'ip route list'.
     - Use 'ip route change' to explicitly set the MTU of all of your
       routes to 1500.
     - Review the output of 'ip route list' to assure you got them all.
     - Use 'ip link set' to set the MTU of the interface.
         - The size of all packets is still 1500 due to the route MTUs.
         - Ping should still be working like normal.
 * Use 'ip addr add' to create a test logical IP on each of the two
   systems - use something unrelated to your production IP addresses,
   like 2.2.2.2/24 and 2.2.2.3/24.
     - Go back to the procedure above and find any new routes
       associated with the addresses you just added.
         - Use 'ip route change' to set the MTU of these new routes to
           8192.
     - A default route is not needed.
 * Using the test interfaces, try 'ping -s 5000' or anything above 1500.
     - If this works, you have everything, including your network
       switches, set up correctly, and you're sending packets larger
       than 1500.
     - Yet your SSH sessions and all production traffic are still
       running at MTU 1500.
     - Since the route MTU for these spare logical IPs is 8192, you
       should be able to probe up to 'ping -s 8150' or higher, but I think
       you need to leave space for the IP and ICMP headers.

(A concrete command sketch of these steps follows below.)
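
A minimal sketch of those commands, assuming a test pair using interface eth0,
a hypothetical production subnet of 10.0.0.0/24, and the 2.2.2.x test addresses
(adjust everything to your environment):

  # On both nodes: pin the existing routes to 1500, then raise the link MTU
  ip route list
  ip route change 10.0.0.0/24 dev eth0 mtu 1500    # repeat for each production route shown
  ip link set dev eth0 mtu 9000

  # Add the throw-away test address (2.2.2.3/24 on the other node)
  ip addr add 2.2.2.2/24 dev eth0
  ip route change 2.2.2.0/24 dev eth0 mtu 8192     # only the new route carries the jumbo MTU

  # Probe with large payloads over the test addresses only
  ping -s 5000 2.2.2.3
  ping -s 8150 2.2.2.3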

The nice thing about this is that you haven't disrupted your production 
traffic at this point, and in the worst case you can undo all of these 
changes by rebooting the two test nodes.


If you want to move your production traffic to Jumbo Frames, change the
appropriate routes to MTU 8192 on all systems.  Then test, test, test.
Lastly, change your network configuration on any affected nodes so the
increased MTU will be reinstated after every reboot.


-Dave

Dave Hall
Binghamton University

On 5/24/2020 9:53 AM, Suresh Rama wrote:

Ping with 9000 MTU won't get response as I said and it should be 8972. Glad
it is working but you should know what happened to avoid this issue later.

On Sun, May 24, 2020, 3:04 AM Amudhan P  wrote:


No, ping with MTU size 9000 didn't work.

On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
wrote:


Does your ping work or not?


On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:


Yes, I have set setting on the switch side also.

On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
wrote:


Problem should be with network. When you change MTU it should be

changed

all over the network, any single hup on your network should speak and
accept 9000 MTU packets. you can check it on your hosts with "ifconfig"
command and there is also equivalent commands for other

network/security

devices.

If you have just one node which it not correctly configured for MTU

9000

it wouldn't work.

On Sat, May 23, 2020 at 2:30 PM si...@turka.nl  wrote:


Can the servers/nodes ping eachother using large packet sizes? I guess
not.

Sinan Polat


Op 23 mei 2020 om 14:21 heeft Amudhan P  het

volgende geschreven:

In OSD logs "heartbeat_check: no reply from OSD"


On Sat, May 23, 2020 at 5:44 PM Amudhan P 

wrote:

Hi,

I have set Network switch with MTU size 9000 and also in my netplan
configuration.

What else needs to be checked?



On Sat, May 23, 2020 at 3:39 PM Wido den Hollander 
wrote:




On 5/23/20 12:02 PM, Amudhan P wrote:
Hi,

I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU

size

1500

(default) recently i tried to update MTU size to 9000.
After setting Jumbo frame running ceph -s is timing out.

Ceph can run just fine with an MTU of 9000. But there is probably
something else wrong on the network which is causing this.

Check the Jumbo Frames settings on all the switches as well to

make

sure

they forward all the packets.

This is definitely not a Ceph issue.

Wido


regards
Amudhan P
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [External Email] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Dave Hall

All,

Regarding Martin's observations about Jumbo Frames

I have recently been gathering some notes from various internet sources 
regarding Linux network performance, and Linux performance in general, 
to be applied to a Ceph cluster I manage but also to the rest of the 
Linux server farm I'm responsible for.


In short, enabling Jumbo Frames without also tuning a number of other 
kernel and NIC attributes will not provide the performance increases 
we'd like to see.  I have not yet had a chance to go through the rest of 
the testing I'd like to do, but  I can confirm (via iperf3) that only 
enabling Jumbo Frames didn't make a significant difference.
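
For anyone repeating the iperf3 comparison, the basic loop is just the
following, run once per MTU setting (the hostname is a placeholder):

  iperf3 -s                        # on the receiving node
  iperf3 -c <other_node> -t 30     # on the sending node
  iperf3 -c <other_node> -t 30 -R  # same test in the reverse direction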


Some of the other attributes I'm referring to are incoming and outgoing 
buffer sizes at the NIC, IP, and TCP levels, interrupt coalescing, NIC 
offload functions that should or shouldn't be turned on, packet queuing 
disciplines (tc), the best choice of TCP slow-start algorithms, and 
other TCP features and attributes.
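
Purely as an illustration of the kinds of knobs involved (the interface name
and values are made up, not recommendations):

  ethtool -g eth0                              # view NIC ring buffer sizes
  ethtool -G eth0 rx 4096 tx 4096              # grow the ring buffers
  ethtool -c eth0                              # view interrupt coalescing settings
  ethtool -k eth0                              # view offload settings (GRO/GSO/TSO, etc.)
  sysctl net.core.rmem_max net.core.wmem_max   # kernel socket buffer ceilings
  sysctl net.ipv4.tcp_congestion_control       # TCP congestion control algorithm in use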


The most off-beat item I saw was something about adding IPTABLES rules 
to bypass CONNTRACK table lookups.


In order to do anything meaningful to assess the effect of all of these 
settings I'd like to figure out how to set them all via Ansible - so 
more to learn before I can give opinions.


-->  If anybody has added this type of configuration to Ceph Ansible, 
I'd be glad for some pointers.


I have started to compile a document containing my notes.  It's rough, 
but I'd be glad to share if anybody is interested.


-Dave

Dave Hall
Binghamton University
 
On 5/24/2020 12:29 PM, Martin Verges wrote:



Just save yourself the trouble. You won't have any real benefit from MTU
9000. It has some smallish, but it is not worth the effort, problems, and
loss of reliability for most environments.
Try it yourself and do some benchmarks, especially with your regular
workload on the cluster (not the maximum peak performance), then drop the
MTU to default ;).

Please if anyone has other real world benchmarks showing huge differences
in regular Ceph clusters, please feel free to post it here.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Sun, 24 May 2020 at 15:54, Suresh Rama  wrote:


Ping with 9000 MTU won't get response as I said and it should be 8972. Glad
it is working but you should know what happened to avoid this issue later.

On Sun, May 24, 2020, 3:04 AM Amudhan P  wrote:


No, ping with MTU size 9000 didn't work.

On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
wrote:


Does your ping work or not?


On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:


Yes, I have set setting on the switch side also.

On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
wrote:


Problem should be with network. When you change MTU it should be

changed

all over the network, any single hup on your network should speak and
accept 9000 MTU packets. you can check it on your hosts with

"ifconfig"

command and there is also equivalent commands for other

network/security

devices.

If you have just one node which it not correctly configured for MTU

9000

it wouldn't work.

On Sat, May 23, 2020 at 2:30 PM si...@turka.nl 

wrote:

Can the servers/nodes ping eachother using large packet sizes? I

guess

not.

Sinan Polat


Op 23 mei 2020 om 14:21 heeft Amudhan P  het

volgende geschreven:

In OSD logs "heartbeat_check: no reply from OSD"


On Sat, May 23, 2020 at 5:44 PM Amudhan P 

wrote:

Hi,

I have set Network switch with MTU size 9000 and also in my

netplan

configuration.

What else needs to be checked?



On Sat, May 23, 2020 at 3:39 PM Wido den Hollander <

w...@42on.com

wrote:




On 5/23/20 12:02 PM, Amudhan P wrote:
Hi,

I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU

size

1500

(default) recently i tried to update MTU size to 9000.
After setting Jumbo frame running ceph -s is timing out.

Ceph can run just fine with an MTU of 9000. But there is

probably

something else wrong on the network which is causing this.

Check the Jumbo Frames settings on all the switches as well to

make

sure

they forward all the packets.

This is definitely not a Ceph issue.

Wido


regards
Amudhan P
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-user

[ceph-users] Re: RGW resharding

2020-05-24 Thread lin yunfan
Can you store your data in different buckets?

linyunfan
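
Also, regarding the shard-count worry in the quoted message below: bucket index
usage can be checked, and a bucket resharded manually, with something like this
(the bucket name and shard count are placeholders):

  radosgw-admin bucket limit check                              # objects-per-shard status for each bucket
  radosgw-admin bucket stats --bucket=<bucket>                  # per-bucket object counts
  radosgw-admin reshard add --bucket=<bucket> --num-shards=64   # schedule a manual reshard
  radosgw-admin reshard process                                 # run the pending reshards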

Adrian Nicolae  wrote on Tue, 19 May 2020 at 3:32 PM:
>
> Hi,
>
> I have the following Ceph Mimic setup :
>
> - a bunch of old servers with 3-4 SATA drives each (74 OSDs in total)
>
> - index/leveldb is stored on each OSD (so no SSD drives, just SATA)
>
> - the current usage  is :
>
> GLOBAL:
>     SIZE     AVAIL    RAW USED  %RAW USED
>     542 TiB  105 TiB  437 TiB   80.67
> POOLS:
>     NAME                        ID  USED     %USED  MAX AVAIL  OBJECTS
>     .rgw.root                   1   1.1 KiB  0      26 TiB     4
>     default.rgw.control         2   0 B      0      26 TiB     8
>     default.rgw.meta            3   20 MiB   0      26 TiB     75357
>     default.rgw.log             4   0 B      0      26 TiB     4271
>     default.rgw.buckets.data    5   290 TiB  85.05  51 TiB     78067284
>     default.rgw.buckets.non-ec  6   0 B      0      26 TiB     0
>     default.rgw.buckets.index   7   0 B      0      26 TiB     603008
>
> - rgw_override_bucket_index_max_shards = 16.   Clients are accessing RGW
> via Swift, not S3.
>
> - the replication schema is EC 4+2.
>
> We are using this Ceph cluster as  a secondary storage for another
> storage infrastructure (which is more expensive) and we are offloading
> cold data (big files with a low number of downloads/reads from our
> customer). This way we can lower the TCO .  So most of the files are big
> ( a few GB at least).
>
> So far Ceph is doing well considering that I don't have big
> expectations from the current hardware.  I'm a bit worried, however, that we
> have 78 M objects with max_shards=16 and we will probably reach 100 M in
> the next few months. Do I need to increase the max shards to ensure the
> stability of the cluster?  I read that storing more than 1 M objects
> in a single bucket can lead to OSDs flapping or having I/O timeouts
> during deep-scrub, or even to OSD failures due to leveldb
> compacting all the time if we have a large number of DELETEs.
>
> Any advice would be appreciated.
>
>
> Thank you,
>
> Adrian Nicolae
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: remove secondary zone from multisite

2020-05-24 Thread Zhenshi Zhou
Has anyone dealt with this? Can I just remove the secondary zone from the
cluster?
I'm not sure whether this action has any effect on the master zone.
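
For context, the procedure I have in mind (run against the master zone; the
zone and zonegroup names are placeholders, and I'd appreciate corrections if
this is wrong) is roughly:

  radosgw-admin zonegroup remove --rgw-zonegroup=<zonegroup> --rgw-zone=<secondary-zone>
  radosgw-admin period update --commit
  radosgw-admin zone delete --rgw-zone=<secondary-zone>   # on the secondary side, once removed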

Thanks

Zhenshi Zhou  wrote on Fri, 22 May 2020 at 11:22 AM:

> Hi all,
>
> I'm going to take my secondary zone offline.
> How do I remove the secondary zone from a multisite setup?
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io