[ceph-users] Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Yoann Moulin
Hello,

On a Nautilus cluster, I'd like to move monitors from bare metal servers to VMs 
to prepare a migration.

I have added 3 new monitors on 3 VMs and I'd like to stop the 3 old monitor 
daemons. But as soon as I stop the 3rd old monitor, the cluster gets stuck
because the election of a new monitor fails.

The 3 old monitors are running 14.2.4-1xenial
The 3 new monitors are running 14.2.7-1bionic

> 2020-03-09 16:06:00.167 7fc4a3138700  1 mon.icvm0017@3(peon).paxos(paxos 
> active c 20918592..20919120) lease_timeout -- calling new election
> 2020-03-09 16:06:02.143 7fc49f931700  1 mon.icvm0017@3(probing) e4 
> handle_auth_request failed to assign global_id

Did I miss something?

In the attachment: some logs and ceph.conf

Thanks for your help.

Best,

-- 
Yoann Moulin
EPFL IC-IT

# Please do not change this file directly since it is managed by Ansible and 
will be overwritten
[global]
cluster network = 192.168.47.0/24
fsid = 778234df-5784-4021-b983-0ee1814891be
mon host = 
[v2:10.90.36.16:3300,v1:10.90.36.16:6789],[v2:10.90.36.17:3300,v1:10.90.36.17:6789],[v2:10.90.36.18:3300,v1:10.90.36.18:6789],[v2:10.95.32.45:3300,v1:10.95.32.45:6789],[v2:10.95.32.46:3300,v1:10.95.32.46:6789],[v2:10.95.32.48:3300,v1:10.95.32.48:6789]
mon initial members = 
icadmin006,icadmin007,icadmin008,icvm0017,icvm0018,icvm0022
osd pool default crush rule = -1
osd_crush_chooseleaf_type = 1
osd_op_queue_cut_off = high
osd_pool_default_pg_num = 8
osd_pool_default_pgp_num = 8
public network = 10.90.36.0/24,10.90.47.0/24,10.95.32.0/20

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Paul Emmerich
On Tue, Mar 10, 2020 at 8:18 AM Yoann Moulin  wrote:
> I have added 3 new monitors on 3 VMs and I'd like to stop the 3 old monitor 
> daemons. But as soon as I stop the 3rd old monitor, the cluster gets stuck
> because the election of a new monitor fails.

By "stop" you mean "stop and then immediately remove before stopping
the next one"? Otherwise that's the problem.

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

>
> The 3 old monitors are in 14.2.4-1xenial
> The 3 new monitors are in 14.2.7-1bionic
>
> > 2020-03-09 16:06:00.167 7fc4a3138700  1 mon.icvm0017@3(peon).paxos(paxos 
> > active c 20918592..20919120) lease_timeout -- calling new election
> > 2020-03-09 16:06:02.143 7fc49f931700  1 mon.icvm0017@3(probing) e4 
> > handle_auth_request failed to assign global_id
>
> Did I miss something?
>
> In attachment : some logs and ceph.conf
>
> Thanks for your help.
>
> Best,
>
> --
> Yoann Moulin
> EPFL IC-IT
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Håkan T Johansson


Note that with 6 monitors, quorum requires 4.

So if only 3 are running, the system cannot work.

With one old monitor removed there would be 5 in the monmap, and a quorum of 3 would suffice.
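
For reference, a rough sketch of the resulting one-at-a-time removal (monitor
names taken from the ceph.conf above; assumes systemd-managed mons and waiting
for a healthy quorum between steps):

systemctl stop ceph-mon@icadmin006      # on the old mon host itself
ceph mon remove icadmin006              # drop it from the monmap right away
ceph mon stat                           # confirm the remaining mons form a quorum
# repeat for icadmin007 and icadmin008, one at a time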

Best regards,
Håkan


On Tue, 10 Mar 2020, Paul Emmerich wrote:


On Tue, Mar 10, 2020 at 8:18 AM Yoann Moulin  wrote:

I have added 3 new monitors on 3 VMs and I'd like to stop the 3 old monitor 
daemons. But as soon as I stop the 3rd old monitor, the cluster gets stuck
because the election of a new monitor fails.


By "stop" you mean "stop and then immediately remove before stopping
the next one"? Otherwise that's the problem.

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90



The 3 old monitors are in 14.2.4-1xenial
The 3 new monitors are in 14.2.7-1bionic


2020-03-09 16:06:00.167 7fc4a3138700  1 mon.icvm0017@3(peon).paxos(paxos active 
c 20918592..20919120) lease_timeout -- calling new election
2020-03-09 16:06:02.143 7fc49f931700  1 mon.icvm0017@3(probing) e4 
handle_auth_request failed to assign global_id


Did I miss something?

In attachment : some logs and ceph.conf

Thanks for your help.

Best,

--
Yoann Moulin
EPFL IC-IT

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Yoann Moulin
Hello,
> Note that with 6 monitors, quorum requires 4.
>
> So if only 3 are running, the system cannot work.
> 
> With one old removed there would be 5 possible, then with quorum of 3.

Good point! I hadn't thought of that.
Looks like it works if I remove one, thanks a lot!

Best,

Yoann

> On Tue, 10 Mar 2020, Paul Emmerich wrote:
> 
>> On Tue, Mar 10, 2020 at 8:18 AM Yoann Moulin  wrote:
>>> I have added 3 new monitors on 3 VMs and I'd like to stop the 3 old 
>>> monitor daemons. But as soon as I stop the 3rd old monitor, the cluster gets stuck
>>> because the election of a new monitor fails.
>>
>> By "stop" you mean "stop and then immediately remove before stopping
>> the next one"? Otherwise that's the problem.
>>
>> -- 
>> Paul Emmerich
>>
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>>
>> croit GmbH
>> Freseniusstr. 31h
>> 81247 München
>> www.croit.io
>> Tel: +49 89 1896585 90
>>
>>>
>>> The 3 old monitors are in 14.2.4-1xenial
>>> The 3 new monitors are in 14.2.7-1bionic
>>>
 2020-03-09 16:06:00.167 7fc4a3138700  1 mon.icvm0017@3(peon).paxos(paxos 
 active c 20918592..20919120) lease_timeout -- calling new election
 2020-03-09 16:06:02.143 7fc49f931700  1 mon.icvm0017@3(probing) e4 
 handle_auth_request failed to assign global_id
>>>
>>> Did I miss something?
>>>
>>> In attachment : some logs and ceph.conf
>>>
>>> Thanks for your help.
>>>
>>> Best,
>>>

-- 
Yoann Moulin
EPFL IC-IT
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Radosgw dynamic sharding jewel -> luminous

2020-03-10 Thread Robert LeBlanc
I don't know if it is that specifically, but they are all running the
latest version of Luminous and I set the cluster to only allow Luminous
OSDs in. All services have been upgraded to Luminous. Do I need to run a
command to activate the cls_rgw API?
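
One way to double-check that from the CLI (a sketch; as far as I know nothing
needs to be "activated" beyond the OSDs actually running the upgraded code):

ceph versions                              # per-daemon version summary
ceph osd versions                          # OSD-only breakdown
ceph osd dump | grep require_osd_release   # should report 'luminous'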

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Mar 4, 2020 at 8:33 AM Casey Bodley  wrote:

> Okay, it's sharing the log_pool so you shouldn't need any special pool
> permissions there.
>
> The 'failed to list reshard log entries' error message is coming from a
> call to cls_rgw_reshard_list(), which is a new API in cls_rgw. Have all
> of your osds been upgraded to support that?
>
>
> On 3/4/20 10:22 AM, Robert LeBlanc wrote:
> > On Tue, Mar 3, 2020 at 10:31 AM Casey Bodley  > > wrote:
> >
> > The default value of this reshard pool is
> > "default.rgw.log:reshard". You
> > can check 'radosgw-admin zone get' for the list of pool
> > names/namespaces
> > in use. It may be that your log pool is named ".rgw.log" instead,
> > so you
> > could change your reshard_pool to ".rgw.log:reshard" to share that.
> >
> >
> > Okay, I can now see the pool and the namespace, but I'm still not
> > exactly sure what permissions I need to give to whom to make it work.
> > ```
> > $ radosgw-admin zone get --rgw-zone default
> > {
> >"id": "f6dcc86d-1f20-445f-9b5c-ad1ec5abf5cf",
> >"name": "default",
> >"domain_root": "default.rgw.data.root",
> >"control_pool": "default.rgw.control",
> >"gc_pool": "default.rgw.gc",
> >"lc_pool": "default.rgw.log:lc",
> >"log_pool": "default.rgw.log",
> >"intent_log_pool": "default.rgw.intent-log",
> >"usage_log_pool": "default.rgw.usage",
> >"reshard_pool": "default.rgw.log:reshard",
> >"user_keys_pool": "default.rgw.users.keys",
> >"user_email_pool": "default.rgw.users.email",
> >"user_swift_pool": "default.rgw.users.swift",
> >"user_uid_pool": "default.rgw.users.uid",
> >"system_key": {
> >"access_key": "",
> >"secret_key": ""
> >},
> >"placement_pools": [
> >{
> >"key": "default-placement",
> >"val": {
> >"index_pool": "default.rgw.buckets.index",
> >"data_pool": "default.rgw.buckets.data",
> >"data_extra_pool": "default.rgw.buckets.non-ec",
> >"index_type": 0,
> >"compression": ""
> >}
> >}
> >],
> >"metadata_heap": "",
> >"tier_config": [],
> >"realm_id": ""
> > }
> > ```
> > Also the command would not work without the `--rgw-zone default`
> > option, otherwise I'd get:
> > ```
> > $ radosgw-admin zone get
> > unable to initialize zone: (2) No such file or directory
> > ```
> >
> > Thank you,
> > Robert LeBlanc
> >
> > 
> > Robert LeBlanc
> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-10 Thread Hartwig Hauschild
Hi, 

I've done a bit more testing ...

Am 05.03.2020 schrieb Hartwig Hauschild:
> Hi, 
> 
> I'm (still) testing upgrading from Luminous to Nautilus and ran into the
> following situation:
> 
> The lab-setup I'm testing in has three OSD-Hosts. 
> If one of those hosts dies the store.db in /var/lib/ceph/mon/ on all my
> Mon-Nodes starts to rapidly grow in size until either the OSD-host comes
> back up or disks are full.
> 
This also happens when I take one single OSD offline - /var/lib/ceph/mon/
grows from around 100 MB to ~2 GB in about 5 minutes, at which point I aborted the test.
Since we had an OSD host fail over a weekend, I know the growth won't
stop until the disk is full, which usually happens in around 20 minutes,
by then taking up 17 GB of disk space.

> On another cluster that's still on Luminous I don't see any growth at all.
> 
Retested that cluster as well; observing the size on disk of
/var/lib/ceph/mon/ suggests that there are writes and deletes / compactions
going on, as it kept floating within +/- 5% of the original size.

> Is that a difference in behaviour between Luminous and Nautilus or is that
> caused by the lab-setup only having three hosts and one lost host causing
> all PGs to be degraded at the same time?
> 

I've read somewhere in the docs that I should provide ample space (tens of
GB) for the store.db, and found on the ML and bug tracker that ~100GB might not
be a bad idea and that large clusters may require space an order of
magnitude greater.
Is there some sort of formula I can use to approximate the space required?
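
In the meantime, a small sketch for keeping an eye on it (paths assume the
default mon data dir and a mon id equal to the short hostname):

du -sh /var/lib/ceph/mon/ceph-$(hostname -s)/store.db   # current store size
# Nautilus warns once the store exceeds mon_data_size_warn (~15 GiB by default);
# the threshold can be raised if the mon disks are deliberately large:
ceph config set mon mon_data_size_warn 32212254720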

Also: is the db supposed to grow this fast in Nautilus when it did not do
that in Luminous? Is that behaviour configurable somewhere?


-- 
Cheers,
Hardy
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-10 Thread Wido den Hollander



On 3/10/20 10:48 AM, Hartwig Hauschild wrote:
> Hi, 
> 
> I've done a bit more testing ...
> 
> Am 05.03.2020 schrieb Hartwig Hauschild:
>> Hi, 
>>
>> I'm (still) testing upgrading from Luminous to Nautilus and ran into the
>> following situation:
>>
>> The lab-setup I'm testing in has three OSD-Hosts. 
>> If one of those hosts dies the store.db in /var/lib/ceph/mon/ on all my
>> Mon-Nodes starts to rapidly grow in size until either the OSD-host comes
>> back up or disks are full.
>>
> This also happens when I take one single OSD offline - /var/lib/ceph/mon/
> grows from around 100MB to ~2GB in about 5 Minutes, then I aborted the test.
> Since we've had an OSD-Host fail over a weekend I know that growing won't
> stop until the disk is full and that usually happens in around 20 Minutes,
> then taking up 17GB of diskspace.
> 
>> On another cluster that's still on Luminous I don't see any growth at all.
>>
> Retested that cluster as well, observing the size on disk of
> /var/lib/ceph/mon/ suggests, that there's writes and deletes / compactions
> going on as it kept floating within +- 5% of the original size.
> 
>> Is that a difference in behaviour between Luminous and Nautilus or is that
>> caused by the lab-setup only having three hosts and one lost host causing
>> all PGs to be degraded at the same time?
>>
> 
> I've read somewhere in the docs that I should provide ample space (tens of
> GB) for the store.db, found on the ML and Bugtracker that ~100GB might not
> be a bad idea and that large clusters may require space on order of
> magnitude greater.
> Is there some sort of formula I can use to approximate the space required?

I don't know about a formula, but make sure you have enough space. MONs
are dedicated nodes in most production environments, so I usually
install a 400 ~ 1000GB SSD just to make sure they don't run out of space.

> 
> Also: is the db supposed to grow this fast in Nautilus when it did not do
> that in Luminous? Is that behaviour configurable somewhere?
> 

The MONs need to cache the OSDMaps when not all PGs are active+clean,
thus their database grows.

You can compact RocksDB in the meantime, but it won't last forever.

Just make sure the MONs have enough space.
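
For reference, a minimal sketch of triggering that compaction (the mon id is a
placeholder; here it is assumed to equal the short hostname):

ceph tell mon.$(hostname -s) compact            # compact one mon's store on demand
ceph config set mon mon_compact_on_start true   # or compact at every mon start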

Wido

> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] reset pgs not deep-scrubbed in time

2020-03-10 Thread Stefan Priebe - Profihost AG
Hello,

is there any way to reset the deep-scrub timestamps for PGs?

The cluster was accidentally left in the nodeep-scrub state and is now unable to
deep scrub fast enough.

Is there any way to force-mark all PGs as deep scrubbed, to start from 0
again?

Greets,
Stefan
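
As far as I know there is no command that simply marks PGs as deep-scrubbed; a
rough workaround sketch is to clear the flag and request fresh deep scrubs
(concurrency is still limited by osd_max_scrubs):

ceph osd unset nodeep-scrub
ceph pg dump pgs_brief 2>/dev/null | awk '$1 ~ /^[0-9]+\./ {print $1}' | \
  while read pg; do ceph pg deep-scrub "$pg"; done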
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephfs snap mkdir strange timestamp

2020-03-10 Thread Marc Roos
 

If I make a directory in Linux, the directory has the current date; why is 
this not the case when creating a snap dir? Is this not a bug? One expects this 
to behave the same as in Linux, no?

[ @ test]$ mkdir temp

[ @os0 test]$ ls -arltn
total 28
drwxrwxrwt. 27   0   0 20480 Mar 10 11:38 ..
drwxrwxr-x   2 801 801  4096 Mar 10 11:38 temp
drwxrwxr-x   3 801 801  4096 Mar 10 11:38 .


[ @ .snap]# mkdir test
[ @ .snap]# ls -lartn
total 0
drwxr-xr-x 861886554 0 0 8390344070786420358 Jan  1  1970 .
drwxr-xr-x 4 0 0   2 Mar  6 14:43 test
drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-9
drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-8
drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-7
drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-6
drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-5
drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-10
drwxr-xr-x 4 0 0   2 Mar  6 14:43 ..
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

2020-03-10 Thread Marc Roos


Nobody knows where this is coming from or had something similar?  

-Original Message-
To: ceph-users
Subject: [ceph-users] ceph: Can't lookup inode 1 (err: -13)


For testing purposes I swapped the 3.10 kernel for 5.5, and now I am 
getting these messages. I assume 3.10 was just never displaying 
these. Could this be a problem with the caps of my fs id user?

[Mon Mar  9 23:10:52 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar  9 23:12:03 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar  9 23:13:12 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar  9 23:14:19 2020] ceph: Can't lookup inode 1 (err: -13)


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

2020-03-10 Thread Paul Mezzanini
We see this constantly, and the last time I looked into it I came to the 
conclusion that it's because I'm mounting below the root in cephfs and the 
kernel is trying to get quota information for the mount point.  That means it's 
trying to go up one layer above the mount, and it can't.

Annoying and harmless.

It makes perfect sense that it showed up for you when it did because there have 
been a lot of improvements in the cephfs quota kernel code.
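
For illustration only (filesystem, client and path names are made up): caps
restricted to a sub-path give the client no access to inode 1, the filesystem
root, so the kernel's lookup above the mount point fails with -13 (EACCES):

ceph fs authorize cephfs client.below-root /somedir rw     # hypothetical restricted client
mount -t ceph mon1:6789:/somedir /mnt/test \
      -o name=below-root,secretfile=/etc/ceph/below-root.secret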


--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfm...@rit.edu

Sent from my phone. Please excuse any brevity or typoos.

CONFIDENTIALITY NOTE: The information transmitted, including attachments, is
intended only for the person(s) or entity to which it is addressed and may
contain confidential and/or privileged material. Any review, retransmission,
dissemination or other use of, or taking of any action in reliance upon this
information by persons or entities other than the intended recipient is
prohibited. If you received this in error, please contact the sender and
destroy any copies of this information.



From: Marc Roos 
Sent: Tuesday, March 10, 2020 6:42:53 AM
To: ceph-users ; Marc Roos 
Subject: [ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)


Nobody knows where this is coming from or had something similar?

-Original Message-
To: ceph-users
Subject: [ceph-users] ceph: Can't lookup inode 1 (err: -13)


For testing purposes I changed the kernel 3.10 for a 5.5, now I am
getting these messages. I assume the 3.10 was just never displaying
these. Could this be a problem with my caps of the fs id user?

[Mon Mar  9 23:10:52 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar  9 23:12:03 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar  9 23:13:12 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar  9 23:14:19 2020] ceph: Can't lookup inode 1 (err: -13)


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
Hello List,

when I initially enable journal/mirror on an image it gets
bootstrapped to my site-b pretty quickly with 250MB/sec, which is about
the IO write limit.

Once it's up to date, the replay is very slow, about 15KB/sec, and
entries_behind_master just keeps running away:

root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 --verbose
health: OK
images: 3 total
3 replaying

...

vm-112-disk-0:
  global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
  state:   up+replaying
  description: replaying, master_position=[object_number=623,
tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
tag_tid=3, entry_tid=18371], entries_behind_master=327196
  last_update: 2020-03-10 11:36:44

...

Write traffic on the source is about 20/25MB/sec.

On the source I run 14.2.6 and on the destination 12.2.13.

Any idea why the replaying is sooo slow?

Thanks,
Michael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs snap mkdir strange timestamp

2020-03-10 Thread Paul Emmerich
There's an xattr for this: ceph.snap.btime IIRC
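
For example (path and snapshot name are placeholders, and this assumes a client
new enough to expose the vxattr):

getfattr -n ceph.snap.btime /mnt/cephfs/somedir/.snap/snap-5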

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue, Mar 10, 2020 at 11:42 AM Marc Roos  wrote:
>
>
>
> If I make a directory in linux the directory has the date of now, why is
> this not with creating a snap dir? Is this not a bug? One expects this
> to be the same as in linux not
>
> [ @ test]$ mkdir temp
>
> [ @os0 test]$ ls -arltn
> total 28
> drwxrwxrwt. 27   0   0 20480 Mar 10 11:38 ..
> drwxrwxr-x   2 801 801  4096 Mar 10 11:38 temp
> drwxrwxr-x   3 801 801  4096 Mar 10 11:38 .
>
>
> [ @ .snap]# mkdir test
> [ @ .snap]# ls -lartn
> total 0
> drwxr-xr-x 861886554 0 0 8390344070786420358 Jan  1  1970 .
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 test
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-9
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-8
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-7
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-6
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-5
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-10
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 ..
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Nautilus cephfs usage

2020-03-10 Thread Yoann Moulin
Hello,

I have a Nautilus cluster with a CephFS volume. Grafana shows that the 
cephfs_data pool is almost full[1], but if I take a look at the pool
usage, it looks like I have plenty of space. Which metrics does Grafana use?

1. https://framapic.org/5r7J86s55x6k/jGSIsjEUPYMU.png

pool usage:

> artemis@icitsrv5:~$ ceph df detail
> RAW STORAGE:
>     CLASS SIZE    AVAIL   USED    RAW USED %RAW USED
>     hdd   662 TiB 296 TiB 366 TiB  366 TiB     55.32
>     TOTAL 662 TiB 296 TiB 366 TiB  366 TiB     55.32
>
> POOLS:
>     POOL                       ID STORED  OBJECTS USED    %USED MAX AVAIL QUOTA OBJECTS QUOTA BYTES DIRTY   USED COMPR UNDER COMPR
>     .rgw.root                   3 8.1 KiB      15 2.8 MiB     0    63 TiB N/A           N/A              15        0 B         0 B
>     default.rgw.control         4     0 B       8     0 B     0    63 TiB N/A           N/A               8        0 B         0 B
>     default.rgw.meta            5  26 KiB      85  16 MiB     0    63 TiB N/A           N/A              85        0 B         0 B
>     default.rgw.log             6     0 B     207     0 B     0    63 TiB N/A           N/A             207        0 B         0 B
>     cephfs_data                 7 113 TiB 139.34M 186 TiB 49.47   138 TiB N/A           N/A         139.34M        0 B         0 B
>     cephfs_metadata             8  54 GiB  10.21M  57 GiB  0.03    63 TiB N/A           N/A          10.21M        0 B         0 B
>     default.rgw.buckets.data    9 122 TiB  54.57M 173 TiB 47.70   138 TiB N/A           N/A          54.57M        0 B         0 B
>     default.rgw.buckets.index  10 2.6 GiB  19.97k 2.6 GiB     0    63 TiB N/A           N/A          19.97k        0 B         0 B
>     default.rgw.buckets.non-ec 11  67 MiB     186 102 MiB     0    63 TiB N/A           N/A             186        0 B         0 B
>     device_health_metrics      12 1.2 MiB     145 1.2 MiB     0    63 TiB N/A           N/A             145        0 B         0 B

Best,

-- 
Yoann Moulin
EPFL IC-IT
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 6:47 AM Ml Ml  wrote:
>
> Hello List,
>
> when i initially enable journal/mirror on an image it gets
> bootstrapped to my site-b pretty quickly with 250MB/sec which is about
> the IO Write limit.
>
> Once its up2date, the replay is very slow. About 15KB/sec and the
> entries_behind_maste is just running away:
>
> root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 --verbose
> health: OK
> images: 3 total
> 3 replaying
>
> ...
>
> vm-112-disk-0:
>   global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
>   state:   up+replaying
>   description: replaying, master_position=[object_number=623,
> tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
> tag_tid=3, entry_tid=18371], entries_behind_master=327196
>   last_update: 2020-03-10 11:36:44
>
> ...
>
> Write traffic on the source is about 20/25MB/sec.
>
> On the Source i run 14.2.6 and on the destination 12.2.13.
>
> Any idea why the replaying is sooo slow?

What is the latency between the two clusters?

I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
config setting (defaults to 32KiB) on your destination cluster. i.e.
try adding add "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
"[client]" section of your Ceph configuration file on the node where
"rbd-mirror" daemon is running, and restart it. It defaults to a very
small read size from the remote cluster in a primitive attempt to
reduce the potential memory usage of the rbd-mirror daemon, but it has
the side-effect of slowing down mirroring for links with higher
latencies.

>
> Thanks,
> Michael
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

2020-03-10 Thread Marc Roos
Ok thanks, for letting me know  

-Original Message-
To: ceph-users; 
Subject: Re: [ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

We see this constantly and the last time I looked for what it was I came 
to the conclusion that it's because I'm mounting below the root in 
cephfs and the kernel is trying to get quota information for the mount 
point.  That means it's trying to go above by one layer and it can't.  


Annoying and harmless. 


It makes perfect sense that it showed up for you when it did because 
there have been a lot of improvements in the cephfs quota kernel code.  



--

Paul Mezzanini

Sr Systems Administrator / Engineer, Research Computing

Information & Technology Services

Finance & Administration

Rochester Institute of Technology

o:(585) 475-3245 | pfm...@rit.edu


Sent from my phone. Please excuse any brevity or typoos. 


CONFIDENTIALITY NOTE: The information transmitted, including 
attachments, is

intended only for the person(s) or entity to which it is addressed and 
may

contain confidential and/or privileged material. Any review, 
retransmission,

dissemination or other use of, or taking of any action in reliance upon 
this

information by persons or entities other than the intended recipient is

prohibited. If you received this in error, please contact the sender and

destroy any copies of this information.





From: Marc Roos 
Sent: Tuesday, March 10, 2020 6:42:53 AM
To: ceph-users ; Marc Roos 

Subject: [ceph-users] Re: ceph: Can't lookup inode 1 (err: -13) 
 

Nobody knows where this is coming from or had something similar?  

-Original Message-
To: ceph-users
Subject: [ceph-users] ceph: Can't lookup inode 1 (err: -13)


For testing purposes I changed the kernel 3.10 for a 5.5, now I am 
getting these messages. I assume the 3.10 was just never displaying 
these. Could this be a problem with my caps of the fs id user?

[Mon Mar  9 23:10:52 2020] ceph: Can't lookup inode 1 (err: -13) [Mon 
Mar  9 23:12:03 2020] ceph: Can't lookup inode 1 (err: -13) [Mon Mar  9 
23:13:12 2020] ceph: Can't lookup inode 1 (err: -13) [Mon Mar  9 
23:14:19 2020] ceph: Can't lookup inode 1 (err: -13)


___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs snap mkdir strange timestamp

2020-03-10 Thread Marc Roos
 
Hmmm, but typing ls -lart is faster than having to look up in my manual 
how to get such a thing with an xattr. I honestly do not get the logic 
of applying the same date as the parent folder everywhere. Totally 
useless information stored. Might as well store nothing. 



-Original Message-
Sent: 10 March 2020 13:51
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] cephfs snap mkdir strange timestamp

There's an xattr for this: ceph.snap.btime IIRC

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue, Mar 10, 2020 at 11:42 AM Marc Roos  
wrote:
>
>
>
> If I make a directory in linux the directory has the date of now, why 
> is this not with creating a snap dir? Is this not a bug? One expects 
> this to be the same as in linux not
>
> [ @ test]$ mkdir temp
>
> [ @os0 test]$ ls -arltn
> total 28
> drwxrwxrwt. 27   0   0 20480 Mar 10 11:38 ..
> drwxrwxr-x   2 801 801  4096 Mar 10 11:38 temp
> drwxrwxr-x   3 801 801  4096 Mar 10 11:38 .
>
>
> [ @ .snap]# mkdir test
> [ @ .snap]# ls -lartn
> total 0
> drwxr-xr-x 861886554 0 0 8390344070786420358 Jan  1  1970 .
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 test
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-9
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-8
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-7
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-6
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-5
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 snap-10
> drwxr-xr-x 4 0 0   2 Mar  6 14:43 ..
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 10:36 AM Ml Ml  wrote:
>
> Hello Jason,
>
> thanks for that fast reply.
>
> This is now my /etc/ceph/ceph.conf
>
> [client]
> rbd_mirror_journal_max_fetch_bytes = 4194304
>
>
> I stopped and started my rbd-mirror manually with:
> rbd-mirror -d -c /etc/ceph/ceph.conf
>
> Still same result. Slow speed shown by iftop and entries_behind_master
> keeps increasing a lot if i produce 20MB/sec traffic on that
> replication image.
>
> The latency is like:
>  --- 10.10.50.1 ping statistics ---
> 100 packets transmitted, 100 received, 0% packet loss, time 20199ms
> rtt min/avg/max/mdev = 0.067/0.286/1.418/0.215 ms
>
> iperf from the source node to the destination node (where the
> rbd-mirror runs): 8.92 Gbits/sec
>
> Any other idea?

Do you know the average IO sizes against the primary image? Can you
create a similar image in the secondary cluster and run "fio" or "rbd
bench-write" against it using similar settings to verify that your
secondary cluster can handle the IO load? The initial image sync
portion will be issuing large, whole-object writes whereas the journal
replay will replay the writes exactly as written in the journal.
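
A sketch of such a test on the secondary (pool name taken from the thread,
image name made up; io-size/pattern are guesses and should be matched to the
real workload):

rbd --cluster backup create rbd-cluster6/replay-bench --size 20G
rbd --cluster backup bench --io-type write --io-size 16K --io-pattern rand \
    --io-threads 1 --io-total 1G rbd-cluster6/replay-bench
rbd --cluster backup rm rbd-cluster6/replay-bench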

> Thanks,
> Michael
>
>
>
> On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman  wrote:
> >
> > On Tue, Mar 10, 2020 at 6:47 AM Ml Ml  wrote:
> > >
> > > Hello List,
> > >
> > > when i initially enable journal/mirror on an image it gets
> > > bootstrapped to my site-b pretty quickly with 250MB/sec which is about
> > > the IO Write limit.
> > >
> > > Once its up2date, the replay is very slow. About 15KB/sec and the
> > > entries_behind_maste is just running away:
> > >
> > > root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 
> > > --verbose
> > > health: OK
> > > images: 3 total
> > > 3 replaying
> > >
> > > ...
> > >
> > > vm-112-disk-0:
> > >   global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
> > >   state:   up+replaying
> > >   description: replaying, master_position=[object_number=623,
> > > tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
> > > tag_tid=3, entry_tid=18371], entries_behind_master=327196
> > >   last_update: 2020-03-10 11:36:44
> > >
> > > ...
> > >
> > > Write traffic on the source is about 20/25MB/sec.
> > >
> > > On the Source i run 14.2.6 and on the destination 12.2.13.
> > >
> > > Any idea why the replaying is sooo slow?
> >
> > What is the latency between the two clusters?
> >
> > I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
> > config setting (defaults to 32KiB) on your destination cluster. i.e.
> > try adding add "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
> > "[client]" section of your Ceph configuration file on the node where
> > "rbd-mirror" daemon is running, and restart it. It defaults to a very
> > small read size from the remote cluster in a primitive attempt to
> > reduce the potential memory usage of the rbd-mirror daemon, but it has
> > the side-effect of slowing down mirroring for links with higher
> > latencies.
> >
> > >
> > > Thanks,
> > > Michael
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> >
> >
> > --
> > Jason
> >
>


-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
Hello Jason,

thanks for that fast reply.

This is now my /etc/ceph/ceph.conf

[client]
rbd_mirror_journal_max_fetch_bytes = 4194304


I stopped and started my rbd-mirror manually with:
rbd-mirror -d -c /etc/ceph/ceph.conf

Still same result. Slow speed shown by iftop and entries_behind_master
keeps increasing a lot if i produce 20MB/sec traffic on that
replication image.

The latency is like:
 --- 10.10.50.1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 20199ms
rtt min/avg/max/mdev = 0.067/0.286/1.418/0.215 ms

iperf from the source node to the destination node (where the
rbd-mirror runs): 8.92 Gbits/sec

Any other idea?

Thanks,
Michael



On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman  wrote:
>
> On Tue, Mar 10, 2020 at 6:47 AM Ml Ml  wrote:
> >
> > Hello List,
> >
> > when i initially enable journal/mirror on an image it gets
> > bootstrapped to my site-b pretty quickly with 250MB/sec which is about
> > the IO Write limit.
> >
> > Once its up2date, the replay is very slow. About 15KB/sec and the
> > entries_behind_maste is just running away:
> >
> > root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 
> > --verbose
> > health: OK
> > images: 3 total
> > 3 replaying
> >
> > ...
> >
> > vm-112-disk-0:
> >   global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
> >   state:   up+replaying
> >   description: replaying, master_position=[object_number=623,
> > tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
> > tag_tid=3, entry_tid=18371], entries_behind_master=327196
> >   last_update: 2020-03-10 11:36:44
> >
> > ...
> >
> > Write traffic on the source is about 20/25MB/sec.
> >
> > On the Source i run 14.2.6 and on the destination 12.2.13.
> >
> > Any idea why the replaying is sooo slow?
>
> What is the latency between the two clusters?
>
> I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
> config setting (defaults to 32KiB) on your destination cluster. i.e.
> try adding add "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
> "[client]" section of your Ceph configuration file on the node where
> "rbd-mirror" daemon is running, and restart it. It defaults to a very
> small read size from the remote cluster in a primitive attempt to
> reduce the potential memory usage of the rbd-mirror daemon, but it has
> the side-effect of slowing down mirroring for links with higher
> latencies.
>
> >
> > Thanks,
> > Michael
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
>
> --
> Jason
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
Hello Jason,

okay, good hint!

I did not realize that it will write the journal 1:1, but that makes
sense. I will benchmark it later.

However, my backup cluster is the place where the old spinning rust
will find its final destination.
Therefore it will never be as fast as the live cluster.

Looking at the modes, should I change from journal-based to
snapshot-based mirroring?

Thanks,
Michael

On Tue, Mar 10, 2020 at 3:43 PM Jason Dillaman  wrote:
>
> On Tue, Mar 10, 2020 at 10:36 AM Ml Ml  wrote:
> >
> > Hello Jason,
> >
> > thanks for that fast reply.
> >
> > This is now my /etc/ceph/ceph.conf
> >
> > [client]
> > rbd_mirror_journal_max_fetch_bytes = 4194304
> >
> >
> > I stopped and started my rbd-mirror manually with:
> > rbd-mirror -d -c /etc/ceph/ceph.conf
> >
> > Still same result. Slow speed shown by iftop and entries_behind_master
> > keeps increasing a lot if i produce 20MB/sec traffic on that
> > replication image.
> >
> > The latency is like:
> >  --- 10.10.50.1 ping statistics ---
> > 100 packets transmitted, 100 received, 0% packet loss, time 20199ms
> > rtt min/avg/max/mdev = 0.067/0.286/1.418/0.215 ms
> >
> > iperf from the source node to the destination node (where the
> > rbd-mirror runs): 8.92 Gbits/sec
> >
> > Any other idea?
>
> Do you know the average IO sizes against the primary image? Can you
> create a similar image in the secondary cluster and run "fio" or "rbd
> bench-write" against it using similar settings to verify that your
> secondary cluster can handle the IO load? The initial image sync
> portion will be issuing large, whole-object writes whereas the journal
> replay will replay the writes exactly as written in the journal.
>
> > Thanks,
> > Michael
> >
> >
> >
> > On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman  wrote:
> > >
> > > On Tue, Mar 10, 2020 at 6:47 AM Ml Ml  wrote:
> > > >
> > > > Hello List,
> > > >
> > > > when i initially enable journal/mirror on an image it gets
> > > > bootstrapped to my site-b pretty quickly with 250MB/sec which is about
> > > > the IO Write limit.
> > > >
> > > > Once its up2date, the replay is very slow. About 15KB/sec and the
> > > > entries_behind_maste is just running away:
> > > >
> > > > root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 
> > > > --verbose
> > > > health: OK
> > > > images: 3 total
> > > > 3 replaying
> > > >
> > > > ...
> > > >
> > > > vm-112-disk-0:
> > > >   global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
> > > >   state:   up+replaying
> > > >   description: replaying, master_position=[object_number=623,
> > > > tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
> > > > tag_tid=3, entry_tid=18371], entries_behind_master=327196
> > > >   last_update: 2020-03-10 11:36:44
> > > >
> > > > ...
> > > >
> > > > Write traffic on the source is about 20/25MB/sec.
> > > >
> > > > On the Source i run 14.2.6 and on the destination 12.2.13.
> > > >
> > > > Any idea why the replaying is sooo slow?
> > >
> > > What is the latency between the two clusters?
> > >
> > > I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
> > > config setting (defaults to 32KiB) on your destination cluster. i.e.
> > > try adding add "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
> > > "[client]" section of your Ceph configuration file on the node where
> > > "rbd-mirror" daemon is running, and restart it. It defaults to a very
> > > small read size from the remote cluster in a primitive attempt to
> > > reduce the potential memory usage of the rbd-mirror daemon, but it has
> > > the side-effect of slowing down mirroring for links with higher
> > > latencies.
> > >
> > > >
> > > > Thanks,
> > > > Michael
> > > > ___
> > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > > >
> > >
> > >
> > > --
> > > Jason
> > >
> >
>
>
> --
> Jason
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 11:53 AM Ml Ml  wrote:
>
> Hello Jason,
>
> okay, good hint!
>
> I did not realize, that it will write the journal 1:1 but that makes
> sense. I will benchmark it later.

Yes, it's replaying the exact IOs again to ensure it's point-in-time consistent.

> However, my backup cluster is the place where the old spinning rust
> will find its last dedication.
> Therefore it will never be as fast as the live cluster.
>
> Looking that the modes, i should change from Journal-based to
> Snapshot-based mirroring?

Well, snapshot-based mirroring hasn't been released yet (technically)
since it's new with Octopus. It might be better in such an
environment, however, since it has the potential to reduce the number
of IOs.

> Thanks,
> Michael
>
> On Tue, Mar 10, 2020 at 3:43 PM Jason Dillaman  wrote:
> >
> > On Tue, Mar 10, 2020 at 10:36 AM Ml Ml  wrote:
> > >
> > > Hello Jason,
> > >
> > > thanks for that fast reply.
> > >
> > > This is now my /etc/ceph/ceph.conf
> > >
> > > [client]
> > > rbd_mirror_journal_max_fetch_bytes = 4194304
> > >
> > >
> > > I stopped and started my rbd-mirror manually with:
> > > rbd-mirror -d -c /etc/ceph/ceph.conf
> > >
> > > Still same result. Slow speed shown by iftop and entries_behind_master
> > > keeps increasing a lot if i produce 20MB/sec traffic on that
> > > replication image.
> > >
> > > The latency is like:
> > >  --- 10.10.50.1 ping statistics ---
> > > 100 packets transmitted, 100 received, 0% packet loss, time 20199ms
> > > rtt min/avg/max/mdev = 0.067/0.286/1.418/0.215 ms
> > >
> > > iperf from the source node to the destination node (where the
> > > rbd-mirror runs): 8.92 Gbits/sec
> > >
> > > Any other idea?
> >
> > Do you know the average IO sizes against the primary image? Can you
> > create a similar image in the secondary cluster and run "fio" or "rbd
> > bench-write" against it using similar settings to verify that your
> > secondary cluster can handle the IO load? The initial image sync
> > portion will be issuing large, whole-object writes whereas the journal
> > replay will replay the writes exactly as written in the journal.
> >
> > > Thanks,
> > > Michael
> > >
> > >
> > >
> > > On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman  
> > > wrote:
> > > >
> > > > On Tue, Mar 10, 2020 at 6:47 AM Ml Ml  
> > > > wrote:
> > > > >
> > > > > Hello List,
> > > > >
> > > > > when i initially enable journal/mirror on an image it gets
> > > > > bootstrapped to my site-b pretty quickly with 250MB/sec which is about
> > > > > the IO Write limit.
> > > > >
> > > > > Once its up2date, the replay is very slow. About 15KB/sec and the
> > > > > entries_behind_maste is just running away:
> > > > >
> > > > > root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 
> > > > > --verbose
> > > > > health: OK
> > > > > images: 3 total
> > > > > 3 replaying
> > > > >
> > > > > ...
> > > > >
> > > > > vm-112-disk-0:
> > > > >   global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
> > > > >   state:   up+replaying
> > > > >   description: replaying, master_position=[object_number=623,
> > > > > tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
> > > > > tag_tid=3, entry_tid=18371], entries_behind_master=327196
> > > > >   last_update: 2020-03-10 11:36:44
> > > > >
> > > > > ...
> > > > >
> > > > > Write traffic on the source is about 20/25MB/sec.
> > > > >
> > > > > On the Source i run 14.2.6 and on the destination 12.2.13.
> > > > >
> > > > > Any idea why the replaying is sooo slow?
> > > >
> > > > What is the latency between the two clusters?
> > > >
> > > > I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
> > > > config setting (defaults to 32KiB) on your destination cluster. i.e.
> > > > try adding add "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
> > > > "[client]" section of your Ceph configuration file on the node where
> > > > "rbd-mirror" daemon is running, and restart it. It defaults to a very
> > > > small read size from the remote cluster in a primitive attempt to
> > > > reduce the potential memory usage of the rbd-mirror daemon, but it has
> > > > the side-effect of slowing down mirroring for links with higher
> > > > latencies.
> > > >
> > > > >
> > > > > Thanks,
> > > > > Michael
> > > > > ___
> > > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > > > >
> > > >
> > > >
> > > > --
> > > > Jason
> > > >
> > >
> >
> >
> > --
> > Jason
> >
>


-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Anthony D'Atri
FWIW when using rbd-mirror to migrate volumes between SATA SSD clusters, I 
found that 


   rbd_mirror_journal_max_fetch_bytes:
section: "client"
value: "33554432"

  rbd_journal_max_payload_bytes:
section: "client"
value: "8388608"

Made a world of difference in expediting journal replay on Luminous 12.2.2.  
With defaults, some active volumes would take hours to converge, and a couple 
were falling even more behind.

This was mirroring 1 to 2 volumes at a time.  YMMV.




> On Mar 10, 2020, at 7:36 AM, Ml Ml  wrote:
> 
> Hello Jason,
> 
> thanks for that fast reply.
> 
> This is now my /etc/ceph/ceph.conf
> 
> [client]
> rbd_mirror_journal_max_fetch_bytes = 4194304
> 
> 
> I stopped and started my rbd-mirror manually with:
> rbd-mirror -d -c /etc/ceph/ceph.conf
> 
> Still same result. Slow speed shown by iftop and entries_behind_master
> keeps increasing a lot if i produce 20MB/sec traffic on that
> replication image.
> 
> The latency is like:
> --- 10.10.50.1 ping statistics ---
> 100 packets transmitted, 100 received, 0% packet loss, time 20199ms
> rtt min/avg/max/mdev = 0.067/0.286/1.418/0.215 ms
> 
> iperf from the source node to the destination node (where the
> rbd-mirror runs): 8.92 Gbits/sec
> 
> Any other idea?
> 
> Thanks,
> Michael
> 
> 
> 
> On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman  wrote:
>> 
>> On Tue, Mar 10, 2020 at 6:47 AM Ml Ml  wrote:
>>> 
>>> Hello List,
>>> 
>>> when i initially enable journal/mirror on an image it gets
>>> bootstrapped to my site-b pretty quickly with 250MB/sec which is about
>>> the IO Write limit.
>>> 
>>> Once its up2date, the replay is very slow. About 15KB/sec and the
>>> entries_behind_maste is just running away:
>>> 
>>> root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 
>>> --verbose
>>> health: OK
>>> images: 3 total
>>>3 replaying
>>> 
>>> ...
>>> 
>>> vm-112-disk-0:
>>>  global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
>>>  state:   up+replaying
>>>  description: replaying, master_position=[object_number=623,
>>> tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
>>> tag_tid=3, entry_tid=18371], entries_behind_master=327196
>>>  last_update: 2020-03-10 11:36:44
>>> 
>>> ...
>>> 
>>> Write traffic on the source is about 20/25MB/sec.
>>> 
>>> On the Source i run 14.2.6 and on the destination 12.2.13.
>>> 
>>> Any idea why the replaying is sooo slow?
>> 
>> What is the latency between the two clusters?
>> 
>> I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
>> config setting (defaults to 32KiB) on your destination cluster. i.e.
>> try adding add "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
>> "[client]" section of your Ceph configuration file on the node where
>> "rbd-mirror" daemon is running, and restart it. It defaults to a very
>> small read size from the remote cluster in a primitive attempt to
>> reduce the potential memory usage of the rbd-mirror daemon, but it has
>> the side-effect of slowing down mirroring for links with higher
>> latencies.
>> 
>>> 
>>> Thanks,
>>> Michael
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>> 
>> 
>> 
>> --
>> Jason
>> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 2:31 PM Anthony D'Atri  wrote:
>
> FWIW when using rbd-mirror to migrate volumes between SATA SSD clusters, I 
> found that
>
>
>rbd_mirror_journal_max_fetch_bytes:
> section: "client"
> value: "33554432"
>
>   rbd_journal_max_payload_bytes:
> section: "client"
> value: “8388608"

Indeed, that's a good tweak that applies to the primary-side librbd
client for the mirrored image, for IO workloads that routinely issue
large (greater than the 16KiB default), sequential writes. This was another
compromise configuration setting to reduce the potential memory
footprint of the rbd-mirror daemon.
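
i.e., something along these lines on the primary cluster's clients (values
taken from this thread, not a general recommendation):

[client]
# primary-side librbd clients writing to the mirrored images
rbd_journal_max_payload_bytes = 8388608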

> Made a world of difference in expediting journal reply on Luminous 12.2.2.  
> With defaults, some active voumes would take hours to converge, and a couple 
> were falling even more behind.
>
> This was mirroring 1 to 2 volumes at a time.  YMMV.
>
>
>
>
> > On Mar 10, 2020, at 7:36 AM, Ml Ml  wrote:
> >
> > Hello Jason,
> >
> > thanks for that fast reply.
> >
> > This is now my /etc/ceph/ceph.conf
> >
> > [client]
> > rbd_mirror_journal_max_fetch_bytes = 4194304
> >
> >
> > I stopped and started my rbd-mirror manually with:
> > rbd-mirror -d -c /etc/ceph/ceph.conf
> >
> > Still same result. Slow speed shown by iftop and entries_behind_master
> > keeps increasing a lot if i produce 20MB/sec traffic on that
> > replication image.
> >
> > The latency is like:
> > --- 10.10.50.1 ping statistics ---
> > 100 packets transmitted, 100 received, 0% packet loss, time 20199ms
> > rtt min/avg/max/mdev = 0.067/0.286/1.418/0.215 ms
> >
> > iperf from the source node to the destination node (where the
> > rbd-mirror runs): 8.92 Gbits/sec
> >
> > Any other idea?
> >
> > Thanks,
> > Michael
> >
> >
> >
> > On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman  wrote:
> >>
> >> On Tue, Mar 10, 2020 at 6:47 AM Ml Ml  wrote:
> >>>
> >>> Hello List,
> >>>
> >>> when i initially enable journal/mirror on an image it gets
> >>> bootstrapped to my site-b pretty quickly with 250MB/sec which is about
> >>> the IO Write limit.
> >>>
> >>> Once its up2date, the replay is very slow. About 15KB/sec and the
> >>> entries_behind_maste is just running away:
> >>>
> >>> root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 
> >>> --verbose
> >>> health: OK
> >>> images: 3 total
> >>>3 replaying
> >>>
> >>> ...
> >>>
> >>> vm-112-disk-0:
> >>>  global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
> >>>  state:   up+replaying
> >>>  description: replaying, master_position=[object_number=623,
> >>> tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
> >>> tag_tid=3, entry_tid=18371], entries_behind_master=327196
> >>>  last_update: 2020-03-10 11:36:44
> >>>
> >>> ...
> >>>
> >>> Write traffic on the source is about 20/25MB/sec.
> >>>
> >>> On the Source i run 14.2.6 and on the destination 12.2.13.
> >>>
> >>> Any idea why the replaying is sooo slow?
> >>
> >> What is the latency between the two clusters?
> >>
> >> I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
> >> config setting (defaults to 32KiB) on your destination cluster. i.e.
> >> try adding add "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
> >> "[client]" section of your Ceph configuration file on the node where
> >> "rbd-mirror" daemon is running, and restart it. It defaults to a very
> >> small read size from the remote cluster in a primitive attempt to
> >> reduce the potential memory usage of the rbd-mirror daemon, but it has
> >> the side-effect of slowing down mirroring for links with higher
> >> latencies.
> >>
> >>>
> >>> Thanks,
> >>> Michael
> >>> ___
> >>> ceph-users mailing list -- ceph-users@ceph.io
> >>> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>>
> >>
> >>
> >> --
> >> Jason
> >>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Possible bug with rbd export/import?

2020-03-10 Thread Matt Dunavant
Hello,

I think I've been running into an rbd export/import bug and wanted to see if 
anybody else had any experience.

We're using rbd images for VM drives both with and without custom stripe sizes. 
When we try to export/import the drive to another ceph cluster, the VM always 
comes up in a busted state that it can't recover from. This happens both when doing 
this export/import through stdin/stdout and when using a middle machine as a 
temp space. I remember doing this a few times in previous versions without 
error, so I'm not sure if this is a regression or I'm doing something 
different. I'm still testing this to try and track down where the issue is but 
wanted to post this here to see if anybody else has any experience. 

Example command: rbd -c /etc/ceph/cluster1.conf export pool/testvm.boot - | rbd 
-c /etc/ceph/cluster2.conf import - pool/testvm.boot

Current cluster is on 14.2.8 and using Ubuntu 18.04 w/ 5.3.0-40-generic. 

Let me know if I can provide any more details to help track this down.

Thanks,
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Possible bug with rbd export/import?

2020-03-10 Thread Jack
Hi,

Are you exporting the rbd image while a VM is running on it?
As far as I know, rbd export of an in-use image is not consistent.

You should not export the image directly, but only snapshots (see the sketch below):
- create a snapshot of the image
- export the snapshot (rbd export pool/image@snap - | ..)
- drop the snapshot
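
A sketch of that sequence, reusing the names from the original command:

rbd -c /etc/ceph/cluster1.conf snap create pool/testvm.boot@migrate
rbd -c /etc/ceph/cluster1.conf export pool/testvm.boot@migrate - | \
    rbd -c /etc/ceph/cluster2.conf import - pool/testvm.boot
rbd -c /etc/ceph/cluster1.conf snap rm pool/testvm.boot@migrate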

Regards,


On 3/10/20 8:31 PM, Matt Dunavant wrote:
> Hello,
> 
> I think I've been running into an rbd export/import bug and wanted to see if 
> anybody else had any experience.
> 
> We're using rbd images for VM drives both with and without custom stripe 
> sizes. When we try to export/import the drive to another ceph cluster, the VM 
> always comes up in a busted state it can't recover from. This happens both 
> when doing this export/import through stdin/stdout and when using a middle 
> machine as a temp space. I remember doing this a few times in previous 
> versions without error, so I'm not sure if this is a regression or I'm doing 
> something different. I'm still testing this to try and track down where the 
> issue is but wanted to post this here to see if anybody else has any 
> experience. 
> 
> Example command: rbd -c /etc/ceph/cluster1.conf export pool/testvm.boot - | 
> rbd -c /etc/ceph/cluster2.conf import - pool/testvm.boot
> 
> Current cluster is on 14.2.8 and using Ubuntu 18.04 w/ 5.3.0-40-generic. 
> 
> Let me know if I can provide any more details to help track this down.
> 
> Thanks,
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Possible bug with rbd export/import?

2020-03-10 Thread Simon Ironside

On 10/03/2020 19:31, Matt Dunavant wrote:


We're using rbd images for VM drives both with and without custom stripe sizes. 
When we try to export/import the drive to another ceph cluster, the VM always 
comes up in a busted state it can't recover from.


Don't shoot me for asking but is the VM being exported still started up 
and in use? Asking since you don't mention using snapshots.


Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Rados example: create namespace, user for this namespace, read and write objects with created namespace and user

2020-03-10 Thread Rodrigo Severo - Fábrica
Hi,


I'm trying to create a namespace in rados, create a user that has
access to that namespace, and then read and write objects in that
namespace with the rados command-line utility, using the created
user.

I can't find an example on how to do it.

Can someone point me to such example or show me how to do it?


Regards,

Rodrigo Severo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] FW: Warning: could not send message for past 4 hours

2020-03-10 Thread Marc Roos
 

-Original Message-
From: Mail Delivery Subsystem [mailto:MAILER-DAEMON] 
Sent: 10 March 2020 19:01
Subject: Warning: could not send message for past 4 hours

**
**  THIS IS A WARNING MESSAGE ONLY  **
**  YOU DO NOT NEED TO RESEND YOUR MESSAGE  **
**

The original message was received at Tue, 10 Mar 2020 15:00:07 +0100 
from localhost.localdomain [127.0.0.1]

   - Transcript of session follows -
451 croit.io: Name server timeout
Warning: message still undelivered after 4 hours Will keep trying until 
message is 5 days old

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Anthony D'Atri

>> 
>> FWIW when using rbd-mirror to migrate volumes between SATA SSD clusters, I 
>> found that
>> 
>> 
>>   rbd_mirror_journal_max_fetch_bytes:
>>section: "client"
>>value: "33554432"
>> 
>>  rbd_journal_max_payload_bytes:
>>section: "client"
>>value: "8388608"
> 
> Indeed, that's a good tweak that applies to the primary-side librbd
> client for the mirrored image for IO workloads that routinely issue
> large (> than the 16KiB default), sequential writes. This was another
> compromise configuration setting to reduce the potential memory
> footprint of the rbd-mirror daemon.

Direct advice from you last year ;)

Extrapolating for those who haven’t done much with rbd-mirror, or who find this 
thread in the future, these settings worked well for me migrating at most 2 
active volumes at once, ones where I had no insight into client activity.  YMMV.

Setting these specific values when mirroring an entire pool could well be 
doubleplusungood.

— aad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Rados example: create namespace, user for this namespace, read and write objects with created namespace and user

2020-03-10 Thread JC Lopez
Hi

no need to create a namespace. You just specify the namespace you want to 
access.

See https://docs.ceph.com/docs/nautilus/man/8/rados/ the -N cli option

For access to a particular namespace have a look at the example here: 
https://docs.ceph.com/docs/nautilus/rados/operations/user-management/#modify-user-capabilities
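
For example (pool, namespace and user names are made up):

# create a user limited to one namespace of one pool
ceph auth get-or-create client.ns-user mon 'allow r' \
    osd 'allow rw pool=mypool namespace=myns' \
    -o /etc/ceph/ceph.client.ns-user.keyring

# write, read and list objects in that namespace with the rados CLI
rados -p mypool -N myns --id ns-user put greeting /etc/hostname
rados -p mypool -N myns --id ns-user get greeting /tmp/greeting.out
rados -p mypool -N myns --id ns-user ls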

Regards
JC


> On Mar 10, 2020, at 13:10, Rodrigo Severo - Fábrica 
>  wrote:
> 
> Hi,
> 
> 
> I'm trying to create a namespace in rados, create a user that has
> access to this created namespace and with rados command line utility
> read and write objects in this created namespace using the created
> user.
> 
> I can't find an example on how to do it.
> 
> Can someone point me to such example or show me how to do it?
> 
> 
> Regards,
> 
> Rodrigo Severo
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Bucket notification with kafka error

2020-03-10 Thread 曹 海旺
Hi,
I'm sorry to bother you again.

I want to use Kafka to queue the notifications. I added a topic named kafka and
put the notification config XML.

The topic info:

https://sns.amazonaws.com/doc/2010-03-31/";>



sr
kafka


kafka://192.168.3.250:9092

kafka-ack-level=broker&push-endpoint=kafka://192.168.3.250:9092
kafka

arn:aws:sns:default::kafka


sr
kafka_kafka


kafka://192.168.3.250:9092

kafka-ack-level=broker&push-endpoint=kafka://192.168.3.250:9092
kafka

arn:aws:sns:default::kafka


sr
webno


http://192.168.1.114:8080/s3/sn

push-endpoint=http://192.168.1.114:8080/s3/sn
webno

arn:aws:sns:default::webno




c4b84c5b-1e88-4f16-9863-7f68872d91a4.744394.135






and the put notification body :
http://s3.amazonaws.com/doc/2006-03-01/";> 
 kafka arn:aws:sns:default::kafka 
 

The web notification works fine, but when I use Kafka (version 1.0, JDK 1.8)
I get the following debug info:

2020-03-11 12:46:38.612 7fd81eeb1700 20 get_system_obj_state: s->obj_tag was set 
empty
2020-03-11 12:46:38.612 7fd81eeb1700 10 cache get: 
name=default.rgw.log++pubsub.user.sr.bucket.osstest/c4b84c5b-1e88-4f16-9863-7f68872d91a4.14175.1
 : hit (requested=0x1, cached=0x17)
2020-03-11 12:46:38.612 7fd81eeb1700 20 notification: 'kafka' on topic: 'kafka' 
and bucket: 'osstest' (unique topic: 'kafka_kafka') apply to event of type: 
's3:ObjectCreated:Put'
2020-03-11 12:46:38.612 7fd81eeb1700  1 ERROR: failed to create push endpoint: 
kafka://192.168.3.250:9092 due to: pubsub endpoint configuration error: unknown 
schema in: kafka://192.168.3.250:9092
2020-03-11 12:46:38.612 7fd81eeb1700  5 req 126 0.186s s3:put_obj WARNING: 
publishing notification failed, with error: -22
2020-03-11 12:46:38.612 7fd81eeb1700  2


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io