[ceph-users] Re: Data loss on appends, prod outage

2021-09-10 Thread 胡 玮文
Thanks for sharing this. Following this thread, I realize we are also affected 
by this bug. We have multiple reports of corrupted TensorBoard event files, 
which I think are caused by this bug.

We are using Ubuntu 20.04; the affected kernel versions should be the HWE 
kernels > 5.11 and < 5.11.0-34. The fix for the Ubuntu kernel is here: 
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal/commit/fs/ceph/addr.c?h=hwe-5.11&id=353cafd20b8c28423aeec0c474dab80dbcec3c44

Now we are working on upgrading every client to 5.11.0-34-generic.
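
In case it is useful to others, what we run on the clients looks roughly like 
this (assuming Ubuntu 20.04 with the HWE kernel stack; package names may differ 
on your setup):

uname -r                                                      # kernel the client is currently running
sudo apt update && sudo apt install linux-generic-hwe-20.04   # pull in the fixed HWE kernel
sudo reboot                                                   # reboot into it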

Weiwen Hu

From: Nathan Fish
Sent: 9 September 2021 2:41
To: ceph-users
Subject: [ceph-users] Re: Data loss on appends, prod outage

The bug appears to have already been reported:
https://tracker.ceph.com/issues/51948

Also, it should be noted that the write append bug does sometimes
occur when writing from a single client, so controlling write patterns
is not sufficient to stop data loss.

On Wed, Sep 8, 2021 at 1:39 PM Frank Schilder  wrote:
>
> Can you make the devs aware of the regression?
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Nathan Fish 
> Sent: 08 September 2021 19:33
> To: ceph-users
> Subject: [ceph-users] Re: Data loss on appends, prod outage
>
> Rolling back to kernel 5.4 has resolved the issue.
>
> On Tue, Sep 7, 2021 at 3:51 PM Frank Schilder  wrote:
> >
> > Hi Nathan,
> >
> > > Is this the bug you are referring to? https://tracker.ceph.com/issues/37713
> >
> > yes, it's one of them. I believe there were more such reports.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



[ceph-users] mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread mk
Hi CephFolks,

I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
(amon3) went down and was restarted, it is stuck on probing and out of quorum. 
We have changed nothing and it was working before; regardless, we have checked 
that the TCP ports and MTUs of the mons are open and reachable.
I would appreciate any help you can provide.

ceph health detail
HEALTH_WARN 1/3 mons down, quorum amon1,amon2
MON_DOWN 1/3 mons down, quorum amon1,amon2
mon.amon3 (rank 0) addr [v2:172.16.12.83:3300/0,v1:172.16.12.83:6789/0] is 
down (out of quorum)

All mon daemons are up and running, the monmaptool printout is identical on all 
mon nodes, and the logs of the other, healthy mon nodes contain entries from amon3:

Log from healthy amon1 log:

2021-09-10 10:49:16.657 7fc8285f3700  0 log_channel(audit) log [DBG] : 
from='client.? 172.16.12.83:0/245515484' entity='client.admin' cmd=[{"prefix": 
"status"}]: dispatch
2021-09-10 11:04:16.586 7fc8285f3700  0 log_channel(audit) log [DBG] : 
from='client.? 172.16.12.83:0/2472330429' entity='client.admin' cmd=[{"prefix": 
"status"}]: dispatch
2021-09-10 11:06:06.245 7fc8285f3700  0 log_channel(audit) log [DBG] : 
from='client.? 172.16.12.83:0/161156' entity='client.admin' cmd=[{"prefix": 
"mon getmap"}]: dispatch
2021-09-10 11:19:16.581 7fc8285f3700  0 log_channel(audit) log [DBG] : 
from='client.? 172.16.12.83:0/964654151' entity='client.admin' cmd=[{"prefix": 
"status"}]: dispatch

#

monmaptool --print /tmp/monmap
monmaptool: monmap file /tmp/monmap
epoch 14
fsid 10a40588-d89e-4fac-be88-afcedc1a7372
last_changed 2021-06-22 13:35:56.211202
created 2018-07-10 12:37:03.124657
min_mon_release 14 (nautilus)
0: [v2:172.16.12.83:3300/0,v1:172.16.12.83:6789/0] mon.amon3
1: [v2:172.16.12.81:3300/0,v1:172.16.12.81:6789/0] mon.amon1
2: [v2:172.16.12.82:3300/0,v1:172.16.12.82:6789/0] mon.amon2

##

ceph config get mon.*
WHO  MASK  LEVEL     OPTION                                           VALUE    RO
mon        advanced  mon_crush_min_required_version                   firefly  *
mon        advanced  mon_warn_on_insecure_global_id_reclaim_allowed   false

##

ceph --admin-daemon /var/run/ceph/ceph-mon.amon3.asok mon_status
{
"name": "amon3",
"rank": 0,
"state": "probing",
"election_epoch": 0,
"quorum": [],
"features": {
"required_con": "2449958747315912708",
"required_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus"
],
"quorum_con": "0",
"quorum_mon": []
},
"outside_quorum": [
"amon3"
],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": {
"epoch": 14,
"fsid": "10a40588-d89e-4fac-be88-afcedc1a7372",
"modified": "2021-06-22 13:35:56.211202",
"created": "2018-07-10 12:37:03.124657",
"min_mon_release": 14,
"min_mon_release_name": "nautilus",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "amon3",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "172.16.12.83:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "172.16.12.83:6789",
"nonce": 0
}
]
},
"addr": "172.16.12.83:6789/0",
"public_addr": "172.16.12.83:6789/0"
},
{
"rank": 1,
"name": "amon1",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "172.16.12.81:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "172.16.12.81:6789",
"nonce": 0
}
]
},
"addr": "172.16.12.81:6789/0",
"public_addr": "172.16.12.81:6789/0"
},
{
"rank": 2,
"name": "amon2",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "172.16.12.82:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "172.16.12.82:6789",
 

[ceph-users] Re: Drop of performance after Nautilus to Pacific upgrade

2021-09-10 Thread Luis Domingues
Thanks for your observation.

Indeed, I do not get a drop in performance when upgrading from Nautilus to 
Octopus. But even using Pacific 16.1.0, the performance just goes down, so I 
guess we are running into the same issue somehow.

I do not think just staying on Octopus is a solution, as it will reach EOL 
eventually. The source of this performance drop is still a mystery to me.
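
For reference, the kind of rados bench run used for these comparisons looks 
roughly like this (pool name, block size and runtime here are placeholders, not 
the exact values from my tests):

rados bench -p testpool 120 write -b 131072 -t 16 --no-cleanup   # 128k-object write test, keep objects
rados -p testpool cleanup                                        # remove the benchmark objects afterwards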

Luis Domingues

‐‐‐ Original Message ‐‐‐

On Tuesday, September 7th, 2021 at 10:51 AM, Martin Mlynář  
wrote:

> Hello,
>
> we've noticed a similar issue after upgrading our test 3-node cluster from
> 15.2.14-1~bpo10+1 to 16.1.0-1~bpo10+1.
>
> quick tests using rados bench:
>
> 16.2.5-1~bpo10+1:
>
> Total time run:         133.28
> Total writes made:      576
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     17.2869
> Stddev Bandwidth:       34.1485
> Max bandwidth (MB/sec): 204
> Min bandwidth (MB/sec): 0
> Average IOPS:           4
> Stddev IOPS:            8.55426
> Max IOPS:               51
> Min IOPS:               0
> Average Latency(s):     3.59873
> Stddev Latency(s):      5.99964
> Max latency(s):         30.6307
> Min latency(s):         0.0865062
>
> after downgrading OSDs:
>
> 15.2.14-1~bpo10+1:
>
> Total time run:         120.135
> Total writes made:      16324
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     543.524
> Stddev Bandwidth:       21.7548
> Max bandwidth (MB/sec): 580
> Min bandwidth (MB/sec): 436
> Average IOPS:           135
> Stddev IOPS:            5.43871
> Max IOPS:               145
> Min IOPS:               109
> Average Latency(s):     0.117646
> Stddev Latency(s):      0.0391269
> Max latency(s):         0.544229
> Min latency(s):         0.0602735
>
> We currently run on this setup:
>
> {
>     "mon": {
>         "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2
>     },
>     "mgr": {
>         "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 3
>     },
>     "osd": {
>         "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 35
>     },
>     "mds": {},
>     "overall": {
>         "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 38,
>         "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2
>     }
> }
>
> which solved performance issues. All OSDs were newly created and fully
> synced from other nodes when upgrading and downgrading back to 15.2.
>
> Best Regards,
>
> Martin
>
> On 05. 09. 2021 at 19:45, Luis Domingues wrote:
>
> > Hello,
> >
> > I run a test cluster of 3 machines with 24 HDDs each, running bare-metal on 
> > CentOS 8. Long story short, I can have a bandwidth of ~ 1'200 MB/s when I 
> > do a rados bench, writing objects of 128k, when the cluster is installed 
> > with Nautilus.
> >
> > When I upgrade the cluster to Pacific, (using ceph-ansible to deploy and/or 
> > upgrade), my performances drop to ~400 MB/s of bandwidth doing the same 
> > rados bench.
> >
> > I am kind of clueless on what makes the performance drop so much. Does 
> > someone have some ideas where I can dig to find the root of this difference?
> >
> > Thanks,
> >
> > Luis Domingues
>
> Martin Mlynář
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
I don't have an explanation but removing the mon store from the failed  
mon has resolved similar issues in the past. Could you give that a try?



Quoting mk:

Hi CephFolks,

I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
(amon3) went down and was restarted, it is stuck on probing and out of quorum. 
[...]

[ceph-users] The best way of backup S3 buckets

2021-09-10 Thread huxia...@horebdata.cn
Dear Ceph folks,

This is closely related to my previous questions on how to do RadosGW remote 
replication safely and reliably.

My major task is to back up S3 buckets. One obvious method is to use Ceph 
RadosGW multisite replication. I am wondering whether this is the best way to 
do S3 storage backup, or whether there are any better methods or alternatives. 
I am dealing with ca. 5-8 TB of new data per day.

thanks a lot in advance,

Samuel



huxia...@horebdata.cn
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: The best way of backup S3 buckets

2021-09-10 Thread Janne Johansson
On Fri, 10 Sep 2021 at 12:56, huxia...@horebdata.cn wrote:
> Dear Ceph folks,
> This is closely related to my previous questions on how to do safely and 
> reliabely RadosGW remote replication.
> My major task is to backup S3 buckets. One obvious method is to use Ceph 
> RadosGW multisite replication. I am wondering whether this is the best way to 
> do S3 storage backup, or are there any better methods or alternatives? I am 
> dealing with ca. 5-8TB amount of new data per day

"best" is hard to answer when one doesn't know which of the many
dimensions you are considering.

There might be "fastest", "safest", "puts least load on secondary
site", "transfers least amount of data per sync", "gives guarantees to
always be consistent", "easiest to restart in the middle of a broken
transfer", "is possible for non ceph-admin remote users to initiate"
and probably ten more I can't imagine right now.

Given that different data have different needs (an old index and new
content might be OK, since all objects in the index are reachable,
whereas a newer index than data might be a disaster if you read the
index before the data is actually in place), you will have to weigh
several options against each other. Would you sync a few larger chunks
(nights, weekends), or start a resync as soon as a single object appears
on the source?

You also need to know how to handle deletes, and how to find out when site A
but not site B has data, or when only site B but not site A does.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread mk
Thx Eugen,
just stopping mon and remove/rename only store.db and start mon? 
BR
Max
> On 10. Sep 2021, at 12:50, Eugen Block  wrote:
> 
> I don't have an explanation but removing the mon store from the failed mon 
> has resolved similar issues in the past. Could you give that a try?
> 
> 
> Zitat von mk :
> 
>> Hi CephFolks,
>> 
>> I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
>> (amon3) went down and was restarted, it is stuck on probing and out of quorum. 
>> [...]

[ceph-users] Re: The best way of backup S3 buckets

2021-09-10 Thread mhnx
If you need instant backup and lifecycle rules, then multisite is the best
choice.

If you need daily backups and do not have a second Ceph cluster, then
rclone will be your best mate.
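
A minimal sketch of what I mean (remote and bucket names are only examples;
define the remotes with "rclone config" first):

# one-way copy of a bucket from the production RGW to the backup S3 endpoint
rclone sync prod-s3:mybucket backup-s3:mybucket \
  --transfers 32 --checkers 64 --fast-list --log-file /var/log/rclone-backup.log

Note that "rclone sync" makes the destination match the source, including
deletes; use "rclone copy" if you never want deletions on the backup side.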

On Fri, 10 Sep 2021 at 13:56, huxia...@horebdata.cn wrote:

> Dear Ceph folks,
>
> This is closely related to my previous questions on how to do safely and
> reliabely RadosGW remote replication.
>
> My major task is to backup S3 buckets. One obvious method is to use Ceph
> RadosGW multisite replication. I am wondering whether this is the best way
> to do S3 storage backup, or are there any better methods or alternatives? I
> am dealing with ca. 5-8TB amount of new data per day
>
> thanks a lot in advance,
>
> Samuel
>
>
>
> huxia...@horebdata.cn
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
Yes, give it a try. If the cluster is otherwise healthy it shouldn't  
be a problem.



Quoting mk:


Thx Eugen,
just stopping mon and remove/rename only store.db and start mon?
BR
Max

On 10. Sep 2021, at 12:50, Eugen Block  wrote:

I don't have an explanation but removing the mon store from the  
failed mon has resolved similar issues in the past. Could you give  
that a try?



Quoting mk:

Hi CephFolks,

I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
(amon3) went down and was restarted, it is stuck on probing and out of quorum. 
[...]

[ceph-users] Re: The best way of backup S3 buckets

2021-09-10 Thread huxia...@horebdata.cn
Thanks a lot for the quick response.

Will rclone be able to handle PB-scale data backup? Does anyone have experience 
using rclone to back up a massive S3 object store, and what lessons were learned?

best regards,

Samuel

 



huxia...@horebdata.cn
 
From: mhnx
Date: 2021-09-10 13:07
To: huxiaoyu
CC: ceph-users
Subject: Re: [ceph-users] The best way of backup S3 buckets
If you need instant backup and lifecycle rules then Multisite is the best 
choice. 

If you need daily backup and do not have different ceph cluster, then rclone 
will be your best mate.

On Fri, 10 Sep 2021 at 13:56, huxia...@horebdata.cn wrote:
Dear Ceph folks,

This is closely related to my previous questions on how to do safely and 
reliabely RadosGW remote replication. 

My major task is to backup S3 buckets. One obvious method is to use Ceph 
RadosGW multisite replication. I am wondering whether this is the best way to 
do S3 storage backup, or are there any better methods or alternatives? I am 
dealing with ca. 5-8TB amount of new data per day

thanks a lot in advance,

Samuel



huxia...@horebdata.cn
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] SSDs/HDDs in ceph Octopus

2021-09-10 Thread Luke Hall

Hi,

We have six osd machines, each containing 6x4TB HDDs plus one nvme for 
rocksdb. I need to plan upgrading these machines to all or partial SSDs.


The question I have is:

I know that Ceph recognises SSDs as distinct from HDDs by their 
physical device IDs etc. In a setup with 50/50 HDDs/SSDs, does Ceph do 
anything natively to distinguish between the two speeds of storage? I.e. 
do you need to create separate pools for fast/slow storage, or is there 
something 'under the hood' which will manage which requests are sent to 
which type of storage?


Thanks in advance.

--
All postal correspondence to:
The Positive Internet Company, 24 Ganton Street, London. W1F 7QY

*Follow us on Twitter* @posipeople

The Positive Internet Company Limited is registered in England and Wales.
Registered company number: 3673639. VAT no: 726 7072 28.
Registered office: Northside House, Mount Pleasant, Barnet, Herts, EN4 9EE.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: The best way of backup S3 buckets

2021-09-10 Thread mhnx
It's great. I have moved millions of objects between two clusters and it's a
piece of artwork by an awesome weirdo. Memory and CPU usage is epic. Very
fast, and it can use metadata, md5, etc.

But you need to write your own script if you want a cron job.
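
Something as small as this is usually enough (paths and remote names are just
examples):

# /etc/cron.d/rclone-s3-backup -- nightly copy at 02:00, skipped if the previous run is still going
0 2 * * * root flock -n /run/rclone-s3-backup.lock rclone copy prod-s3:mybucket backup-s3:mybucket --transfers 32 --log-file /var/log/rclone-s3-backup.log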

On Fri, 10 Sep 2021 at 14:19, huxia...@horebdata.cn wrote:

> Thanks a lot for quick response.
>
> Will rclone be able to handle PB data backup? Does any have experience
> using rclone to backup massive S3 object store, and what lessons learned?
>
> best regards,
>
> Samuel
>
>
>
> --
> huxia...@horebdata.cn
>
>
> *From:* mhnx 
> *Date:* 2021-09-10 13:07
> *To:* huxiaoyu 
> *CC:* ceph-users 
> *Subject:* Re: [ceph-users] The best way of backup S3 buckets
> If you need instant backup and lifecycle rules then Multisite is the best
> choice.
>
> If you need daily backup and do not have different ceph cluster, then
> rclone will be your best mate.
>
> 10 Eyl 2021 Cum 13:56 tarihinde huxia...@horebdata.cn <
> huxia...@horebdata.cn> şunu yazdı:
>
>> Dear Ceph folks,
>>
>> This is closely related to my previous questions on how to do safely and
>> reliabely RadosGW remote replication.
>>
>> My major task is to backup S3 buckets. One obvious method is to use Ceph
>> RadosGW multisite replication. I am wondering whether this is the best way
>> to do S3 storage backup, or are there any better methods or alternatives? I
>> am dealing with ca. 5-8TB amount of new data per day
>>
>> thanks a lot in advance,
>>
>> Samuel
>>
>>
>>
>> huxia...@horebdata.cn
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: SSDs/HDDs in ceph Octopus

2021-09-10 Thread Robert Sander

Hi Luke,

On 10.09.21 at 13:27, Luke Hall wrote:

I know that ceph recognises SSDs as distinct from HDDs from their 
physical device ids etc. In a setup with 50/50 HDDs/SSDs does ceph do 
anything natively to distinguish between the two speeds of storage? I.e 
do you need to create separate pools for fast/slow storage or is there 
something 'under the hood' which will manage which requests are sent to 
which type of storage?


There is nothing "under the hood" except the automatic detection of ssd 
versus hdd which is done by looking at /sys/block/sdX/queue/rotational.
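
You can check what class each OSD was assigned, e.g.:

cat /sys/block/sdX/queue/rotational   # 1 = rotational (hdd), 0 = non-rotational (ssd/nvme)
ceph osd crush class ls               # device classes known to the cluster
ceph osd tree                         # the CLASS column shows the class of every OSD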


It is best practice to have rulesets that select either hdd or ssd 
classes and then assign these rules to different pools.
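
A rough sketch of what that looks like (rule and pool names are just examples):

ceph osd crush rule create-replicated fast-ssd default host ssd
ceph osd crush rule create-replicated slow-hdd default host hdd
ceph osd pool set fast-pool crush_rule fast-ssd
ceph osd pool set slow-pool crush_rule slow-hdd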


It is not good practice to just mix these classes in one pool, except 
for a transition period like with your project. The performance 
difference is just too large. Pools should have a uniform class of storage.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread mk
No, the mon daemon doesn’t start:
Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Start request 
repeated too quickly.
Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Failed with result 
'exit-code'.
Sep 10 13:35:55 amon3 systemd[1]: Failed to start Ceph cluster monitor daemon.

> On 10. Sep 2021, at 13:08, Eugen Block  wrote:
> 
> Yes, give it a try. If the cluster is healthy otherwise it shouldn't be a 
> problem.
> 
> 
> Zitat von mk :
> 
>> Thx Eugen,
>> just stopping mon and remove/rename only store.db and start mon?
>> BR
>> Max
>>> On 10. Sep 2021, at 12:50, Eugen Block  wrote:
>>> 
>>> I don't have an explanation but removing the mon store from the failed mon 
>>> has resolved similar issues in the past. Could you give that a try?
>>> 
>>> 
>>> Zitat von mk :
>>> 
 Hi CephFolks,
 
 I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
 (amon3) went down and was restarted, it is stuck on probing and out of quorum. 
 [...]

[ceph-users] List pg with heavily degraded objects

2021-09-10 Thread George Shuklin

Hello.

I wonder if there is a way to see how many replicas are available for 
each object (or, at least, PG-level statistics). Basically, if I have a 
damaged cluster, I want to see the scale of the damage, and I want to see 
the most degraded objects (those with 1 copy, then objects with 2 copies, 
etc.).


Is there a way? pg list is not very informative, as it does not show 
how badly 'unreplicated' the data are.
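
For context, what I have been looking at so far is along these lines (Nautilus 
here), and none of it tells me how many copies of an object actually exist at 
the moment:

ceph health detail        # reports 'x/y objects degraded', but not per-object copy counts
ceph pg ls degraded       # lists PGs by state, not by number of surviving replicas
ceph pg dump pgs_brief    # up/acting OSD sets per PG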




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: SSDs/HDDs in ceph Octopus

2021-09-10 Thread Luke Hall
It is best practice to have rulesets that select either hdd or ssd 
classes and then assign these rules to different pools.


It is not good practice to just mixed these classes in one pool, except 
for a transition period like with your project. The performance 
difference is just too large. 



Pools should have a uniform class of storage.
Thanks Robert, this is the critical piece of information I was 
struggling to find.


--
All postal correspondence to:
The Positive Internet Company, 24 Ganton Street, London. W1F 7QY

*Follow us on Twitter* @posipeople

The Positive Internet Company Limited is registered in England and Wales.
Registered company number: 3673639. VAT no: 726 7072 28.
Registered office: Northside House, Mount Pleasant, Barnet, Herts, EN4 9EE.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
Is there anything wrong with the directory permissions? What does the  
mon log tell you?



Quoting mk:


no doesn’t start the mon daemon:
Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Start  
request repeated too quickly.
Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Failed  
with result 'exit-code'.
Sep 10 13:35:55 amon3 systemd[1]: Failed to start Ceph cluster  
monitor daemon.



On 10. Sep 2021, at 13:08, Eugen Block  wrote:

Yes, give it a try. If the cluster is healthy otherwise it  
shouldn't be a problem.



Quoting mk:


Thx Eugen,
just stopping mon and remove/rename only store.db and start mon?
BR
Max

On 10. Sep 2021, at 12:50, Eugen Block  wrote:

I don't have an explanation but removing the mon store from the  
failed mon has resolved similar issues in the past. Could you  
give that a try?



Quoting mk:

Hi CephFolks,

I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
(amon3) went down and was restarted, it is stuck on probing and out of quorum. 
[...]

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread mk
I have just seen that on the failed mon the store.db size is 50K, but on both 
other healthy mons it is 151M.
What is the best practice? Redeploy the failed mon?

> On 10. Sep 2021, at 13:08, Eugen Block  wrote:
> 
> Yes, give it a try. If the cluster is healthy otherwise it shouldn't be a 
> problem.
> 
> 
> Zitat von mk :
> 
>> Thx Eugen,
>> just stopping mon and remove/rename only store.db and start mon?
>> BR
>> Max
>>> On 10. Sep 2021, at 12:50, Eugen Block  wrote:
>>> 
>>> I don't have an explanation but removing the mon store from the failed mon 
>>> has resolved similar issues in the past. Could you give that a try?
>>> 
>>> 
>>> Zitat von mk :
>>> 
 Hi CephFolks,
 
 I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
 (amon3) went down and was restarted, it is stuck on probing and out of quorum. 
 [...]

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
Redeploying would probably be the fastest way if you don't want your  
cluster in a degraded state for too long. You can check the logs  
afterwards to see what went wrong.



Quoting mk:

I have just seen that on failed mon store.db size is 50K but on both  
other healthy mons 151M,

What is the best practice? redeploy failed mon?


On 10. Sep 2021, at 13:08, Eugen Block  wrote:

Yes, give it a try. If the cluster is healthy otherwise it  
shouldn't be a problem.



Quoting mk:


Thx Eugen,
just stopping mon and remove/rename only store.db and start mon?
BR
Max

On 10. Sep 2021, at 12:50, Eugen Block  wrote:

I don't have an explanation but removing the mon store from the  
failed mon has resolved similar issues in the past. Could you  
give that a try?



Quoting mk:

Hi CephFolks,

I have a cluster running 14.2.21-22 on Ubuntu 18.04 with 3 mons. After one mon 
(amon3) went down and was restarted, it is stuck on probing and out of quorum. 
[...]

[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread Janne Johansson
On Fri, 10 Sep 2021 at 13:55, George Shuklin wrote:
> Hello.
> I wonder if there is a way to see how many replicas are available for
> each object (or, at least, PG-level statistics). Basically, if I have
> damaged cluster, I want to see the scale of damage, and I want to see
> the most degraded objects (which has 1 copy, then objects with 2 copies,
> etc).
> Are there a way? pg list is not very informative, as it does not show
> how badly 'unreplicated' data are.

ceph pg dump should list all PGs and the OSDs they map to, in lists like this:
[12,34,78,56], [12,34,2134872348723,56]

The first list is the four OSDs (in my example) that should hold a replica of
this PG, and the second list is the ones that actually hold one, with 2^31-1
as a placeholder for UNKNOWN-OSD-NUMBER where an OSD is missing.
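
If you want to sort PGs by how many OSDs are actually in the acting set,
something along these lines is a start (the column layout of "ceph pg dump"
differs a bit between releases, so treat the field number as an example, not a
recipe):

ceph pg dump pgs_brief 2>/dev/null | \
  awk '$1 ~ /^[0-9]+\./ {n=split($5,a,","); print n, $1, $2, $5}' | sort -n | head

This prints the PGs with the fewest acting OSDs first, assuming ACTING is the
fifth column (check the header of "ceph pg dump pgs_brief" on your version).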


-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread Konstantin Shalygin
Hi,
One time I searched for undersized PGs with only one replica (as I remember).
The snippet was left in my notes, so maybe it will help you:

ceph pg dump | grep undersized | awk '{print $1 " " $17 " " $18 " " $19}' | awk 
-vOFS='\t' '{ print length($4), $0 }' | sort -k1,1n | cut -f2- | head



k

> On 10 Sep 2021, at 14:49, George Shuklin  wrote:
> 
> I wonder if there is a way to see how many replicas are available for each 
> object (or, at least, PG-level statistics). Basically, if I have damaged 
> cluster, I want to see the scale of damage, and I want to see the most 
> degraded objects (which has 1 copy, then objects with 2 copies, etc).
> 
> Are there a way? pg list is not very informative, as it does not show how 
> badly 'unreplicated' data are.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Konstantin Shalygin
Yes, try to use Wido's script (remove quorum logic or execute commands by hand)

https://gist.github.com/wido/561c69dc2ec3a49d1dba10a59b53dfe5 
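
By hand it boils down to roughly the following (mon name as in this thread;
double-check the paths and the procedure against the docs for your release
before running anything):

# from a node with a working admin key, drop the broken mon and grab the current map/key
ceph mon remove amon3
ceph mon getmap -o /tmp/monmap
ceph auth get mon. -o /tmp/mon-keyring

# on amon3: move the old data dir aside and recreate the mon store
systemctl stop ceph-mon@amon3
mv /var/lib/ceph/mon/ceph-amon3 /var/lib/ceph/mon/ceph-amon3.old
sudo -u ceph ceph-mon -i amon3 --mkfs --monmap /tmp/monmap --keyring /tmp/mon-keyring
systemctl start ceph-mon@amon3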




k

> On 10 Sep 2021, at 14:57, mk  wrote:
> 
> I have just seen that on failed mon store.db size is 50K but on both other 
> healthy mons 151M, 
> What is the best practice? redeploy failed mon?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin

On 10/09/2021 15:19, Janne Johansson wrote:

On Fri, 10 Sep 2021 at 13:55, George Shuklin wrote:

Hello.
I wonder if there is a way to see how many replicas are available for
each object (or, at least, PG-level statistics). Basically, if I have
damaged cluster, I want to see the scale of damage, and I want to see
the most degraded objects (which has 1 copy, then objects with 2 copies,
etc).
Are there a way? pg list is not very informative, as it does not show
how badly 'unreplicated' data are.

ceph pg dump should list all PGs and how many active OSDs they have in
a list like this:
[12,34,78,56], [12,34,2134872348723,56]

for which four (in my example) that should hold a replica to this PG,
and the second list is who actually hold one, with 2^31-1 as a
placeholder for UNKNOWN-OSD-NUMBER where an OSD is missing.



It's not about being undersized.

Imagine a small cluster with three OSDs. Two OSDs die, then two 
more empty ones are added to the cluster.


Normally you'll see that each PG has found a peer and there are no 
undersized PGs. But the data actually hasn't been replicated yet; the 
replication is still in progress.


Is there any way to see whether there are PGs that hold a single data 
copy but are replicating right now? I'm curious about this transition time 
between 'found a peer and doing recovery' and 'got at least two copies 
of the data'.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread Janne Johansson
On Fri, 10 Sep 2021 at 14:27, George Shuklin wrote:
> On 10/09/2021 15:19, Janne Johansson wrote:
> >> Are there a way? pg list is not very informative, as it does not show
> >> how badly 'unreplicated' data are.
> > ceph pg dump should list all PGs and how many active OSDs they have in
> > a list like this:
> > [12,34,78,56], [12,34,2134872348723,56]
> >
> It's not about been undersized.
> Imagine a small cluster with three OSD. You have two OSD dead, than two
> more empty were added to the cluster.
> Normally you'll see that each PG found a peer and there are no
> undersized PGs. But data, actually, wasn't replicated yet, the
> replication is in the process.

My view is that they actually would be "undersized" until backfill is
done to the PGs on the new empty disks you just added.

> Is there any way to see if there are PG with 'holding a single data
> copy, but is replicating now'? I'm curious about this transition time
> between 'found a peer and doing recovery' and 'got at least two copies
> of data'.



-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin

On 10/09/2021 14:49, George Shuklin wrote:

Hello.

I wonder if there is a way to see how many replicas are available for 
each object (or, at least, PG-level statistics). Basically, if I have 
damaged cluster, I want to see the scale of damage, and I want to see 
the most degraded objects (which has 1 copy, then objects with 2 
copies, etc).


Are there a way? pg list is not very informative, as it does not show 
how badly 'unreplicated' data are. 



Actually, the problem is more complicated than I expected. Here is an 
artificial cluster where a sizable chunk of the data exists in a single copy 
(a cluster of three servers with 2 OSDs each: put some data, shut down server 
#1, put some more data, kill server #3, start server #1; it's guaranteed 
that server #2 holds the only copy of some data). This is a snapshot of the 
ceph pg dump for it as soon as #2 booted, and I can't find any proof that some 
data is in a single copy: 
https://gist.github.com/amarao/fbc8ef3538f66a9f2c264f8555f5c29a



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread Janne Johansson
On Fri, 10 Sep 2021 at 14:39, George Shuklin wrote:
>
> On 10/09/2021 14:49, George Shuklin wrote:
> > Hello.
> >
> > I wonder if there is a way to see how many replicas are available for
> > each object (or, at least, PG-level statistics). Basically, if I have
> > damaged cluster, I want to see the scale of damage, and I want to see
> > the most degraded objects (which has 1 copy, then objects with 2
> > copies, etc).
> >
> > Are there a way? pg list is not very informative, as it does not show
> > how badly 'unreplicated' data are.
>
>
> Actually, the problem is more complicated than I expected. Here is the
> artificial cluster, where there is a sizable chunk of data are single,
> (cluster of thee servers with 2 OSD each, put some data, shutdown server
> #1, put some more data, kill server #3, start server#1, it's guaranteed
> that server #2 holds a single copy). This is snapshot of the ceph pg
> dump for it as soon as #2 booted, and I can't find a proof that some
> data are in a single copy:
> https://gist.github.com/amarao/fbc8ef3538f66a9f2c264f8555f5c29a
>

In this case, where you have both made PGs undersized and also degraded them
by letting one OSD pick up some changes, then removing it and bringing another
one back in (I didn't see where #2 stopped in your example), I guess you will
have to take a deep dive into ceph pg <pgid> query to see ALL the info about it.

By the time you are stacking multiple error scenarios on top of each other,
I don't think there is a simple "show me a short, understandable list of what
is almost working".


-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin

On 10/09/2021 15:37, Janne Johansson wrote:

On Fri 10 Sep 2021 at 14:27, George Shuklin wrote:

On 10/09/2021 15:19, Janne Johansson wrote:

Are there a way? pg list is not very informative, as it does not show
how badly 'unreplicated' data are.

ceph pg dump should list all PGs and how many active OSDs they have in
a list like this:
[12,34,78,56], [12,34,2134872348723,56]


It's not about been undersized.
Imagine a small cluster with three OSD. You have two OSD dead, than two
more empty were added to the cluster.
Normally you'll see that each PG found a peer and there are no
undersized PGs. But data, actually, wasn't replicated yet, the
replication is in the process.

My view is that they actually would be "undersized" until backfill is
done to the PGs on the new empty disks you just added.


I've just created a counter-example for that.

Each server has 2 OSDs, default replicated rule.

There are 4 servers, pool size is 3.

* shut down srv1, wait for recovery; shut down srv2, wait for recovery.

* put in a large amount of data (enough to see replication traffic); all 
data end up on srv3+srv4, with degraded objects.


* shut down srv3, start srv1 and srv2.  srv4 is now the single server 
with all the data available.


I see no 'undersized' PGs, but the data ARE in a single copy: 
https://gist.github.com/amarao/fbc8ef3538f66a9f2c264f8555f5c29a



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin

On 10/09/2021 15:54, Janne Johansson wrote:

On Fri 10 Sep 2021 at 14:39, George Shuklin wrote:

On 10/09/2021 14:49, George Shuklin wrote:

Hello.

I wonder if there is a way to see how many replicas are available for
each object (or, at least, PG-level statistics). Basically, if I have
damaged cluster, I want to see the scale of damage, and I want to see
the most degraded objects (which has 1 copy, then objects with 2
copies, etc).

Are there a way? pg list is not very informative, as it does not show
how badly 'unreplicated' data are.


Actually, the problem is more complicated than I expected. Here is the
artificial cluster, where there is a sizable chunk of data are single,
(cluster of thee servers with 2 OSD each, put some data, shutdown server
#1, put some more data, kill server #3, start server#1, it's guaranteed
that server #2 holds a single copy). This is snapshot of the ceph pg
dump for it as soon as #2 booted, and I can't find a proof that some
data are in a single copy:
https://gist.github.com/amarao/fbc8ef3538f66a9f2c264f8555f5c29a


In this case, where you have both made PGs undersized, and also degraded
by letting one OSD pick up some changes and then remove it and get another
one back in (I didn't see where #2 stopped in your example), I guess you will
have to take a deep dive into
ceph pg  query to see ALL the info about it.

By the time you are stacking multiple error scenarios on top of eachother,
I don't think there is a simple "show me a short understandable list of what
it almost near working"


No, I'm worried about the observability of the situation when data exist 
in a single copy (which I consider a bit of an emergency). I've just 
created a scenario where only a single server (2 OSDs) holds the data, 
and right after replication started, I can't detect that it's THAT bad. 
I've updated the gist: 
https://gist.github.com/amarao/fbc8ef3538f66a9f2c264f8555f5c29a 
with a snapshot taken after the cluster, which had only a single copy of 
the data available, found enough space to make all PGs 'well-sized'. 
Replication is underway, but the data are single-copy at the moment of 
the snapshot.



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSD Service Advanced Specification db_slots

2021-09-10 Thread Edward R Huyer
I recently upgraded my existing cluster to Pacific and cephadm, and need to 
reconfigure all the (rotational) OSDs to use NVMe drives for db storage.  I 
think I have a reasonably good idea how that's going to work, but the use of 
db_slots and limit in the OSD service specification has me scratching my head.

Question 1:  Does db_slots actually work in the latest version of Pacific?  
It's listed here 
https://docs.ceph.com/en/pacific/cephadm/osd/#additional-options but in the 
advanced case section 
https://docs.ceph.com/en/pacific/cephadm/osd/#the-advanced-case there's still a 
note saying it's not implemented.

Question 2:  If db_slots still *doesn't* work, is there a coherent way to 
divide up a solid state DB drive for use by a bunch of OSDs when the OSDs may 
not all be created in one go?  At first I thought it was related to limit, but 
re-reading the advanced specification for a 4th time, I don't think that's the 
case.  Of course this question is moot if db_slots actually works.
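
For concreteness, the sort of spec I've been sketching looks roughly like 
this (untested; block_db_size standing in for db_slots, and the device 
filters are placeholders):

  service_type: osd
  service_id: hdd-osds-with-nvme-db
  placement:
    host_pattern: '*'
  spec:
    data_devices:
      rotational: 1
    db_devices:
      rotational: 0
    block_db_size: 120G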

Any advice or information would be appreciated.

-
Edward Huyer
Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Golisano 70-2373
152 Lomb Memorial Drive
Rochester, NY 14623
585-475-6651
erh...@rit.edu

Obligatory Legalese:
The information transmitted, including attachments, is intended only for the 
person(s) or entity to which it is addressed and may contain confidential 
and/or privileged material. Any review, retransmission, dissemination or other 
use of, or taking of any action in reliance upon this information by persons or 
entities other than the intended recipient is prohibited. If you received this 
in error, please contact the sender and destroy any copies of this information.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread Janne Johansson
> No, I'm worried about observability of the situation when data are in a
> single copy (which I consider bit emergency). I've just created scenario
> when only single server (2 OSD) got data on it, and right after
> replication started, I can't detect that it's THAT bad. I've updated the

If you have min_size 2, the cluster will most certainly stop serving
data from the pool these PGs belong to until you have 2 replicas, so it
will not be hard to detect if it is THAT bad.
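
You can check and raise that with the usual pool commands, e.g.:

  ceph osd pool get <poolname> min_size
  ceph osd pool set <poolname> min_size 2

With min_size 2, a pool that is down to one copy stops serving I/O
instead of quietly running on a single replica.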

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How many concurrent users can be supported by a single Rados gateway

2021-09-10 Thread huxia...@horebdata.cn
Dear Cephers,

I am planning a Ceph cluster (Luminous 12.2.13) for hosting on-line courses 
for a university.  The data would mostly be video media, and thus a 4+2 
erasure-coded object store together with the CivetWeb RADOS gateway will be used.
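
For reference, the 4+2 profile and data pool would be created roughly 
like this (pool name and PG count are placeholders):

  ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
  ceph osd pool create default.rgw.buckets.data 1024 1024 erasure ec42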

We plan to use 4 physical machines solely as RADOS gateways, each with 2x Intel 
6226R CPUs and 256 GB of memory, to serve 8000 students concurrently, each of 
whom may pull 2x 2 Mb/s video streams.
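
A back-of-envelope estimate of the load (ignoring protocol overhead and 
peak factors): 8000 students x 2 streams x 2 Mb/s = 32 Gb/s aggregate, 
i.e. about 4 GB/s, which works out to roughly 8 Gb/s (~1 GB/s) per 
gateway if the load is spread evenly across the 4 machines.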

Is this 4-machine RADOS gateway setup a reasonable configuration for 8000 users, 
overkill, or insufficient?

Suggestions and comments are highly appreciated,

best regards,

Samuel



huxia...@horebdata.cn
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How many concurrent users can be supported by a single Rados gateway

2021-09-10 Thread Konstantin Shalygin


> On 10 Sep 2021, at 18:04, huxia...@horebdata.cn wrote:
> 
> I am planning a Ceph Cluster (Lumninous 12.2.13) for hosting on-line courses 
> for one university.  The data would mostly be video media and thus 4+2 EC 
> coded object store together with CivetWeb RADOS gateway will be utilized.
> 
> We plan to use 4 physical machines as Rados gateway solely, each with 2x 
> Intel 6226R CPU and 256 GB memory, for serving 8000 students concurrently, of 
> which each may incur 2x 2Mb/s video streams.
> 
> Are these 4-machine Rados gateway a reasonable configuration for 8000 users, 
> or an overkill, or insufficient?
> 
> Suggestions and comments are highly appreciated,

Luminous is EOL, you should deploy Pacific instead



k
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD Service Advanced Specification db_slots

2021-09-10 Thread Matthew Vernon

On 10/09/2021 15:20, Edward R Huyer wrote:


Question 2:  If db_slots still *doesn't* work, is there a coherent
way to divide up a solid state DB drive for use by a bunch of OSDs
when the OSDs may not all be created in one go?  At first I thought
it was related to limit, but re-reading the advanced specification
for a 4th time, I don't think that's the case.  Of course this
question is moot if db_slots actually works.


I've previously done this outside Ceph - i.e. had our existing
automation chop the NVMes up into partitions, and then just told Ceph to
use an NVMe partition per OSD.

[not attempted this with cephadm, this was ceph-ansible]
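
Done by hand it boils down to something like the following sketch
(device names and sizes are placeholders, adjust to your layout):

  # carve one DB partition out of the NVMe
  sgdisk -n 0:0:+120G /dev/nvme0n1
  # create the OSD with its DB on that partition
  ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1

which sidesteps the question of whether all OSDs are created in one go.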

Regards,

Matthew
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How many concurrent users can be supported by a single Rados gateway

2021-09-10 Thread Eugen Block
The first suggestion is not to use Luminous since it’s already EOL. We 
noticed major improvements in performance when upgrading from L to 
Nautilus, and N will also be EOL soon. Since there are some reports 
about performance degradation when upgrading to Pacific, I would 
recommend using Octopus.



Quoting huxia...@horebdata.cn:


Dear Cephers,

I am planning a Ceph Cluster (Lumninous 12.2.13) for hosting on-line  
courses for one university.  The data would mostly be video media  
and thus 4+2 EC coded object store together with CivetWeb RADOS  
gateway will be utilized.


We plan to use 4 physical machines as Rados gateway solely, each  
with 2x Intel 6226R CPU and 256 GB memory, for serving 8000 students  
concurrently, of which each may incur 2x 2Mb/s video streams.


Are these 4-machine Rados gateway a reasonable configuration for  
8000 users, or an overkill, or insufficient?


Suggestions and comments are highly appreciated,

best regards,

Samuel



huxia...@horebdata.cn
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How many concurrent users can be supported by a single Rados gateway

2021-09-10 Thread Konstantin Shalygin
Nautilus is already EOL too; commits are not backported to this branch, only by 
companies who built products on this release and can verify patches themselves.


k

Sent from my iPhone

> On 10 Sep 2021, at 18:23, Eugen Block  wrote:
> Nautilus, and N will also be EOL soon

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Drop of performance after Nautilus to Pacific upgrade

2021-09-10 Thread Igor Fedotov

Hi Luis,

there is some chance that you're hit by https://tracker.ceph.com/issues/52089. 
What is your physical DB volume configuration - are there fast 
standalone disks for that? If so, are they showing high utilization 
during the benchmark?
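
A quick way to check is to watch the DB device while the bench is
running, e.g. (assuming the write test was something like the rados
bench below and the DB/WAL lives on nvme0n1 - adjust to your setup):

  rados bench -p testpool 60 write -b 131072 -t 16
  iostat -x 1 nvme0n1    # watch %util and w_await during the run

If the DB device sits near 100% utilization while the HDDs are mostly
idle, that would point in the direction of the tracker issue above.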


It makes sense to try 16.2.6 once it's available - does the problem go away then?


Thanks,

Igor


On 9/5/2021 8:45 PM, Luis Domingues wrote:

Hello,

I run a test cluster of 3 machines with 24 HDDs each, running bare-metal on 
CentOS 8. Long story short, I can have a bandwidth of ~ 1'200 MB/s when I do a 
rados bench, writing objects of 128k, when the cluster is installed with 
Nautilus.

When I upgrade the cluster to Pacific, (using ceph-ansible to deploy and/or 
upgrade), my performances drop to ~400 MB/s of bandwidth doing the same rados 
bench.

I am kind of clueless on what makes the performance drop so much. Does someone 
have some ideas where I can dig to find the root of this difference?

Thanks,
Luis Domingues
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ignore Ethernet interface

2021-09-10 Thread Dominik Baack

Hi,

we are currently trying to deploy CephFS on 7 storage nodes connected by 
two InfiniBand ports and an Ethernet port for external communication.


For various reasons the network interfaces are mapped to the same IP 
range, e.g. x.x.x.15y (eno1), x.x.x.17y (ib1), x.x.x.18y (ib2), with x 
constant for all ports and servers and only "y" changing per individual 
server.


Most initial network traffic is sent over Ethernet, which then leads 
Ceph to choose this interface for communication. The problem is that it 
is not possible to communicate inside the cluster over this interface at 
all. This leads to timeouts of all ceph commands after a restart, or to 
a failure to set up the server at all.


Is it possible to deselect or fully ignore the Ethernet port for 
deploying and running the infrastructure via cephadm without changing 
the code:


https://github.com/ceph/ceph/blob/master/src/cephadm/cephadm
Line 6730++
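
For completeness, the knobs I am aware of are the bootstrap monitor IP 
and the network settings, roughly:

  cephadm bootstrap --mon-ip x.x.x.17y ...
  ceph config set mon    public_network  <IB subnet in CIDR notation>
  ceph config set global cluster_network <IB subnet in CIDR notation>

but as far as I can tell these only help if the InfiniBand addresses can 
be expressed as their own subnet, which is not the case here without 
renumbering.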

Cheers
Dominik

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Drop of performance after Nautilus to Pacific upgrade

2021-09-10 Thread Lomayani S. Laizer
Hello,
I might be hit by the same bug. After upgrading from Octopus to Pacific my
cluster is slower by around 2-3 times.

I will try 16.2.6 when it is out.

On Fri, Sep 10, 2021 at 6:58 PM Igor Fedotov  wrote:

> Hi Luis,
>
> some chances that you're hit by https://tracker.ceph.com/issues/52089.
> What is your physical DB volume configuration - are there fast
> standalone disks for that? If so are they showing high utilization
> during the benchmark?
>
> It makes sense to try 16.2.6 once available - would the problem go away?
>
>
> Thanks,
>
> Igor
>
>
> On 9/5/2021 8:45 PM, Luis Domingues wrote:
> > Hello,
> >
> > I run a test cluster of 3 machines with 24 HDDs each, running bare-metal
> on CentOS 8. Long story short, I can have a bandwidth of ~ 1'200 MB/s when
> I do a rados bench, writing objects of 128k, when the cluster is installed
> with Nautilus.
> >
> > When I upgrade the cluster to Pacific, (using ceph-ansible to deploy
> and/or upgrade), my performances drop to ~400 MB/s of bandwidth doing the
> same rados bench.
> >
> > I am kind of clueless on what makes the performance drop so much. Does
> someone have some ideas where I can dig to find the root of this difference?
> >
> > Thanks,
> > Luis Domingues
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How many concurrent users can be supported by a single Rados gateway

2021-09-10 Thread huxia...@horebdata.cn
Thanks for the suggestions. 

My viewpoint may be wrong, but I think stability is paramount for us, and an 
older version such as Luminous may be much better battle-tested than recent 
ones. Unless there are reports of instability or bugs, I would still trust 
older versions. Just my own preference on which version earns my trust.

thanks a lot,

Samuel




huxia...@horebdata.cn
 
From: Eugen Block
Date: 2021-09-10 17:21
To: huxiaoyu
CC: ceph-users
Subject: Re: [ceph-users] How many concurrent users can be supported by a 
single Rados gateway
The first suggestion is to not use Luminous since it’s already EOL. We  
noticed major improvements in performance when upgrading from L to  
Nautilus, and N will also be EOL soon. Since there are some reports  
about performance degradation when upgrading to Pacific I would  
recommend to use Octopus.
 
 
Quoting huxia...@horebdata.cn:
 
> Dear Cephers,
>
> I am planning a Ceph Cluster (Lumninous 12.2.13) for hosting on-line  
> courses for one university.  The data would mostly be video media  
> and thus 4+2 EC coded object store together with CivetWeb RADOS  
> gateway will be utilized.
>
> We plan to use 4 physical machines as Rados gateway solely, each  
> with 2x Intel 6226R CPU and 256 GB memory, for serving 8000 students  
> concurrently, of which each may incur 2x 2Mb/s video streams.
>
> Are these 4-machine Rados gateway a reasonable configuration for  
> 8000 users, or an overkill, or insufficient?
>
> Suggestions and comments are highly appreciated,
>
> best regards,
>
> Samuel
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
 
 
 
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io