[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim

2021-08-17 Thread Daniel Persson
Hi again.

I've now solved my issue with help from people in this group. Thank you for
helping out.
I thought the process was a bit complicated so I created a short video
describing the process.

https://youtu.be/Ds4Wvvo79-M

I hope this helps someone else, and again thank you.

Best regards
Daniel
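
For reference, the mon options discussed in this thread can be toggled as below; this
is a minimal sketch assuming a Pacific-era cluster, not the exact sequence used here:

    # temporarily allow old, unpatched clients to connect (the health warning remains):
    ceph config set mon auth_allow_insecure_global_id_reclaim true

    # once every client has been upgraded, enforce secure global_id reclaim:
    ceph config set mon auth_allow_insecure_global_id_reclaim false

    # optionally silence the "mons are allowing insecure global_id reclaim" warning
    # while clients are still being upgraded:
    ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false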


On Mon, Aug 9, 2021 at 5:43 PM Ilya Dryomov  wrote:

> On Mon, Aug 9, 2021 at 5:14 PM Robert W. Eckert 
> wrote:
> >
> > I have had the same issue with the windows client.
> > I had to issue
> > ceph config set mon auth_expose_insecure_global_id_reclaim false
> > Which allows the other clients to connect.
> > I think you need to restart the monitors as well, because the first few
> times I tried this, I still couldn't connect.
>
> For archive's sake, I'd like to mention that disabling
> auth_expose_insecure_global_id_reclaim isn't right and it wasn't
> intended for this.  Enabling auth_allow_insecure_global_id_reclaim
> should be enough to allow all (however old) clients to connect.
> The fact that it wasn't enough for the available Windows build
> suggests that there is some subtle breakage in it, because all "expose"
> does is force the client to connect twice instead of just once.
> It doesn't actually refuse old unpatched clients.
>
> (The breakage isn't surprising given that the available build is
> more or less a random development snapshot with some Windows-specific
> patches that were pending at the time applied.  I'll try to escalate the
> issue and get the linked MSI bundle updated.)
>
> Thanks,
>
> Ilya
>
> >
> > -Original Message-
> > From: Richard Bade 
> > Sent: Sunday, August 8, 2021 8:27 PM
> > To: Daniel Persson 
> > Cc: Ceph Users 
> > Subject: [ceph-users] Re: BUG #51821 - client is using insecure
> global_id reclaim
> >
> > Hi Daniel,
> > I had a similar issue last week after upgrading my test cluster from
> > 14.2.13 to 14.2.22, which includes this fix for global ID reclaim (added in .20).
> My issue was a radosgw that I was re-deploying on the latest version. The
> problem seemed to be related to cephx authentication.
> > It kept displaying the error message you have and the service wouldn't
> start.
> > I ended up stopping and removing the old rgw service, deleting all the
> keys in /etc/ceph/ and all data in /var/lib/ceph/radosgw/ and re-deploying
> the radosgw. This used the new rgw bootstrap keys and new key for this
> radosgw.
> > So, I would suggest you double and triple check which keys your clients
> are using and that cephx is enabled correctly on your cluster.
> > Check your admin key in /etc/ceph as well, as that's what's being used
> for ceph status.
> >
> > Regards,
> > Rich
> >
> > On Sun, 8 Aug 2021 at 05:01, Daniel Persson 
> wrote:
> > >
> > > Hi everyone.
> > >
> > > I suggested asking for help here instead of in the bug tracker so that
> > > I will try it.
> > >
> > > https://tracker.ceph.com/issues/51821?next_issue_id=51820&prev_issue_id=51824
> > >
> > > I have a problem that I can't seem to figure out how to resolve.
> > >
> > > AUTH_INSECURE_GLOBAL_ID_RECLAIM: client is using insecure global_id
> > > reclaim
> > > AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure
> > > global_id reclaim
> > >
> > >
> > > Both of these have to do with global_id reclaim and ensuring that no
> > > client can steal or reuse another client's ID. I understand the
> > > reason for this and want to resolve the issue.
> > >
> > > Currently, I have three different clients.
> > >
> > > * One Windows client using the latest Ceph-Dokan build. (ceph version
> > > 15.0.0-22274-g5656003758 (5656003758614f8fd2a8c49c2e7d4f5cd637b0ea)
> > > pacific
> > > (rc))
> > > * One Linux Debian build using the built packages for that kernel. (
> > > 4.19.0-17-amd64)
> > > * And one client that I've built from source for a raspberry PI as
> > > there is no arm build for the Pacific release. (5.11.0-1015-raspi)
> > >
> > > If I switch over to disallow global_id reclaim, none of these clients
> > > can connect, and running "ceph status" on one of my nodes
> > > will also fail.
> > >
> > > All of them give the same error message:
> > >
> > > monclient(hunting): handle_auth_bad_method server allowed_methods [2]
> > > but i only support [2]
> > >
> > >
> > > Has anyone encountered this problem and have any suggestions?
> > >
> > > PS. The reason I have 3 different hosts is that this is a test
> > > environment where I look at and try to resolve issues before we
> > > upgrade our production environment to Pacific. DS.
> > >
> > > Best regards
> > > Daniel
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > > email to ceph-users-le...@ceph.io

[ceph-users] Re: PGs stuck after replacing OSDs

2021-08-17 Thread Etienne Menguy
Hi,

It’s hard to explain as the issue is no longer present; if it happens again, “ceph pg 
x.y query” output could be useful.

I don’t think you went too fast or removed too many disks in a single step.
As you only have 3 nodes, Ceph should have directly noticed the degraded PGs and 
could not do much about them.
You didn’t have to set the OSDs out, as you removed them from the crush map just after.

If you are able to change disks without restarting the host, I would advise you to 
do it one by one (see the sketch below).
Any OSD issue on the 2 other servers will lead to a service outage, as you’ll 
end up with a single copy of some PGs.
If you have to deal with issues, you’ll prefer a cluster with 3% of 
degraded objects rather than 33%.
Also, it will be less impactful for your users, as more OSDs will be available to 
handle IO.
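
A minimal sketch of the one-by-one approach (the OSD id, device path and PG id are
only examples, not taken from this cluster):

    ceph osd out osd.12                        # let data drain off the OSD first
    ceph osd safe-to-destroy osd.12            # wait until this reports it is safe
    systemctl stop ceph-osd@12
    ceph osd destroy osd.12 --yes-i-really-mean-it
    # swap the physical disk, then recreate the OSD, e.g. with ceph-volume:
    ceph-volume lvm create --osd-id 12 --data /dev/sdX
    ceph pg 2.1a query                         # if a PG gets stuck, capture this output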

From my experience, restarting OSD/cluster is sometimes a fix for strange 
issues.

Étienne

> On 17 Aug 2021, at 09:41, Ml Ml  wrote:
> 
> Hello List,
> 
> I am running Proxmox on top of ceph 14.2.20 on the nodes, replica 3, size 2.
> 
> Last week I wanted to swap the HDDs to SDDs on one node.
> 
> Since i have 3 Nodes with replica 3, size 2 i did the following:
> 
> 1.) ceph osd set noout
> 2.) Stopped all OSD on that one node
> 3.) i set the OSDs to out "ceph osd out" on that node
> 4.) I removed/destroyed the OSD
> 5.) I physically took the disk/osd out
> 6.) I plugged my SSDs in and started to add them as OSDs.
> 
> Recovery was active and running, but some PGs did not serve IO and
> were stuck. VMs started to complain about IO problems. It looked like
> writes were not possible to some of them.
> 
> Looked like I had "pgs stuck" and "slow osds blocking"...
> ...but "ceph osd perf" and "iostat -dx 3" showed bored/idle OSDs.
> 
> ...I restarted the OSD which seemed to be involved. Which did not help.
> 
> After 1h or so i started to restart ALL the OSDs one by one in the
> whole Cluster. After restarting the last OSD in that cluster on a very
> different node, the blocking error went away
> and everything seemed to recover smoothly.
> 
> 
> I wonder what i did wrong.
> I did those Steps (1-6) within 5mins (so pretty fast). Maybe I should
> have taken more time?
> Was it too rough to replace all OSDs on one node?
> Should I have replaced it one by one?
> 
> Any hints are welcome.
> 
> Cheers,
> Mario
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multiple DNS names for RGW?

2021-08-17 Thread Christian Rohmann

Hey Burkhard, Chris, all,

On 16/08/2021 10:48, Chris Palmer wrote:
It's straightforward to add multiple DNS names to an endpoint. We do 
this for the sort of reasons you suggest. You then don't need separate 
rgw instances (not for this reason anyway).


Assuming default:

 * radosgw-admin zonegroup get > zg-default
 * Edit zg-default, changing "hostnames" to e.g.  ["host1",
   "host1.domain", "host2", "host2.domain"]
 * radosgw-admin zonegroup set --infile zg-default
 * Restart all rgw instances
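
For readers following along, a sketch of that procedure as commands (hostnames are
placeholders; the period commit is only needed if a realm/multisite setup is configured):

    radosgw-admin zonegroup get > zg-default
    # edit zg-default so that it contains, e.g.:
    #     "hostnames": ["host1", "host1.domain", "host2", "host2.domain"],
    radosgw-admin zonegroup set --infile zg-default
    radosgw-admin period update --commit     # only if a realm is configured
    systemctl restart ceph-radosgw.target    # restart all rgw instances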

Please excuse my confusion, but how does this relate to the endpoints of 
the zonegroup and zones then?

What does setting endpoints (or hostnames even) on those actually do?


If I may split my confusion up into some questions:


1) From what I understand, a zone has endpoints that identify how 
it can be reached by other RGWs, to enable communication for multisite sync.

So having

  * s3-az1.example.com (zone "az1")
  * s3-az2.example.com (zone "az2")

as endpoints in each zone of my two zones allows the two zones to talk 
to each other.


But does this have to match the "rgw dns name" setting on the RGWs in 
each zone then?
Or could I potentially just add all the individual hosts (if they were 
reachable) of my RGW farm to avoid hitting the loadbalancer in front?



2) How do the endpoints of the whole zone-group relate then? Do I simply 
add all endpoints of all zones?

What are those used for then?


3) How would one go about having a global DNS name that always points 
to the master zone?
Would I just add another "global" or "generic" hostname, let's say 
s3.example.com, to the zonegroup as an endpoint and have the DNS point to 
the LB of the current master zone? The intention would be to avoid 
clients having to update their endpoint in case of a failover.




Thanks and with kind regards


Christian


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Is rbd-mirror a product level feature?

2021-08-17 Thread zp_8483
Hi all,

Can we enable the rbd-mirror feature in a production environment? If not, are there 
any known issues?

Thanks,

Zhen
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multiple DNS names for RGW?

2021-08-17 Thread Chris Palmer

Hi Christian

I don't have much experience with multisite so I'll let someone else 
answer that aspect. But each RGW will only accept requests where the 
Host header matches one of the "hostnames" configured as below. 
Otherwise the client will simply get an error response. So, as someone 
else suggested, a proxy could inject/overwrite a suitable Host header 
(haven't thought through whether that could affect signed URLs). We set 
the hostnames as it allows some of our internal traffic to hit RGWs 
without going through the proxy, whereas external traffic must come 
through the proxy. In that case we didn't need the proxy to inject a 
Host header.
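
A quick way to see this behaviour from the command line (a sketch; the backend
address and hostnames are placeholders):

    # accepted: the Host header matches an entry in the zonegroup "hostnames" list
    curl -sv -H "Host: host1.domain" http://rgw-backend:8080/
    # rejected: an unknown Host header gets an error response from the RGW
    curl -sv -H "Host: something.else" http://rgw-backend:8080/
    # a proxy can force a matching header, e.g. nginx:  proxy_set_header Host host1.domain;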


Regards, Chris

On 17/08/2021 09:47, Christian Rohmann wrote:

Hey Burkhard, Chris, all,

On 16/08/2021 10:48, Chris Palmer wrote:
It's straightforward to add multiple DNS names to an endpoint. We do 
this for the sort of reasons you suggest. You then don't need 
separate rgw instances (not for this reason anyway).


Assuming default:

 * radosgw-admin zonegroup get > zg-default
 * Edit zg-default, changing "hostnames" to e.g.  ["host1",
   "host1.domain", "host2", "host2.domain"]
 * radosgw-admin zonegroup set --infile zg-default
 * Restart all rgw instances

Please excuse my confusion, but how does this relate to the endpoints 
of zonegroup and zones then.

What does setting endpoints (or hostnames even) on those actually do?


If I may split my confusion up into some questions 


1) From what I understand is that a zone has endpoints to identify how 
it can be reached by other RGWs to enable communication for multisite 
sync.

So having

  * s3-az1.example.com (zone "az1")
  * s3-az2.example.com (zone "az2")

as endpoints in each zone of my two zones allows the two zones to talk 
to each other.


But does this have to match the "rgw dns name" setting on the RGWs in 
each zone then?
Or could I potentially just add all the individual hosts (if they were 
reachable) of my RGW farm to avoid hitting the loadbalancer in front?



2) How do the endpoints of the whole zone-group relate then? Do I 
simply add all endpoints of all zones?

What are those used for then?


3) How would one go about having a global DNS name used to always 
point to the master zone
Would I just add another "global" or "generic" hostname, let's say 
s3.example.com to the zonegroup as an endpoint and have the DNS point 
to the LB of the current master zone? The intention would be to avoid 
involving the clients having to update their endpoint in case of a 
failover.




Thanks and with kind regards


Christian


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Raid redundance not good

2021-08-17 Thread Network Admin
Hi all, 

first, apologies for my written English :) 

I installed a Ceph system with 3 servers: 

- server 1: all services 

- server 2: all services 

- server 3: no OSD, only monitor 

I put files on CephFS: all is good and the Ceph monitor indicates 2
replicas. 

But when I take server 2 down, my data is no longer accessible. 

Why? Isn't the goal of Ceph to serve data in this case? 

Thank you for any help. 

Best regards 

Yves
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Raid redundance not good

2021-08-17 Thread Marc
Is your min_size also 2? Change that to 1.
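
For reference, that per-pool setting can be changed like this (the pool name is a
placeholder; note the warnings about min_size 1 in the replies further down):

    ceph osd pool set cephfs_data min_size 1   # allows IO with a single surviving copy
    # the safer long-term target recommended later in this digest:
    ceph osd pool set cephfs_data size 3
    ceph osd pool set cephfs_data min_size 2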




> -Original Message-
> Sent: Tuesday, 17 August 2021 12:16
> To: ceph-users@ceph.io
> Subject: [ceph-users] Raid redundance not good
> 
> Hi all,
> 
> first, apologize for my english writen :)
> 
> I installed a Ceph system with 3 servers :
> 
> - server 1 : all services
> 
> - server 2 : all services
> 
> - serveur 3 : no osd, only monitor
> 
> I put files with Cepfs : all is good and ceph monitor indicate 2
> replicates.
> 
> But when I down server 2, my datas are no more accessible.
> 
> Why ? Isn't goal of Ceph to serve data in this case ?
> 
> Thanks you for any help.
> 
> Best regards
> 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: SSD disk for OSD detected as type HDD

2021-08-17 Thread mabi
Hi Etienne,

Thanks for your answer. I actually had to remove the class first. So for 
example this 2-step process works:

ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class sdd osd.0

"ceph osd tree" now correctly reports sdd as the class. Funnily enough, "ceph orch device 
ls" still reports hdd as the class, but maybe it is just a matter of time before it 
changes in the orchestrator.
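
A quick way to double-check the result (a sketch, using osd.0 as in the example above):

    ceph osd crush class ls             # lists the device classes currently in use
    ceph osd tree | grep -w osd.0       # shows the class assigned to the OSD
    ceph osd crush tree --show-shadow   # shadow hierarchy per device class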

‐‐‐ Original Message ‐‐‐
On Monday, August 16th, 2021 at 12:10 PM, Etienne Menguy 
 wrote:

> Hi,
>
> Changing device class works?
>
> https://docs.ceph.com/en/latest/rados/operations/crush-map/#device-classes
> ceph osd crush set-device-class   [...]
>
> Étienne
>
>> On 16 Aug 2021, at 12:05, mabi  wrote:
>>
>> Hello,
>>
>> I noticed that cephadm detects my newly added SSD disk as type HDD as you 
>> can see below:
>>
>> $ ceph orch device ls
>> Hostname Path Type Serial Size Health Ident Fault Available
>> node1 /dev/sda hdd REMOVED 7681G Unknown N/A N/A No
>>
>> How can I force the disk type to SSD instead of HDD?
>>
>> Regards,
>> Mabi
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multiple DNS names for RGW?

2021-08-17 Thread Janne Johansson
Den tis 17 aug. 2021 kl 11:46 skrev Chris Palmer :
>
> Hi Christian
>
> I don't have much experience with multisite so I'll let someone else
> answer that aspect. But each RGW will only accept requests where the
> Host header matches one of the "hostnames" configured as below.
> Otherwise the client will simply get an error response. So, as someone
> else suggested, a proxy could inject/overwrite a suitable Host header
> (haven't thought through whether that could affect signed URLs). We set
> the hostnames as it allows some of our internal traffic to hit RGWs
> without going through the proxy, whereas external traffic must come
> through the proxy. In that case we didn't need the proxy to inject a
> Host header.

Don't forget that v4 auth bakes in the clients idea of what the
hostname of the endpoint was, so its not only about changing headers.
If you are not using v2 auth, you will not be able to rewrite the
hostname on the fly.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Raid redundance not good

2021-08-17 Thread Janne Johansson
Den tis 17 aug. 2021 kl 12:17 skrev Network Admin
:
> Hi all,
> first, apologize for my english writen :)
> I installed a Ceph system with 3 servers :
> - server 1 : all services
> - server 2 : all services
> - serveur 3 : no osd, only monitor
> I put files with Cepfs : all is good and ceph monitor indicate 2
> replicates.
> But when I down server 2, my datas are no more accessible.
> Why ? Isn't goal of Ceph to serve data in this case ?

No, you have no redundancy at this point, so the cluster chooses the
safe choice and stops IO until you can get redundancy back up again.
As Marc said you can override this, but by default ceph makes sure you
will not lose data or consistency, which is why it stops.

Ceph even prefers each pool having 3 copies, with min_size = 2 so that
you can allow both reads and writes when a single host is down, but
you need 3 (and preferably more) hosts for that.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] PGs stuck after replacing OSDs

2021-08-17 Thread Ml Ml
Hello List,

I am running Proxmox on top of ceph 14.2.20 on the nodes, replica 3, size 2.

Last week I wanted to swap the HDDs to SDDs on one node.

Since i have 3 Nodes with replica 3, size 2 i did the following:

1.) ceph osd set noout
2.) Stopped all OSD on that one node
3.) i set the OSDs to out "ceph osd out" on that node
4.) I removed/destroyed the OSD
5.) I physically took the disk/osd out
6.) I plugged my SSDs in and started to add them as OSDs.

Recovery was active and running, but some PGs did not serve IO and
were stuck. VMs started to complain about IO problems. It looked like
writes were not possible to some of them.

Looked like I had "pgs stuck" and "slow osds blocking"...
...but "ceph osd perf" and "iostat -dx 3" showed bored/idle OSDs.

...I restarted the OSD which seemed to be involved. Which did not help.

After 1h or so i started to restart ALL the OSDs one by one in the
whole Cluster. After restarting the last OSD in that cluster on a very
different node, the blocking error went away
and everything seemed to recover smoothly.


I wonder what i did wrong.
I did those Steps (1-6) within 5mins (so pretty fast). Maybe I should
have taken more time?
Was it too rough to replace all OSDs on one node?
Should I have replaced it one by one?

Any hints are welcome.

Cheers,
Mario
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PGs stuck after replacing OSDs

2021-08-17 Thread Frank Schilder
Maybe an instance of https://tracker.ceph.com/issues/46847 ?
Next time you see this problem, you can try the new "repeer" command on 
affected PGs. The "ceph pg x.y query" output, as mentioned by Etienne, will provide a 
clue if it's due to this bug.
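
A sketch of how that would look next time (the PG id 2.1a is a placeholder):

    ceph health detail                  # find the stuck/blocked PG ids
    ceph pg 2.1a query > pg_2.1a.json   # capture the state before touching anything
    ceph pg repeer 2.1a                 # force the PG to re-peer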

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Etienne Menguy 
Sent: 17 August 2021 10:27:14
To: ceph-users
Subject: [ceph-users] Re: PGs stuck after replacing OSDs

Hi,

It’s hard to explain as issue is no longer here, if it happens again “ceph pg 
x.y query” output could be useful.

I don’t think you went too fast or removed too many disks in a single step.
As you only have 3 nodes, Ceph should have directly noticed degraded PG and 
could not do much.
You didn’t had to set them out as you removed them from crushmap just after.

If you are able to change disk without restarting host, I would advise you to 
do it one by one.
Any osd issue on the 2 others servers will lead to a service outage, as you’ll 
end with a single copy of some PG.
If you have to deal with issues, you’ll prefer to have a cluster with 3% of 
degraded objects rather than 33%.
Also, it will be less impacting for your users as more OSD will be available to 
handle IO.

>From my experience, restarting OSD/cluster is sometimes a fix for strange 
>issues.

Étienne

> On 17 Aug 2021, at 09:41, Ml Ml  wrote:
>
> Hello List,
>
> I am running Proxmox on top of ceph 14.2.20 on the nodes, replica 3, size 2.
>
> Last week I wanted to swap the HDDs to SDDs on one node.
>
> Since i have 3 Nodes with replica 3, size 2 i did the following:
>
> > 1.) ceph osd set noout
> 2.) Stopped all OSD on that one node
> 3.) i set the OSDs to out "ceph osd out" on that node
> 4.) I removed/destroyed the OSD
> 5.) I physically took the disk/osd out
> 6.) I plugged my SSDs in and started to add them as OSDs.
>
> > Recovery was active and running, but some PGs did not serve IO and
> > were stuck. VMs started to complain about IO problems. It looked like
> > writes were not possible to some of them.
>
> Looked like I had "pgs stuck" and "slow osds blocking"...
> ...but "ceph osd perf" and "iostat -dx 3" showed bored/idle OSDs.
>
> ...I restarted the OSD which seemed to be involved. Which did not help.
>
> After 1h or so i started to restart ALL the OSDs one by one in the
> whole Cluster. After restarting the last OSD in that cluster on a very
> different node, the blocking error went away
> and everything seemed to recover smoothly.
>
>
> I wonder what i did wrong.
> I did those Steps (1-6) within 5mins (so pretty fast). Maybe I should
> have taken more time?
> Was it too rough to replace all OSDs on one node?
> Should I have replaced it one by one?
>
> Any hints are welcome.
>
> Cheers,
> Mario
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: create a Multi-zone-group sync setup

2021-08-17 Thread Boris Behrens
Hi, after some trial and error I got it working, so users will get synced.

However, If I try to create a bucket via s3cmd I receive the following
error:
s3cmd --access_key=XX --secret_key=YY --host=HOST mb s3://test
ERROR: S3 error: 403 (InvalidAccessKeyId)

When I try the same with ls I just get an empty response (because there are
no buckets to list).

I get this against both radosgw locations.
I have an nginx in between the internet and radosgw that just proxy-passes
every address and sets the Host and X-Forwarded-For headers.
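
When debugging this kind of setup, the following commands are usually helpful (a
sketch; the uid is a placeholder, and the metadata sync status is only meaningful on
the secondary zone):

    radosgw-admin sync status                 # realm/zonegroup/zone view and sync state
    radosgw-admin metadata sync status        # metadata (users/buckets) sync from the master
    radosgw-admin period get-current          # confirm both sides agree on the current period
    radosgw-admin user info --uid=sync-user   # check the system user exists on both sides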


Am Fr., 30. Juli 2021 um 16:46 Uhr schrieb Boris Behrens :

> Hi people,
>
> I try to create a Multi-zone-group setup (like it is described here:
> https://docs.ceph.com/en/latest/radosgw/multisite/)
>
> But I simply fail.
>
> I just created a test cluster to mess with it, but no matter what I try, I fail.
> 
> Is there a howto available?
>
> I don't want to get a multi-zone setup, where I sync the actual zone data,
> but have a global namespace where all buckets and users are unique.
>
> Cheers
>  Boris
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-08-17 Thread Erik Lindahl
Hi,

I figured I should follow up on this discussion, not with the intention of
bashing any particular solution, but pointing to at least one current major
challenge with cephadm.

As I wrote earlier in the thread, we previously found it ... challenging to
debug things running in cephadm. Earlier this week it appears we too were
hit by the bug where cephadm removes monitors from the monmap (
https://tracker.ceph.com/issues/51027 ) if the node is rebooted.

Presently our cluster is offline, because there's still no fix, and every
single piece of documentation for things like monmaptool appears to assume
it's running natively, not through cephadm. There's also the additional
fragility that all the "ceph orch" commands themselves stop working (even a
simple status request just hangs) if the ceph cluster itself is down.  I
suspect we'll find ways around that, but when reflecting I have a few
thoughts:

1. It is significantly harder than one thinks to develop a stable
orchestrating environment. We've been happy with both salt & ansible, but
on balance cephadm appears quite fragile - and I'm not sure if it will ever
be realistic to invest the amount of work required to make it as stable.
There are of course many advantages to having something closely tied to the
specific solution (ceph) - but in hindsight that seems to only have been an
advantage in sunny weather. Once the service itself is down, I think it is
a clear & major drawback that suddenly your orchestrator also stops
responding. Long-term, if cephadm is the solution, I think it's important
that it works even when the ceph services themselves are down.

2. I think ceph - in particular the documentation - suffers from too many
different ways of doing things (raw packages, or rook, or cephadm, which in
turn can use either docker or podman, etc.), which again is a pain the
second you need to debug or fix anything.  If the decision is that cephadm
is the way things should work, so be it, but then all documentation has to
actually reflect how to do things in a cephadm environment (and not e.g.
assuming all the containers are running so you can log in to the right
container first). How do you extract a monmap in a cephadm cluster, for
instance? Just following the default documentation produces errors.
Presently I feel the short-term solution has been to allow multiple
different ways of doing things. As a developer I can understand that, but
as a user it's a nightmare unless somebody takes the time to properly
update all documentation with two (or more) choices describing how to do
things (a) natively, or (b) in a cephadm cluster.



Again, this is meant as hopefully constructive feedback rather than
complaints, but the feeling a get after having had fairly smooth operations
with raw packages (including fixing previous bugs leading to severe
crashes) and lately grinding our teeth a bit over cephadm is that it has
helped automated a bunch of stuff that wasn't particularly difficult (it's
nice to issue an update with a single command, but it works perfectly fine
manually too) at the cost of making it WAY more difficult to fix things
(not to mention simply get information about the cluster) when we have
problems - and in the long run that's not a trade-off I'm entirely happy
with :-)


Cheers,

Erik





On Tue, Jun 29, 2021 at 1:25 AM Sage Weil  wrote:

> On Fri, Jun 25, 2021 at 10:27 AM Nico Schottelius
>  wrote:
> > Hey Sage,
> >
> > Sage Weil  writes:
> > > Thank you for bringing this up.  This is in fact a key reason why the
> > > orchestration abstraction works the way it does--to allow other
> > > runtime environments to be supported (FreeBSD!
> > > sysvinit/Devuan/whatever for systemd haters!)
> >
> > I would like you to stop labeling people who have reasons for not using
> > a specific software as haters.
> >
> > It is not productive to call Ceph developers "GlusterFS haters", nor to
> > call Redhat users Debian haters.
> >
> > It is simple not an accurate representation.
>
> You're right, and I apologize.  My intention was to point out that we
> tried to keep the door open to everyone, even those who might be
> called "haters", but I clearly missed the mark.
>
> sage
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Erik Lindahl 
Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm
University
Science for Life Laboratory, Box 1031, 17121 Solna, Sweden

Note: I frequently do email outside office hours because it is a convenient
time for me to write, but please do not interpret that as an expectation
for you to respond outside your work hours.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-08-17 Thread Marc
> 
> Again, this is meant as hopefully constructive feedback rather than
> complaints, but the feeling a get after having had fairly smooth
> operations with raw packages (including fixing previous bugs leading to
> severe crashes) and lately grinding our teeth a bit over cephadm is that
> it has helped automated a bunch of stuff that wasn't particularly
> difficult (it's nice to issue an update with a single command, but it
> works perfectly fine manually too) at the cost of making it WAY more
> difficult to fix things (not to mention simply get information about the
> cluster) when we have problems - and in the long run that's not a trade-
> off I'm entirely happy with :-)
> 

Everyone can only agree to keeping things simple. I honestly do not even know 
why you want to try cephadm. The containerized solution was developed to 
replace ceph deploy, ceph ansible etc. as a solution to make ceph installation 
for new users easier. That is exactly the reason (imho) why you should not use 
the containerized environment: a containerized environment does not have easy 
deployment as its primary task. And because the focus is on easy 
deployment, the real characteristics of the containerized environment are being 
ignored during this development. For example, you must be out of your mind to 
create a dependency between ceph-osd/mds/mon/all and dockerd.

10 years(?) ago the people of Mesos thought the Docker containerizer was 
'flaky' and created their own more stable containerizer. And still today, 
containers are killed if dockerd is terminated, as some users had to 
learn the hard way and recently posted here. 

Today's container solutions are not at the level where you can say you require 
absolutely no knowledge to fix issues. So that means you would always require 
knowledge of the container solution + ceph to troubleshoot. And that is of 
course more knowledge than just knowing ceph.

I would not be surprised if cephadm ends up like ceph deploy/ansible. 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph cluster with 2 replicas

2021-08-17 Thread Michel Niyoyita
Hi all ,

Going to deploy a ceph cluster in production with a replica size of 2. Is
there any drawback on the service side? I am going to change the
default (3) to 2.

Please advise.

Regards.

Michel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manual deployment of an OSD failed

2021-08-17 Thread Marc
> 
> going to deploy a test cluster and successfully deployed my first
> monitor (hurray!).
> 
> Now trying to add the first osd host following instructions at:
> https://docs.ceph.com/en/latest/install/manual-deployment/#bluestore
> 


ceph-volume lvm zap --destroy /dev/sdb
ceph-volume lvm create --data /dev/sdb --dmcrypt

systemctl enable ceph-osd@0


> I have to note - however - that:
> 
> 1.
> 
> --
> 
> copy /var/lib/ceph/bootstrap-osd/ceph.keyring from monitor node
> (mon-node1) to /var/lib/ceph/bootstrap-osd/ceph.keyring on osd node
> (osd-node1)
> 
isn't this easier:

sudo -u ceph ceph auth get client.bootstrap-osd -o 
/var/lib/ceph/bootstrap-osd/ceph.keyring

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph cluster with 2 replicas

2021-08-17 Thread Marc
If you have to ask (and don't give crucial details like ssd/hdd etc.), then I 
would recommend just following the advice of more experienced and knowledgeable 
people here and sticking to 3 (see archive).




> -Original Message-
> From: Michel Niyoyita 
> Sent: Tuesday, 17 August 2021 16:29
> To: ceph-users@ceph.io
> Subject: *SPAM* [ceph-users] Ceph cluster with 2 replicas
> 
> Hi all ,
> 
> Going to deploy a ceph cluster in production with  replicas size of 2 .
> Is
> there any inconvenience on the service side ?  I am going to change the
> default (3) to 2.
> 
> Please advise.
> 
> Regards.
> 
> Michel
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph cluster with 2 replicas

2021-08-17 Thread Nathan Fish
There are only two ways that size=2 can go:
A) You set min_size=1 and risk data loss
B) You set min_size=2 and your cluster stops every time you lose a
drive or reboot a machine

Neither of these are good options for most use cases; but there's
always an edge case. You should stay with size=3, min_size=2 unless
you have an unusual use case.

On Tue, Aug 17, 2021 at 10:33 AM Michel Niyoyita  wrote:
>
> Hi all ,
>
> Going to deploy a ceph cluster in production with  replicas size of 2 . Is
> there any inconvenience on the service side ?  I am going to change the
> default (3) to 2.
>
> Please advise.
>
> Regards.
>
> Michel
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph cluster with 2 replicas

2021-08-17 Thread Anthony D'Atri

There are certain sequences of events that can result in Ceph not knowing 
which copy of a PG (if any) has the current information.  That’s one way you 
can effectively lose data.

I ran into it myself last year on a legacy R2 cluster.

If you *must* have a 2:1 raw:usable ratio, you’re better off with 2+2 EC, 
assuming you have at least 4 failure domains.
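
A minimal sketch of what a 2+2 EC data pool could look like (names and PG counts are
placeholders):

    ceph osd erasure-code-profile set ec22 k=2 m=2 crush-failure-domain=host
    ceph osd pool create ecpool 128 128 erasure ec22
    ceph osd pool set ecpool min_size 3    # k+1, so IO continues with one failure domain down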

> 
> There are only two ways that size=2 can go:
> A) You set min_size=1 and risk data loss
> B) You set min_size=2 and your cluster stops every time you lose a
> drive or reboot a machine
> 
> Neither of these are good options for most use cases; but there's
> always an edge case. You should stay with size=3, min_size=2 unless
> you have an unusual use case.
> 
> On Tue, Aug 17, 2021 at 10:33 AM Michel Niyoyita  wrote:
>> 
>> Hi all ,
>> 
>> Going to deploy a ceph cluster in production with  replicas size of 2 . Is
>> there any inconvenience on the service side ?  I am going to change the
>> default (3) to 2.
>> 
>> Please advise.
>> 
>> Regards.
>> 
>> Michel
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-08-17 Thread Erik Lindahl
Hi,

Whether containers are good or not is a separate discussion where I suspect 
there won't be consensus in the near future.

However, after just having looked at the documentation again, my main point 
would be that when a major stable open source project recommends a specific 
installation method (=cephadm) first in the "getting started" guide, users are 
going to expect that's the alternative things are documented for, which isn't 
quite the case for cephadm (yet).

Most users will probably accept either solution as long as there is ONE clear & 
well-documented way of working with ceph - but the current setup of even having 
the simple (?) getting started guide talk about at least three different ways 
without clearly separating their documentation seems like a guarantee for 
long-term confusion and higher entry barriers for new users, which I assume is 
the opposite of the goal of cephadm!

Cheers,

Erik


Erik Lindahl 
Professor of Biophysics
Science for Life Laboratory
Stockholm University & KTH
Office (SciLifeLab): +46 8 524 81567
Cell (Sweden): +46 73 4618050
Cell (US): +1 (650) 924 7674 



> On 17 Aug 2021, at 16:29, Marc  wrote:
> 
> 
>> 
>> 
>> Again, this is meant as hopefully constructive feedback rather than
>> complaints, but the feeling a get after having had fairly smooth
>> operations with raw packages (including fixing previous bugs leading to
>> severe crashes) and lately grinding our teeth a bit over cephadm is that
>> it has helped automated a bunch of stuff that wasn't particularly
>> difficult (it's nice to issue an update with a single command, but it
>> works perfectly fine manually too) at the cost of making it WAY more
>> difficult to fix things (not to mention simply get information about the
>> cluster) when we have problems - and in the long run that's not a trade-
>> off I'm entirely happy with :-)
>> 
> 
> Everyone can only agree to keeping things simple. I honestly do not even know 
> why you want to try cephadm. The containerized solution was developed to 
> replace ceph deploy, ceph ansible etc. as a solution to make ceph 
> installation for new users easier. That is exactly the reason (imho) why you 
> should not use the containerized environment. Because a containerized 
> environment has not as primaray task being an easy deployment tool. And 
> because the focus is on easy deployment, the real characteristics of the 
> containerized environment are being ignored during this development. Such as, 
> you must be out of your mind to create a depency between ceph-osd/msd/mon/all 
> and dockerd.
> 
> 10 years(?) ago the people of mesos thought the docker containerizer was 
> 'flacky' and created their own more stable containerizer. And still today, 
> containers are being killed if dockerd is terminated. What some users had to 
> learn the hard way, as recently posted here. 
> 
> Today's container solutions are not on the level where you can say, you 
> require absolutely no knowledge to fix issues. So that means you would always 
> require knowledge of the container solution + ceph to troubleshoot. And that 
> is of course more knowledge, than just knowing ceph.
> 
> I would not be surprised if cephadm ends up like ceph deploy/ansible. 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] 1 pools have many more objects per pg than average

2021-08-17 Thread Ml Ml
Hello,

i get: 1 pools have many more objects per pg than average
detail:  pool cephfs.backup.data objects per pg (203903) is more than
20.307 times cluster average (10041)

I set pg_num and pgp_num from 32 to 128 but the autoscaler seems to set
them back to 32 again :-/

For Details please see:
 https://pastebin.com/hmSjPR7b

Does the BIAS column have something to do with it?
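
For reference, the autoscaler can be overridden per pool if it keeps undoing a manual
pg_num change; a sketch using the pool name from the health warning:

    ceph osd pool set cephfs.backup.data pg_autoscale_mode off   # stop the autoscaler for this pool
    ceph osd pool set cephfs.backup.data pg_num 128
    # or keep the autoscaler and give it a hint about the pool's expected share of the cluster:
    ceph osd pool set cephfs.backup.data target_size_ratio 0.2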

Cheers,
Mario
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Manual deployment of an OSD failed

2021-08-17 Thread Francesco Piraneo G.

Hi all,

going to deploy a test cluster and successfully deployed my first 
monitor (hurray!).


Now trying to add the first osd host following instructions at: 
https://docs.ceph.com/en/latest/install/manual-deployment/#bluestore



I have to note - however - that:

1.

--

copy /var/lib/ceph/bootstrap-osd/ceph.keyring from monitor node 
(mon-node1) to /var/lib/ceph/bootstrap-osd/ceph.keyring on osd node 
(osd-node1)


--

The source directory on the monitor node (in my case named mon1) is empty; 
however, I have my keyring on the OSD host; I managed to import the mon 
keyring and mon map following the instructions here:


https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/installation_guide_for_red_hat_enterprise_linux/manually-installing-red-hat-ceph-storage


2.

So I just run:

# ceph-volume lvm create --data /dev/sdb --cluster euch01
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring 
-i - osd new 788c15be-7f1f-4151-b8c5-703e4a72361c
 stderr: Error initializing cluster client: ObjectNotFound('RADOS 
object not found (error calling conf_read_file)',)

-->  RuntimeError: Unable to create a new OSD id


Here my custom cluster name is euch01 and the volume to deploy the OSD on is 
/dev/sdb; my suspicion is that the OSD creation fails because on 
the third "Running command" line I read "--cluster ceph", which is the 
default cluster name. Note again: I specified my custom cluster name.


Any hint is strongly welcome.

Thank you very much for any help.

Francesco


Some side notes:

- Sorry, newbie here - a lot to learn -> Please be patient.

- With cephadm I managed to run a test cluster in a very reasonable 
time. However, I don't like containerized solutions in this case due to 
overhead that feels quite pointless to me...


- Tried to deploy with ceph-ansible but the installation crashed and there was no 
support from the ceph-ansible developers... no problem: trying to deploy 
manually. More things to learn!


- To deploy my first monitor I made a collage of the Red Hat and original Ceph 
websites, because the Red Hat guide targets deployment on RHEL and not 
other distros like CentOS (or Debian); ceph.com seems to miss some 
important steps that I've found on the RH pages. Of course, if someone can 
suggest a complete guide to deploy Pacific from a to z without 
containers, it is strongly welcome.





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-08-17 Thread Andrew Walker-Brown
Hi,

I’m coming at this from the position of a newbie to Ceph.  I had some 
experience of it as part of Proxmox, but not as a standalone solution.

I really don’t care whether Ceph is contained or not, I don’t have the depth of 
knowledge or experience to argue it either way.  I can see that containers may 
well offer a more consistent deployment scenario with fewer dependencies on the 
external host OS.  Upgrades/patches to the host OS may not impact the container 
deployment etc., with the two systems not held in any lock-step.

The challenge for me hasn’t been Ceph itself. Ceph has worked brilliantly, I 
have a fully resilient architecture split between two active datacentres and my 
storage can survive up to 50% node/OSD hardware failure.

No, the challenge has been documentation.  I’ve run off down multiple rabbit 
holes trying to find solutions to problems or just background information.  
I’ve been tripped up by not spotting the Ceph documentation was “v: latest” 
rather than “v: octopus”...so features didn’t exist or commands were structured 
slightly differently.

Also just not being obvious whether the bit of documentation I was looking at 
related to a native Ceph package deployment or a container one.  Plus you get 
the Ceph/Suse/Redhat/Proxmox/IBM etc..etc.. flavour answer depending on which 
Google link you click.  Yes I know, its part of the joy of working with open 
sourcebut still, not what you need when I chunk of infrastructure has 
failed and you don’t know why.

I’m truly in awe of what the Ceph community has produced and is planning for 
the future, so don’t think I’m any kind of hater.

My biggest request is for the documentation to take on some restructuring.  
Keep the different deployment methods documented separately, yes an intro 
covering the various options and recommendations is great, but then keep it 
entirely discrete.

Then when a feature/function is documented, make it clear if this applies to 
packaged or container deployment etc...  e.g. Zabbix (we use Zabbix)... lovely 
documentation on how to integrate Ceph and Zabbix... until you finally find out 
it's not supported with containers... via a forum and an RFE/Bug entry.

And thank you for all the support in the community, REALLY appreciated.

Best,

Andrew


Sent from Mail for Windows

From: Erik Lindahl
Sent: 17 August 2021 16:01
To: Marc
Cc: Nico Schottelius; Kai 
Börnert; ceph-users
Subject: [ceph-users] Re: Why you might want packages not containers for Ceph 
deployments

Hi,

Whether containers are good or not is a separate discussion where I suspect 
there won't be consensus in the near future.

However, after just having looked at the documentation again, my main point 
would be that when a major stable open source project recommends a specific 
installation method (=cephadm) first in the "getting started" guide, users are 
going to expect that's the alternative things are documented for, which isn't 
quite the case for cephadm (yet).

Most users will probably accept either solution as long as there is ONE clear & 
well-documented way of working with ceph - but the current setup of even having 
the simple (?) getting started guide talk about at least three different ways 
without clearly separating their documentation seems like a guarantee for 
long-term confusion and higher entry barriers for new users, which I assume is 
the opposite of the goal of cephadm!

Cheers,

Erik


Erik Lindahl 
Professor of Biophysics
Science for Life Laboratory
Stockholm University & KTH
Office (SciLifeLab): +46 8 524 81567
Cell (Sweden): +46 73 4618050
Cell (US): +1 (650) 924 7674



> On 17 Aug 2021, at 16:29, Marc  wrote:
>
> 
>>
>>
>> Again, this is meant as hopefully constructive feedback rather than
>> complaints, but the feeling a get after having had fairly smooth
>> operations with raw packages (including fixing previous bugs leading to
>> severe crashes) and lately grinding our teeth a bit over cephadm is that
>> it has helped automated a bunch of stuff that wasn't particularly
>> difficult (it's nice to issue an update with a single command, but it
>> works perfectly fine manually too) at the cost of making it WAY more
>> difficult to fix things (not to mention simply get information about the
>> cluster) when we have problems - and in the long run that's not a trade-
>> off I'm entirely happy with :-)
>>
>
> Everyone can only agree to keeping things simple. I honestly do not even know 
> why you want to try cephadm. The containerized solution was developed to 
> replace ceph deploy, ceph ansible etc. as a solution to make ceph 
> installation for new users easier. That is exactly the reason (imho) why you 
> should not use the containerized environment. Because a containerized 
> environment has not as primaray t

[ceph-users] Re: The cluster expands the osd, but the storage pool space becomes smaller

2021-08-17 Thread Reed Dier
Hey David,

In case this wasn't answered off list already:

It looks like you have only added a single OSD to each new host?
You specified 12*10T on osd{1..5}, and 12*12T on osd{6,7}.

Just as a word of caution, the added 24T is more or less going to be wasted on 
osd{6,7} assuming that your crush ruleset is to use host as your failure domain.
But that is beside the point near term.

The problem is that your new OSD host buckets are really lopsided in the 
opposite direction, because now you have two tiny buckets, with 5 large buckets.

So what I would suggest is to set the norebalance, nobackfill, norecover flags 
on your cluster.
Then finish adding all of the OSDs to the two new hosts.

Then unset the no* flags, and let everything rebalance at that point.
Crush is going to try to satisfy the ruleset, which is to place data in host 
failure domains, and two of those failure domains are ~10% the size of the 
others, and it is going to try to evenly distribute across hosts, effectively 
making your smallest bucket your measuring stick for cephfs fullness.
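
A sketch of that sequence with the actual commands (how the OSDs are created depends on
how the cluster is deployed):

    ceph osd set norebalance
    ceph osd set nobackfill
    ceph osd set norecover
    # ... add the remaining OSDs on the two new hosts ...
    ceph osd unset norecover
    ceph osd unset nobackfill
    ceph osd unset norebalance
    ceph -s            # then watch backfill progress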

So if you try and bring everything up all at once, and feel free to throttle 
backfills as needed, it should increase your usable space as expected.

Hope that helps,
Reed

> On Aug 11, 2021, at 2:24 AM, David Yang  wrote:
> 
> Each osd node is configured with 12*10T hdd, 1*1.5T nvme ssd, 150G*1 ssd;
> 
> Now we are ready to expand the cluster by 2 nodes.
> 
> Each node is configured with 12*12T hdd and 2*1.2T nvme ssd.
> 
> At present, I have marked the newly added OSDs out, and the cluster size is
> restored;
> for example, while the pool is normally 320T, when the new OSDs are marked in,
> the storage pool size is only 300T.
> 
> ID  CLASS  WEIGHT     TYPE NAME       STATUS  REWEIGHT  PRI-AFF
> -1         596.79230  root default
> -9         110.60374      host osd1
>  1  hdd      9.09569          osd.1   up      1.0       1.0
>  7  hdd      9.09569          osd.7   up      1.0       1.0
> 12  hdd      9.09569          osd.12  up      1.0       1.0
> 17  hdd      9.09569          osd.17  up      1.0       1.0
> 22  hdd      9.09569          osd.22  up      1.0       1.0
> 27  hdd      9.09569          osd.27  up      1.0       1.0
> 32  hdd      9.09569          osd.32  up      1.0       1.0
> 37  hdd      9.09569          osd.37  up      1.0       1.0
> 42  hdd      9.09569          osd.42  up      1.0       1.0
> 47  hdd      9.09569          osd.47  up      1.0       1.0
> 52  hdd      9.09569          osd.52  up      1.0       1.0
> 57  hdd      9.09569          osd.57  up      1.0       1.0
> 60  ssd      1.45549          osd.60  up      1.0       1.0
> -3         110.60374      host osd2
>  0  hdd      9.09569          osd.0   up      1.0       1.0
>  5  hdd      9.09569          osd.5   up      1.0       1.0
> 10  hdd      9.09569          osd.10  up      1.0       1.0
> 15  hdd      9.09569          osd.15  up      1.0       1.0
> 20  hdd      9.09569          osd.20  up      1.0       1.0
> 25  hdd      9.09569          osd.25  up      1.0       1.0
> 30  hdd      9.09569          osd.30  up      1.0       1.0
> 35  hdd      9.09569          osd.35  up      1.0       1.0
> 40  hdd      9.09569          osd.40  up      1.0       1.0
> 45  hdd      9.09569          osd.45  up      1.0       1.0
> 50  hdd      9.09569          osd.50  up      1.0       1.0
> 55  hdd      9.09569          osd.55  up      1.0       1.0
> 61  ssd      1.45549          osd.61  up      1.0       1.0
> -5         110.60374      host osd3
>  2  hdd      9.09569          osd.2   up      1.0       1.0
>  6  hdd      9.09569          osd.6   up      1.0       1.0
> 11  hdd      9.09569          osd.11  up      1.0       1.0
> 16  hdd      9.09569          osd.16  up      1.0       1.0
> 21  hdd      9.09569          osd.21  up      1.0       1.0
> 26  hdd      9.09569          osd.26  up      1.0       1.0
> 31  hdd      9.09569          osd.31  up      1.0       1.0
> 36  hdd      9.09569          osd.36  up      1.0       1.0
> 41  hdd      9.09569          osd.41  up      1.0       1.0
> 46  hdd      9.09569          osd.46  up      1.0       1.0
> 51  hdd      9.09569          osd.51  up      1.0       1.0
> 56  hdd      9.09569          osd.56  up      1.0       1.0
> 62  ssd      1.45549          osd.62  up      1.0       1.0
> -7         110.60374      host osd4
>  3  hdd      9.09569          osd.3   up      1.0       1.0
>  8  hdd      9.09569          osd.8   up      1.0       1.0
> 13  hdd      9.0

[ceph-users] Re: Manual deployment of an OSD failed

2021-08-17 Thread Anthony D'Atri



> On Aug 17, 2021, at 12:28 PM, Francesco Piraneo G.  
> wrote:
> 
> # ceph-volume lvm create --data /dev/sdb --dmcrypt --cluster euch01

Your first message indicated a default cluster name; this one implies a 
non-default name.

Whatever else you do, avoid custom cluster names.  They will only bring you 
grief.
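
A minimal sketch of the default-name path that usually just works (the device path is
an example):

    # on the OSD node, with the default config and bootstrap-osd keyring in place:
    ls /etc/ceph/ceph.conf /var/lib/ceph/bootstrap-osd/ceph.keyring
    ceph-volume lvm create --data /dev/sdb --dmcrypt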

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manual deployment of an OSD failed

2021-08-17 Thread Francesco Piraneo G.

going to deploy a test cluster and successfully deployed my first
monitor (hurray!).

Now trying to add the first osd host following instructions at:
https://docs.ceph.com/en/latest/install/manual-deployment/#bluestore


ceph-volume lvm zap --destroy /dev/sdb
ceph-volume lvm create --data /dev/sdb --dmcrypt

systemctl enable ceph-osd@0


# ceph-volume lvm create --data /dev/sdb --dmcrypt --cluster euch01
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
-->  RuntimeError: No valid ceph configuration file was loaded.
[root@osd1 ~]# ceph-authtool --gen-print-key
AQCADBxhqFIDNhAAQlwoW1l983923Ms/EJuSiA==
[root@osd1 ~]# ceph-authtool --gen-print-key --cluster euch01
AQCcDBxh1zzbGBAAW8tVp0aX668zpGUobhQWBg==


This is to say that the zap --destroy worked.

The lvm create raised an error; running ceph-authtool alone worked, 
so I have a valid configuration file on my OSD node; ceph-authtool 
worked both when specifying and when not specifying the cluster name.


F.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph snap-schedule retention is not properly being implemented

2021-08-17 Thread Prayank Saxena
Hello everyone,

We have a ceph cluster with version Pacific v16.2.4

We are trying to implement the ceph module snap-schedule from this document
https://docs.ceph.com/en/latest/cephfs/snap-schedule/

It works if you have say, hourly and retention is h 3

ceph fs snap-schedule add /volumes/user1/vol7 1h 
ceph fs snap-schedule retention add /volumes/user1/vol7 h 6

But when we tried the following retention configuration, it did not quite
produce the result we were expecting:

ceph fs snap-schedule add /volumes/user1/vol7 1h 2021-08-12T23:41:00
ceph fs snap-schedule retention add /volumes/user1/vol7 d 2
ceph fs snap-schedule retention add /volumes/user1/vol7 h 6

By definition this should: take a snapshot every hour, then retain 6
snapshots one hour apart and 2 snapshots one day apart.

> ceph fs snap-schedule status /volumes/user1/vol7
{"fs": "cephfs", "subvol": null, "path": "/volumes/user1/vol7", "rel_path":
"/volumes/user1/vol7", "schedule": "1h", "retention": {"d": 2, "h": 6},
"start": "2021-08-12T23:41:00", "created": "2021-08-12T23:41:07", "first":
"2021-08-13T00:41:00", "last": "2021-08-17T09:41:00", "last_pruned":
"2021-08-17T09:41:00", "created_count": 106, "pruned_count": 96, "active":
true}

> ceph fs subvolume snapshot ls cephfs vol7 --group_name user1 | grep name
"name": "scheduled-2021-08-13-23_41_00"<--- this should be
deleted based on retention
"name": "scheduled-2021-08-14-23_41_00"<--- this too
"name": "scheduled-2021-08-15-23_41_00"
"name": "scheduled-2021-08-16-23_41_00"
"name": "scheduled-2021-08-17-04_41_00"
"name": "scheduled-2021-08-17-05_41_00"
"name": "scheduled-2021-08-17-06_41_00"
"name": "scheduled-2021-08-17-07_41_00"
"name": "scheduled-2021-08-17-08_41_00"
"name": "scheduled-2021-08-17-09_41_00"


this is what we get in the log
--- start log 
... truncated ...
2021-08-17 08:41:00,271 [Thread-3194] [INFO]
[snap_schedule.fs.schedule_client] created scheduled snapshot of
/volumes/user1/vol7
2021-08-17 08:41:00,271 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] created scheduled snapshot
/volumes/user1/vol7/.snap/scheduled-2021-08-17-08_41_00
2021-08-17 08:41:00,271 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] SnapDB on cephfs changed for
/volumes/user1/vol7, updating next Timer
2021-08-17 08:41:00,271 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] Creating new snapshot timer for
/volumes/user1/vol7
2021-08-17 08:41:00,272 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] Will snapshot /volumes/user1/vol7 in fs
cephfs in 3600s
2021-08-17 08:41:00,272 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] Pruning snapshots
2021-08-17 08:41:00,272 [Thread-3194] [DEBUG] [mgr_util] self.fs_id=1,
fs_id=1
2021-08-17 08:41:00,273 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] skipping dir entry b'.'
2021-08-17 08:41:00,274 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] skipping dir entry b'..'
2021-08-17 08:41:00,275 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-13-23_41_00' to
pruning
2021-08-17 08:41:00,275 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-14-23_41_00' to
pruning
2021-08-17 08:41:00,276 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-15-23_41_00' to
pruning
2021-08-17 08:41:00,276 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-16-23_41_00' to
pruning
2021-08-17 08:41:00,277 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-17-02_41_00' to
pruning
2021-08-17 08:41:00,278 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-17-03_41_00' to
pruning
2021-08-17 08:41:00,278 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-17-04_41_00' to
pruning
2021-08-17 08:41:00,279 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-17-05_41_00' to
pruning
2021-08-17 08:41:00,279 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-17-06_41_00' to
pruning
2021-08-17 08:41:00,280 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-17-07_41_00' to
pruning
2021-08-17 08:41:00,280 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] add b'scheduled-2021-08-17-08_41_00' to
pruning
2021-08-17 08:41:00,280 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] compiling keep set for period n
2021-08-17 08:41:00,280 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] compiling keep set for period M
2021-08-17 08:41:00,280 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] compiling keep set for period h
2021-08-17 08:41:00,280 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] keeping b'scheduled-2021-08-17-08_41_00'
due to 6h
2021-08-17 08:41:00,280 [Thread-3194] [DEBUG]
[snap_schedule.fs.schedule_client] keeping b'sched

[ceph-users] ceph 14.2.22 snaptrim and slow ops

2021-08-17 Thread Rainer Krienke

Hello,

about four weeks ago I upgraded my 14.2.16 cluster (144 4TB hdd-OSDs, 9 
hosts) from 14.2.16 to 14.2.22. The upgrade did not cause any trouble. 
The cluster is healthy. One thing is however new since the upgrade and 
somewhat irritating:


Each weekend, in the night from Saturday to Sunday, I now see health warnings 
about slow ops on some OSDs that I never saw while running 14.2.16. 
The OSDs reported as slow are not always the same, and I did not find any 
hints in the SMART values or logs that indicate a failing disk.


On this list I recently saw several other posts, on both Nautilus and 
Octopus, reporting the very same issue.


Is there a way to get around the slow ops warning, or is it a bug? Can I 
check if ceph really succeeds in trimming removed snapshots or perhaps 
quits trimming because of the slow ops?


In "ceph osd pool health detail" I see a list for one pool that has 
about 30 snapshots created and also 30 snapshots deleted each week that 
now has 65 removed snaps entries shown as [1~6a,6c~30,9d~2d,cc~a, ...] 
in the output. Can I assume that trimming works if this 
[1~6a,6c~30,9d~2d,cc~a, ...] list does not get longer each week? Is 
there another way to check if trimming works?
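
A few ways to watch snap trimming that might help here (a sketch; osd.0 is a placeholder
for one of the OSDs reported as slow):

    ceph pg dump pgs_brief 2>/dev/null | grep snaptrim    # PGs currently trimming or queued
    ceph osd pool ls detail | grep removed_snaps          # removed-snap intervals per pool
    ceph daemon osd.0 dump_ops_in_flight                  # run on the host of a slow OSD
    ceph daemon osd.0 dump_historic_ops                   # recently completed (slow) ops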


Thanks for hints
Rainer
--
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse  1
56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 
1001312

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multiple DNS names for RGW?

2021-08-17 Thread Christian Rohmann

On 17/08/2021 13:37, Janne Johansson wrote:

Don't forget that v4 auth bakes in the clients idea of what the
hostname of the endpoint was, so its not only about changing headers.
If you are not using v2 auth, you will not be able to rewrite the
hostname on the fly.


Thanks for the heads up in this regard.


How would one achieve the idea of having two distinct sites, i.e.

* s3-az1.example.com
* s3-az2.example.com

each having their own rgw_dns_name set and doing multi-site sync, but 
also having a generic hostname, s3.example.com,

that I can simply reconfigure to point to the master?

From what you said I read that I cannot:

a) use an additional rgw_dns_name, as only one can be configured (right?)
b) simply rewrite the hostname from the frontend proxy / LB to the 
backends, as this would invalidate the sigv4 signatures the clients compute?





Regards


Christian



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Suspicious newsletter] Re: create a Multi-zone-group sync setup

2021-08-17 Thread Boris Behrens
Yes,
I want to open up a new DC where people can store their objects, but I want
the bucket names and users to be unique across both DCs.
After some reading I found that I need one realm with multiple zonegroups,
each containing only one zone.

No sync of actual user data, but metadata like users or used bucket names.

So I created a test setup which contains three servers on each side; each
server is used for mon, mgr, osd and radosgw.
One is a Nautilus installation (the master) and the other is an Octopus
installation.

I've set up the realm, the first zonegroup with its zone, and a sync user in the
master setup, and committed.
Then I've pulled the period on the 2nd setup and added a 2nd zonegroup
with a zone and committed.

Now I can create users in the master setup, but not in the 2nd (as it
doesn't sync back). But I am not able to create a bucket or so with the
credentials of the users I created.

Am Mi., 18. Aug. 2021 um 06:08 Uhr schrieb Szabo, Istvan (Agoda) <
istvan.sz...@agoda.com>:

> Hi,
>
> " but have a global namespace where all buckets and users are uniqe."
>
> You mean manage multiple cluster from 1 "master" cluster but ono sync? So
> 1 realm, multiple dc BUT no sync?
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> ---
>
> -Original Message-
> From: Boris Behrens 
> Sent: Tuesday, August 17, 2021 8:51 PM
> To: ceph-users@ceph.io
> Subject: [Suspicious newsletter] [ceph-users] Re: create a
> Multi-zone-group sync setup
>
> Email received from the internet. If in doubt, don't click any link nor
> open any attachment !
> 
>
> Hi, after some trial and error I got it working, so users will get synced.
>
> However, If I try to create a bucket via s3cmd I receive the following
> error:
> s3cmd --access_key=XX --secret_key=YY --host=HOST mb s3://test
> ERROR: S3 error: 403 (InvalidAccessKeyId)
>
> When I try the same with ls I just get an empty response (because there
> are no buckets to list).
>
> I get this against both radosgw locations.
> I have an nginx in between the internet and radosgw that will just proxy
> pass every address and sets host and x-forwarded-for header.
>
>
> Am Fr., 30. Juli 2021 um 16:46 Uhr schrieb Boris Behrens :
>
> > Hi people,
> >
> > I try to create a Multi-zone-group setup (like it is described here:
> > https://docs.ceph.com/en/latest/radosgw/multisite/)
> >
> > But I simply fail.
> >
> > I just created a test cluster to mess with it, but no matter what I try, I fail.
> >
> > Is there a howto available?
> >
> > I don't want to get a multi-zone setup, where I sync the actual zone
> > data, but have a global namespace where all buckets and users are unique.
> >
> > Cheers
> >  Boris
> >
> > --
> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
> > im groüen Saal.
> >
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io