[ceph-users] cephfs ha mount expectations

2022-10-26 Thread mj

Hi!

We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and would 
like to see our expectations confirmed (or denied) here. :-)


Suppose we build a three-node cluster, three monitors, three MDSs, etc, 
in order to export a cephfs to multiple client nodes.


In the (RHEL8) clients' (web application servers) fstab, we will mount 
the cephfs like:



ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph 
name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2


We expect that the RHEL clients will then be able to use (read/write) a 
shared /mnt/ha-pool directory simultaneously.
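
Once the entry is in place, a quick sanity check from one of the clients 
would be something like the following (just a sketch; it assumes 
/etc/ceph/admin.secret is already distributed to each client):

mount /mnt/ha-pool
df -h /mnt/ha-pool
touch /mnt/ha-pool/hello-from-$(hostname)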


Our question: how HA can we expect this setup to be? Looking for some 
practical experience here.


Specifically: can we reboot any of the three involved ceph servers without 
the clients noticing anything? Or will there be certain timeouts 
involved, during which /mnt/ha-pool/ will appear unresponsive, and 
*after* a timeout the client switches monitor node, and /mnt/ha-pool/ 
will respond again?


Of course we hope the answer is: in such a setup, cephfs clients should 
not notice a reboot at all. :-)


All the best!

MJ
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why did ceph turn /etc/ceph/ceph.client.admin.keyring into a directory?

2022-10-26 Thread Marc


I can't remember that much of it; afaicr this is more of a kubernetes plugin, and 
whatever functionality was lacking in kubernetes they tried to work around in the 
plugin. So I have problems creating preprovisioned volumes. Afaicr you need the 
driver to create a volume, so when you use the driver from the command line, you 
are confronted with all these weird things.

This is an issue where I point out that they are not even validating passed 
arguments; if an argument is missing for some reason, your host's root fs 
gets mounted in the launched task: 
https://github.com/ceph/ceph-csi/issues/1798#issuecomment-748916298

This is where I report files being deleted:
https://github.com/ceph/ceph-csi/issues/1799

They just don't get the concept of CSI. I have even seen old issues where they 
were storing volume ids locally, so your task could never migrate to a 
different host. Who designs it like that?




> 
> Could you explain? I have just deployed Ceph CSI just like the docs
> specified. What mode is it running in if not container mode?
> 
> 
> On Tue, Oct 25, 2022 at 10:56 AM Marc wrote:
> 
> 
>   Wtf, unbelievable that it is still like this. You can't fix it, I
>   had to fork and patch it because these @#$@#$@ ignored it. I don't know
>   much about kubernetes, I am running mesos. Can't you set/configure
>   kubernetes to launch the driver in a container mode?
> 
> 
>   >
>   > How should we fix it? Should we remove the directory and add back
>   > the keyring file?
>   >
>   >
>   > On Tue, Oct 25, 2022 at 9:45 AM Martin Johansen wrote:
>   >
>   >
>   >   Yes, we are using the ceph-csi driver in a kubernetes cluster.
>   >   Is it that that is causing this?
>   >
>   >   Best Regards,
>   >
>   >
>   >   Martin Johansen
>   >
>   >
>   >   On Tue, Oct 25, 2022 at 9:44 AM Marc wrote:
>   >
>   >
>   >   >
>   >   > 1) Why does ceph delete /etc/ceph/ceph.client.admin.keyring
>   >   > several times a day?
>   >   >
>   >   > 2) Why was it turned into a directory? It contains one file
>   >   > "ceph.client.admin.keyring.new". This then causes an error
>   >   > in the ceph logs when ceph tries to remove the file: "rm:
>   >   > cannot remove '/etc/ceph/ceph.client.admin.keyring': Is a
>   >   > directory".
>   >   >
>   >
>   >   Are you using the ceph-csi driver? The ceph csi people just
>   >   delete your existing ceph files and mount your root fs when you
>   >   are not running the driver in a container. They seem to think
>   >   that checking for files and validating parameters is not
>   >   necessary.
>   >
> 
> 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: post-mortem of a ceph disruption

2022-10-26 Thread Stefan Kooman

On 10/25/22 17:08, Simon Oosthoek wrote:



At this point, one of us noticed that a strange IP address was mentioned: 
169.254.0.2. It turns out that a recently added package (openmanage) and 
some configuration had added this interface and address to the hardware 
nodes from Dell. For us, our single-interface assumption is now out the 
window, and 0.0.0.0/0 is a bad idea in /etc/ceph/ceph.conf for public and 
cluster network (though it's the same network for us).


Our 3 datacenters are on three different subnets so it becomes a bit 
difficult to make it more specific. The nodes are all under the same 
/16, so we can choose that, but it is starting to look like a weird 
network setup.
I've always thought that this configuration was kind of non-intuitive 
and I still do. And now it has bitten us :-(



Thanks for reading and if you have any suggestions on how to fix/prevent 
this kind of error, we'll be glad to hear it!


We don't have the public_network specified in our cluster(s). AFAIK it's 
not needed (anymore); there is no default network address range 
configured. So I would just get rid of it. Same for cluster_network, if 
you have that configured. There, I fixed it! ;-)


If you don't use IPv6, I would explicitly turn it off:

ms_bind_ipv6 = false

The Ceph daemons and clients need to know which monitors there are and 
what their addresses are: that is what is important (mon_host). IPs of OSDs 
are available in the osd map, IPs of MDSs in the mds map, etc., and the 
clients will request those from the monitors when needed.
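
On a cluster that keeps its settings in the centralized config store, 
dropping the ranges and disabling IPv6 binding could look roughly like this 
(just a sketch, not tailored to your setup; if you manage everything through 
ceph.conf, remove/add the corresponding lines there instead):

ceph config rm global public_network
ceph config rm global cluster_network
ceph config set global ms_bind_ipv6 false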


If you want to hardcode the IP each daemon has to listen on, that is 
possible. You can create daemon-specific entries for the IP they have to 
bind to (public_bind_addr IIRC).


Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs ha mount expectations

2022-10-26 Thread William Edwards

> On 26 Oct 2022 at 10:11, mj wrote the following:
> 
> Hi!
> 
> We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and would like 
> to see our expectations confirmed (or denied) here. :-)
> 
> Suppose we build a three-node cluster, three monitors, three MDSs, etc, in 
> order to export a cephfs to multiple client nodes.
> 
> On the (RHEL8) clients (web application servers) fstab, we will mount the 
> cephfs like:
> 
>> ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph 
>> name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2
> 
> We expect that the RHEL clients will then be able to use (read/write) a 
> shared /mnt/ha-pool directory simultaneously.
> 
> Our question: how HA can we expect this setup to be? Looking for some 
> practical experience here.
> 
> Specific: Can we reboot any of the three involved ceph servers without the 
> clients noticing anything? Or will there be certain timeouts involved, during 
> which /mnt/ha-pool/ will appear unresponsive, and *after* a timeout the client 
> switches monitor node, and /mnt/ha-pool/ will respond again?

Monitor failovers don’t cause a noticeable disruption IIRC.

MDS failovers do. The MDS needs to replay. You can minimise the effect with 
mds_standby_replay.
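
On recent releases this is a per-filesystem flag; a minimal sketch of 
enabling it, assuming the filesystem is simply named cephfs:

ceph fs set cephfs allow_standby_replay true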

> 
> Of course we hope the answer is: in such a setup, cephfs clients should not 
> notice a reboot at all. :-)
> 
> All the best!
> 
> MJ
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [Ceph Grafana deployment] - error on Ceph Quincy

2022-10-26 Thread Lokendra Rathour
Hi Team,
Facing an issue while installing Grafana and related containers while
deploying ceph-ansible.

Error:
t_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit
status from master 0\r\n")
fatal: [storagenode1]: FAILED! => changed=false
  invocation:
module_args:
  daemon_reexec: false
  daemon_reload: false
  enabled: true
  force: null
  masked: null
  name: docker
  no_block: false
  scope: system
  state: started
  msg: 'Could not find the requested service docker: host'


Configs:


dashboard_enabled: True

grafana_container_image: "reposerver.com:5000/grafana_8.3.5:latest"
#node_exporter_container_image: "reposerver.com:5000/node-exporter_1.3.1:latest"
alertmanager_container_image: "reposerver.com:5000/alertmanager_0.23.0:latest"
prometheus_container_image: "reposerver.com:5000/prometheus_2.33.4:latest"

dashboard_protocol: https
dashboard_port: 8443
dashboard_admin_user: admin
dashboard_admin_password: P@ssw0rd321
grafana_admin_user: admin
grafana_admin_password: P@ssw0rd321
grafana_server_fqdn: storagenode1
grafana_server_group_name: grafana-server

Any idea about the error? Any input will be of great help.


-- 
~ Lokendra
skype: lokendrarathour
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs ha mount expectations

2022-10-26 Thread Robert Gallop
I use this very setup, with a few more servers. I have no outage windows for my
ceph deployments, as they support several production environments.

MDS is your focus: there are many knobs, but the MDS is the key to the client
experience. In my environment, MDS failover takes 30-180 seconds,
depending on how much replay and rejoin needs to take place. During this
failover, I/O on the client is paused, but not broken. If you were to do an
ls at the time of failover, it may not return for a couple of minutes in the
worst case. If a file transfer is ongoing, it will stop writing for this
failover time, but both will complete after the failover.
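
A simple way to watch where a failover is in that replay/rejoin cycle is 
something like the following (a sketch, assuming you can run the admin CLI 
somewhere):

ceph fs status
ceph health detail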

If I have MDS issues and the failover for whatever reason takes > 5 min, my
clients are lost. I must reboot all clients tied to that MDS to recover,
due to thousands of open files in various states. This is obviously a major
impact, but as we learn ceph it happens less frequently; it has happened only
3 times in the first year of operation.

It’s awesome tech, and I look forward to future enhancements in general.

On Wed, Oct 26, 2022 at 3:41 AM William Edwards 
wrote:

>
> > On 26 Oct 2022 at 10:11, mj wrote the following:
> >
> > Hi!
> >
> > We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and
> would like to see our expectations confirmed (or denied) here. :-)
> >
> > Suppose we build a three-node cluster, three monitors, three MDSs, etc,
> in order to export a cephfs to multiple client nodes.
> >
> > On the (RHEL8) clients (web application servers) fstab, we will mount
> the cephfs like:
> >
> >> ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph
> name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2
> >
> > We expect that the RHEL clients will then be able to use (read/write) a
> shared /mnt/ha-pool directory simultaneously.
> >
> > Our question: how HA can we expect this setup to be? Looking for some
> practical experience here.
> >
> > Specific: Can we reboot any of the three involved ceph servers without
> the clients noticing anything? Or will there be certain timeouts involved,
> during which /mnt/ha-pool/ will appear unresponsive, and *after* a timeout
> the client switches monitor node, and /mnt/ha-pool/ will respond again?
>
> Monitor failovers don’t cause a noticeable disruption IIRC.
>
> MDS failovers do. The MDS needs to replay. You can minimise the effect
> with mds_standby_replay.
>
> >
> > Of course we hope the answer is: in such a setup, cephfs clients should
> not notice a reboot at all. :-)
> >
> > All the best!
> >
> > MJ
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph User + Dev Monthly Meeting coming up this Thursday

2022-10-26 Thread Stefan Kooman

On 10/19/22 00:20, Laura Flores wrote:

Hello Ceph users,

The *Ceph User + Dev Monthly Meeting* is happening this *Thursday, October
20th @ 2:00 pm UTC.* The meeting will be on this link:
https://meet.jit.si/ceph-user-dev-monthly. Please feel free to add any
topics you'd like to discuss to the monthly minutes etherpad!
https://pad.ceph.com/p/ceph-user-dev-monthly-minutes


Topics

* Status of the note "Separate PR will be issued to remove/update the 
malformed SnapMapper keys" (https://tracker.ceph.com/issues/56147).
Any additional actions required after keys are removed/updated to 
reclaim space?


* "Use Cases for Ceph" survey

* 16.2.11 status


As far as I know there are no minutes from this meeting. Have all of 
these points been discussed? If so, do you (or somebody else who 
attended) still know what the outcome was? I would really like to know 
when 16.2.11 will be released.


Thanks,

Gr. Stefan

P.s. Would it be a good idea to start making minutes and sending them to 
this list (same as the Ceph Leadership Team minutes)?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: post-mortem of a ceph disruption

2022-10-26 Thread Simon Oosthoek

On 26/10/2022 10:57, Stefan Kooman wrote:

On 10/25/22 17:08, Simon Oosthoek wrote:



At this point, one of us noticed that a strange IP address was mentioned; 
169.254.0.2, it turns out that a recently added package (openmanage) 
and some configuration had added this interface and address to 
hardware nodes from Dell. For us, our single interface assumption is 
now out the window and 0.0.0.0/0 is a bad idea in /etc/ceph/ceph.conf 
for public and cluster network (though it's the same network for us).


Our 3 datacenters are on three different subnets so it becomes a bit 
difficult to make it more specific. The nodes are all under the same 
/16, so we can choose that, but it is starting to look like a weird 
network setup.
I've always thought that this configuration was kind of non-intuitive 
and I still do. And now it has bitten us :-(



Thanks for reading and if you have any suggestions on how to 
fix/prevent this kind of error, we'll be glad to hear it!


We don't have the public_network specified in our cluster(s). AFAIK It's 
not needed (anymore). There is no default network address range 
configured. So I would just get rid of it. Same for cluster_network if 
you have that configured. There I fixed it! ;-).


Hi Stefan

thanks for the suggestions!

I've removed the cluster_network definition, but retained the 
public_network definition in a more specific way (a list of the subnets 
that we are using for ceph nodes). From the code it isn't entirely clear 
to us what happens when public_network is undefined...
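
For the record, the resulting line is shaped roughly like this (the subnets 
below are placeholders, not our real ranges):

public_network = 10.10.1.0/24, 10.10.2.0/24, 10.10.3.0/24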




If you don't use IPv6, I would explicitly turn it off:

ms_bind_ipv6 = false


I just added this; it seems like a no-brainer.

Cheers

/Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs ha mount expectations

2022-10-26 Thread Eugen Block
Just one comment on the standby-replay setting: it really depends on 
the use case, and it can make things worse during failover. Just recently 
we had a customer where disabling standby-replay made failovers even 
faster and cleaner in a heavily used cluster. With standby-replay they 
had to manually clean things up in the mounted directory. So I would 
recommend testing both options.
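
Toggling it for such a test is cheap; a sketch, assuming a filesystem named 
cephfs:

ceph fs set cephfs allow_standby_replay false
ceph fs set cephfs allow_standby_replay true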


Quoting William Edwards:

On 26 Oct 2022 at 10:11, mj wrote the following:


Hi!

We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and  
would like to see our expectations confirmed (or denied) here. :-)


Suppose we build a three-node cluster, three monitors, three MDSs,  
etc, in order to export a cephfs to multiple client nodes.


On the (RHEL8) clients (web application servers) fstab, we will  
mount the cephfs like:


ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph 
name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2


We expect that the RHEL clients will then be able to use  
(read/write) a shared /mnt/ha-pool directory simultaneously.


Our question: how HA can we expect this setup to be? Looking for  
some practical experience here.


Specific: Can we reboot any of the three involved ceph servers  
without the clients noticing anything? Or will there be certain  
timeouts involved, during which /mnt/ha-pool/ will appear  
unresponsive, and *after* a timeout the client switches monitor 
node, and /mnt/ha-pool/ will respond again?


Monitor failovers don’t cause a noticeable disruption IIRC.

MDS failovers do. The MDS needs to replay. You can minimise the  
effect with mds_standby_replay.




Of course we hope the answer is: in such a setup, cephfs clients  
should not notice a reboot at all. :-)


All the best!

MJ
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] how to upgrade host os under ceph

2022-10-26 Thread Simon Oosthoek

Dear list,

I'm looking for a guide or pointers on how people upgrade the 
underlying host OS in a ceph cluster (if that is even the right way to 
proceed, I don't know...)


Our cluster is nearing 4.5 years of age and our Ubuntu 18.04 is now 
nearing its end-of-support date. We have a mixed cluster of u18 and u20 
nodes, all running octopus at the moment.


We would like to upgrade the OS on the nodes, without changing the ceph 
version for now (or per se).


Is it as easy as installing a new OS version, installing the ceph-osd 
package and a correct ceph.conf file and restoring the host key?


Or is more needed regarding the specifics of the OSD disks/WAL/journal?

Or is it necessary to drain a node of all data and re-add the OSDs as 
new units? (This would be too much work, so I doubt it ;-)


The problem with searching for information about this is that it seems 
undocumented in the ceph documentation, and search results are flooded 
with ceph version upgrades.


Cheers

/Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph Leadership Team Meeting Minutes - 2022 Oct 26

2022-10-26 Thread Casey Bodley
lab issues blocking centos container builds and teuthology testing:
* https://tracker.ceph.com/issues/57914
* delays testing for 16.2.11

upcoming events:
* Ceph Developer Monthly (APAC) next week, please add topics:
https://tracker.ceph.com/projects/ceph/wiki/CDM_02-NOV-2022
* Ceph Virtual 2022 starts next Thursday:
https://ceph.io/en/community/events/2022/ceph-virtual/

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to upgrade host os under ceph

2022-10-26 Thread Mark Schouten

Hi Simon,

You can just dist-upgrade the underlying OS. Assuming that you installed 
the packages from https://download.ceph.com/debian-octopus/, just change 
bionic to focal in all apt-sources, and dist-upgrade away.
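
Roughly something like this (a sketch; which list files you need to touch 
depends on how the repositories were set up):

sed -i 's/bionic/focal/g' /etc/apt/sources.list /etc/apt/sources.list.d/*.list
apt update && apt dist-upgrade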


—
Mark Schouten, CTO
Tuxis B.V.
m...@tuxis.nl


-- Original Message --
From "Simon Oosthoek" 
To "ceph-users@ceph.io" 
Date 26/10/2022 16:14:28
Subject [ceph-users] how to upgrade host os under ceph


Dear list,

I'm looking for some guide or pointers to how people upgrade the underlying 
host OS in a ceph cluster (if this is the right way to proceed, I don't even 
know...)

Our cluster is nearing the 4.5 years of age and now our ubuntu 18.04 is nearing 
the end of support date. We have a mixed cluster of u18 and u20 nodes, all 
running octopus at the moment.

We would like to upgrade the OS on the nodes, without changing the ceph version 
for now (or per se).

Is it as easy as installing a new OS version, installing the ceph-osd package 
and a correct ceph.conf file and restoring the host key?

Or is more needed regarding the specifics of the OSD disks/WAL/journal?

Or is it necessary to drain a node of all data and re-add the OSDs as new 
units? (This would be too much work, so I doubt it ;-)

The problem with searching for information about this, is that it seems 
undocumented in the ceph documentation, and search results are flooded with 
ceph version upgrades.

Cheers

/Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph User + Dev Monthly Meeting coming up this Thursday

2022-10-26 Thread Laura Flores
Hi Gr. Stefan,

I'll reply to the whole list in case anyone else has the same question.
Regarding 16.2.11, there is currently no ETA since we are experiencing some
issues in our testing lab. As soon as the testing lab is fixed, which is
the main priority at the moment, we plan to resume getting the last patches
in for Pacific.

Thanks,
Laura Flores

On Wed, Oct 26, 2022 at 6:47 AM Stefan Kooman  wrote:

> On 10/19/22 00:20, Laura Flores wrote:
> > Hello Ceph users,
> >
> > The *Ceph User + Dev Monthly Meeting* is happening this *Thursday,
> October
> > 20th @ 2:00 pm UTC.* The meeting will be on this link:
> > https://meet.jit.si/ceph-user-dev-monthly. Please feel free to add any
> > topics you'd like to discuss to the monthly minutes etherpad!
> > https://pad.ceph.com/p/ceph-user-dev-monthly-minutes
>
> Topics
>
> * Status of note "Separate PR will be issued to remove/update the
> malformed SnapMapper keys. (https://tracker.ceph.com/issues/56147).
> Any additional actions required after keys are removed/updated to
> reclaim space?
>
> * "Use Cases for Ceph" survey
>
> * 16.2.11 status
>
>
> As far as I know there are no minutes from this meeting. Have all of
> these points been discussed? If so, do you (or somebody else who
> attended) still know what was the outcome? I would really like to know
> when 16.2.11 will be released?
>
> Thanks,
>
> Gr. Stefan
>
> P.s. Would it be a good idea to start making minutes and sending them to
> this list (same like Ceph leadership minutes)?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage

Red Hat Inc. 

Chicago, IL

lflo...@redhat.com
M: +17087388804


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to upgrade host os under ceph

2022-10-26 Thread Reed Dier
You should be able to `do-release-upgrade` from bionic/18 to focal/20.

Octopus/15 is shipped for both dists from ceph.
It's been a while since I did this; the release upgrader might disable the ceph 
repo and uninstall the ceph* packages.
However, the OSDs should still be there: re-enable the ceph repo, install 
ceph-osd, and then `ceph-volume lvm activate --all` should find and start all of 
the OSDs.
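
In other words, roughly this on a package-based (non-cephadm) octopus node 
(a sketch from memory; double-check the repo line and paths for your setup):

echo "deb https://download.ceph.com/debian-octopus/ focal main" > /etc/apt/sources.list.d/ceph.list
apt update && apt install -y ceph-osd
ceph-volume lvm activate --all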

Caveat: if you're using cephadm, I'm sure the process is different.
And also, if you're trying to go to jammy/22, that's a different story, because 
ceph isn't shipping packages for jammy yet for any version of ceph.
I assume that they are going to ship quincy for jammy at some point, which will 
give a stepping stone from focal to jammy with the quincy release, because I 
don't imagine that there will be a reef release for focal.

Reed

> On Oct 26, 2022, at 9:14 AM, Simon Oosthoek  wrote:
> 
> Dear list,
> 
> I'm looking for some guide or pointers to how people upgrade the underlying 
> host OS in a ceph cluster (if this is the right way to proceed, I don't even 
> know...)
> 
> Our cluster is nearing the 4.5 years of age and now our ubuntu 18.04 is 
> nearing the end of support date. We have a mixed cluster of u18 and u20 
> nodes, all running octopus at the moment.
> 
> We would like to upgrade the OS on the nodes, without changing the ceph 
> version for now (or per se).
> 
> Is it as easy as installing a new OS version, installing the ceph-osd package 
> and a correct ceph.conf file and restoring the host key?
> 
> Or is more needed regarding the specifics of the OSD disks/WAL/journal?
> 
> Or is it necessary to drain a node of all data and re-add the OSDs as new 
> units? (This would be too much work, so I doubt it ;-)
> 
> The problem with searching for information about this, is that it seems 
> undocumented in the ceph documentation, and search results are flooded with 
> ceph version upgrades.
> 
> Cheers
> 
> /Simon
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs ha mount expectations

2022-10-26 Thread mj

Hi all,

Thanks for the interesting discussion. Actually, it's a bit disappointing 
to see that even cephfs with multiple MDS servers is not as HA as we 
would like.


I also read that failover time depends on the number of clients. We will 
only have three, and they will not do heavy IO. So that should perhaps 
help a bit.


Is there any difference between an 'uncontrolled' ceph server 
(accidental) reboot and a controlled reboot, where we (for example) 
first fail over the MDS in a controlled, gentle way?

MJ

On 26-10-2022 at 14:40, Eugen Block wrote:
Just one comment on the standby-replay setting: it really depends on the 
use-case, it can make things worse during failover. Just recently we had 
a customer where disabling standby-replay made failovers even faster and 
cleaner in a heavily used cluster. With standby-replay they had to 
manually clean things up in the mounted directory. So I would recommend 
to test both options.


Quoting William Edwards:

On 26 Oct 2022 at 10:11, mj wrote the following:


Hi!

We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and 
would like to see our expectations confirmed (or denied) here. :-)


Suppose we build a three-node cluster, three monitors, three MDSs, 
etc, in order to export a cephfs to multiple client nodes.


On the (RHEL8) clients (web application servers) fstab, we will mount 
the cephfs like:


ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph 
name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2


We expect that the RHEL clients will then be able to use (read/write) 
a shared /mnt/ha-pool directory simultaneously.


Our question: how HA can we expect this setup to be? Looking for some 
practical experience here.


Specific: Can we reboot any of the three involved ceph servers 
without the clients noticing anything? Or will there be certain 
timeouts involved, during which /mnt/ha-pool/ will appear 
unresponsive, and *after* a timeout the client switches monitor node, 
and /mnt/ha-pool/ will respond again?


Monitor failovers don’t cause a noticeable disruption IIRC.

MDS failovers do. The MDS needs to replay. You can minimise the effect 
with mds_standby_replay.




Of course we hope the answer is: in such a setup, cephfs clients 
should not notice a reboot at all. :-)


All the best!

MJ
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph-volume claiming wrong device

2022-10-26 Thread Oleksiy Stashok
Hey guys,

I ran into a weird issue; hope you can explain what I'm observing. I'm
testing *Ceph 16.2.10* on *Ubuntu 20.04* in *Google Cloud VMs*. I created 3
instances and attached 4 persistent SSD disks to each instance. I can see
these disks attached as the `/dev/sdb, /dev/sdc, /dev/sdd, /dev/sde` devices.

As a next step I used ceph-ansible to bootstrap the ceph cluster on 3
instances, however I intentionally skipped OSD setup. So I ended up with a
Ceph cluster w/o any OSD.

I ssh'ed into each VM and ran:

```
  sudo -s
  for dev in sdb sdc sdd sde; do
/usr/sbin/ceph-volume --cluster ceph lvm create --bluestore
--dmcrypt --data "/dev/$dev"
  done
```

The operation above randomly fails on random instances/devices with
something like:
```
bluefs _replay 0x0: stop: uuid e2f72ec9-2747-82d7-c7f8-41b7b6d41e1b !=
super.uuid 0110ddb3-d4bf-4c1e-be11-654598c71db0
```

The interesting thing is that when I do
```
/usr/sbin/ceph-volume lvm ls
```

I can see that the device for which OSD creation failed actually belongs to
a different OSD that was previously created for a different device. For
example the failure I mentioned above happened on the `/dev/sde` device, so
when I list lvms I see this:
```
== osd.2 ===

  [block]
/dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb

  block device
 
/dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb
  block uuidFfFnLt-h33F-F73V-tY45-VuZM-scj7-C3dg1K
  cephx lockbox secret  AQAlelljqNPoMhAA59JwN3wGt0d6Si+nsnxsRQ==
  cluster fsid  348fff8e-e850-4774-9694-05d5414b1c53
  cluster name  ceph
  crush device class
  encrypted 1
  osd fsid  9af542ba-fd65-4355-ad17-7293856acaeb
  osd id2
  osdspec affinity
  type  block
  vdo   0
  devices   /dev/sdd

  [block]
/dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64

  block device
 
/dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
  block uuidGEajK3-Tsyf-XZS9-E5ik-M1BB-VIpb-q7D1ET
  cephx lockbox secret  AQAwelljFw2nJBAApuMs2WE0TT+7c1TGa4xQzg==
  cluster fsid  348fff8e-e850-4774-9694-05d5414b1c53
  cluster name  ceph
  crush device class
  encrypted 1
  osd fsid  4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
  osd id2
  osdspec affinity
  type  block
  vdo   0
  devices   /dev/sde
```

How did it happen that `/dev/sde` was claimed by osd.2?

Thank you!
Oleksiy
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to upgrade host os under ceph

2022-10-26 Thread shubjero
We've done 14.04 -> 16.04 -> 18.04 -> 20.04 all at various stages of our
ceph cluster life.

The latest 18.04 to 20.04 was painless and we ran:
apt update && apt dist-upgrade -y -o Dpkg::Options::=\"--force-confdef\" -o
Dpkg::Options::=\"--force-confold\"
do-release-upgrade --allow-third-party -f DistUpgradeViewNonInteractive


On Wed, Oct 26, 2022 at 11:17 AM Reed Dier  wrote:

> You should be able to `do-release-upgrade` from bionic/18 to focal/20.
>
> Octopus/15 is shipped for both dists from ceph.
> Its been a while since I did this, the release upgrader might disable the
> ceph repo, and uninstall the ceph* packages.
> However, the OSDs should still be there, re-enable the ceph repo, install
> ceph-osd, and then `ceph-volume lvm activate —all` should find and start
> all of the OSDs.
>
> Caveat, if you’re using cephadm, I’m sure the process is different.
> And also, if you’re trying to go to jammy/22, thats a different story,
> because ceph isn’t shipping packages for jammy yet for any version of ceph.
> I assume that they are going to ship quincy for jammy at some point, which
> will give a stepping stone from focal to jammy with the quincy release,
> because I don’t imagine that there will be a reef release for focal.
>
> Reed
>
> > On Oct 26, 2022, at 9:14 AM, Simon Oosthoek 
> wrote:
> >
> > Dear list,
> >
> > I'm looking for some guide or pointers to how people upgrade the
> underlying host OS in a ceph cluster (if this is the right way to proceed,
> I don't even know...)
> >
> > Our cluster is nearing the 4.5 years of age and now our ubuntu 18.04 is
> nearing the end of support date. We have a mixed cluster of u18 and u20
> nodes, all running octopus at the moment.
> >
> > We would like to upgrade the OS on the nodes, without changing the ceph
> version for now (or per se).
> >
> > Is it as easy as installing a new OS version, installing the ceph-osd
> package and a correct ceph.conf file and restoring the host key?
> >
> > Or is more needed regarding the specifics of the OSD disks/WAL/journal?
> >
> > Or is it necessary to drain a node of all data and re-add the OSDs as
> new units? (This would be too much work, so I doubt it ;-)
> >
> > The problem with searching for information about this, is that it seems
> undocumented in the ceph documentation, and search results are flooded with
> ceph version upgrades.
> >
> > Cheers
> >
> > /Simon
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: A question about rgw.otp pool

2022-10-26 Thread Satoru Takeuchi
Hi,

2022年10月24日(月) 11:22 Satoru Takeuchi :
...
> Could you tell me how to fix this problem and what the `...rgw.otp` pool is.

I understood that the "...rgw.otp" pool is for MFA. In addition, I
consider this behavior a bug and opened a new issue:

pg autoscaler of rgw pools doesn't work after creating otp pool
https://tracker.ceph.com/issues/57937

Thanks,
Satoru
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to upgrade host os under ceph

2022-10-26 Thread Stefan Kooman

On 10/26/22 16:14, Simon Oosthoek wrote:

Dear list,

I'm looking for some guide or pointers to how people upgrade the 
underlying host OS in a ceph cluster (if this is the right way to 
proceed, I don't even know...)


Our cluster is nearing the 4.5 years of age and now our ubuntu 18.04 is 
nearing the end of support date. We have a mixed cluster of u18 and u20 
nodes, all running octopus at the moment.


We would like to upgrade the OS on the nodes, without changing the ceph 
version for now (or per se).


You can upgrade, or do a re-install. If you want to start fresh while 
keeping all OSD / MON data intact, that's easily possible. OSDs can be 
activated with the ceph-volume command. The mon store on the monitors 
should be preserved (make a backup). If it's on a separate disk 
(recommended) then you should leave it alone. Back up the /etc/ceph/ 
directory so that after a re-install you can restore it quickly and easily.
It's good to know how to recover from an OS failure in case disk(s) die 
anyway. We have an iPXE setup for (re-)installations, and if the node 
has an NVMe OS disk it's bootstrapped in a couple of minutes. Our 
current record for re-installing / reprovisioning a node (incl. recovery) 
is under 30 minutes.
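
The bare-minimum sequence after the re-install could look like this (a 
sketch; it assumes the OSD data disks were untouched and that the /etc/ceph 
backup lives at a hypothetical /root/ceph-etc-backup):

cp -a /root/ceph-etc-backup/. /etc/ceph/
apt install -y ceph-osd
ceph-volume lvm activate --all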


Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io