The switch needs an update and has to be restarted (expected downtime: about 2 minutes).
Can I just leave the cluster as it is and trust that Ceph will handle this correctly?
Or should I, for example, pause some of the VMs I am running, or even stop them?
What happens to the monitors? Can they handle this, or mayb
Hi!
I have a Ceph cluster on version 16.2.7 with this error:
root@s-26-9-19-mon-m1:~# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
daemon osd.91 on s-26-8-2-1 is in error state
But I don't have that osd anymore. I deleted it.
r
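In case it helps anyone hitting the same warning, a minimal sketch of how the stale record can be inspected and cleared with cephadm; osd.91 and the host name are taken from the output above, and the forced removal assumes the daemon entry really is just a leftover:
# list what cephadm still thinks runs on that host
ceph orch ps s-26-8-2-1 --daemon-type osd
# if osd.91 still shows up in error state, drop the leftover daemon record
ceph orch daemon rm osd.91 --force
# re-check afterwards
ceph health detail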
Hello,
I've just heard about storage classes and imagined how we could use them
to migrate all S3 objects within a placement pool from an ec pool to a
replicated pool (or vice-versa) for data resiliency reasons, not to save
space.
It looks possible since:
1. data pools are associated with st
If you can stop the VMs, it will help. Even if the cluster recovers
quickly, VMs take great offense if a write does not finish within
120s, and many will put their filesystems into read-only mode if writes are
delayed for that long. So if there is a 120s outage of I/O, the VMs will
be stuck/useless anyhow, so you mi
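As a guest-side sketch of the behaviour described above, these are the kinds of checks one might run inside a VM after such an outage (the 120 s figure is the kernel's default hung-task warning threshold; the remount example assumes an ext4 root that went read-only):
# look for hung-task warnings caused by the stalled writes
dmesg | grep -i "blocked for more than 120 seconds"
# check whether any filesystem was remounted read-only
grep " ro," /proc/mounts
# once the cluster is healthy again, a read-only root can often be remounted
mount -o remount,rw /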
Hi Benjamin,
Apologies that I can't help for the bluestore issue.
But that huge 100GB OSD consumption could be related to similar
reports linked here: https://tracker.ceph.com/issues/53729
Does your cluster have the pglog_hardlimit set?
# ceph osd dump | grep pglog
flags sortbitwise,recovery_de
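For reference, a sketch of what to look for and how the flag is usually enabled, per the upstream release notes (only set it once all OSDs run a release that supports it):
# the flag should appear in the osdmap flags line, e.g.
#   flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
ceph osd dump | grep pglog
# if it is missing, it can be set cluster-wide
ceph osd set pglog_hardlimit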
Hello Michal,
With cephfs and a single filesystem shared across multiple k8s clusters,
you should use subvolumegroups to limit data exposure. You'll find an
example of how to use subvolumegroups in the ceph-csi-cephfs helm chart
[1]. Essentially you just have to set the subvolumeGroup to whatever
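A small sketch of the Ceph-side part, assuming a filesystem named cephfs and one group per k8s cluster (the group names are made up; the CSI side is then pointed at the group via the subvolumeGroup value in the helm chart from [1]):
# create one subvolume group per kubernetes cluster
ceph fs subvolumegroup create cephfs csi-cluster-a
ceph fs subvolumegroup create cephfs csi-cluster-b
# verify
ceph fs subvolumegroup ls cephfs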
On 25/01/2022 at 12:09, Frédéric Nass wrote:
Hello Michal,
With cephfs and a single filesystem shared across multiple k8s
clusters, you should use subvolumegroups to limit data exposure. You'll
find an example of how to use subvolumegroups in the ceph-csi-cephfs
helm chart [1]. Essentially yo
I would still set noout on the relevant parts of the cluster in case something
goes south and it does take longer than 2 minutes. Otherwise OSDs will
start getting marked out after 10 minutes or so by default, and then you have
a lot of churn going on.
The monitors will be fine unless you lose
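A sketch of the flag handling around such a maintenance window (host names are examples; plain noout on the whole cluster is often enough for a short outage):
# before the switch reboot: keep OSDs from being marked out
ceph osd set noout
# or limit it to the hosts behind that switch
ceph osd set-group noout host1 host2
# after the network is back and the cluster is healthy again
ceph osd unset noout
ceph osd unset-group noout host1 host2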
Thanks,
I had another review of the configuration and it appears that the
configuration *is* properly propagated to the daemon (also visible in
my second link).
I traced down my issues further and it looks like I have first tripped
over the following issue again...
https://tracker.ceph.com/issue
On Tue, Jan 25, 2022 at 4:49 AM Frédéric Nass
wrote:
>
> Hello,
>
> I've just heard about storage classes and imagined how we could use them
> to migrate all S3 objects within a placement pool from an ec pool to a
> replicated pool (or vice-versa) for data resiliency reasons, not to save
> space.
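For context, defining an additional storage class under an existing placement target looks roughly like this; a sketch assuming the default zonegroup/zone and a hypothetical replicated data pool, with objects then directed to it via the S3 x-amz-storage-class header or a lifecycle transition:
# declare the new storage class in the zonegroup placement target
radosgw-admin zonegroup placement add \
    --rgw-zonegroup default \
    --placement-id default-placement \
    --storage-class REPLICATED
# point it at the replicated data pool in the zone
radosgw-admin zone placement add \
    --rgw-zone default \
    --placement-id default-placement \
    --storage-class REPLICATED \
    --data-pool default.rgw.buckets.replicated
# in multisite setups, commit the period so the gateways pick it up
radosgw-admin period update --commit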
Hey all,
Sorry for the late notice. We will be having a Ceph science/research/big
cluster call on Wednesday January 26th. If anyone wants to discuss
something specific they can add it to the pad linked below. If you have
questions or comments you can contact me.
This is an informal open call
Hello team,
I would like to monitor my Ceph cluster using one of the
monitoring tools; does anyone have advice on that?
Michel
On Tue, Jan 25, 2022 at 4:07 PM Frank Schilder wrote:
>
> Hi Dan,
>
> in several threads I have now seen statements like "Does your cluster have
> the pglog_hardlimit set?". In this context, I would be grateful if you could
> shed some light on the following:
>
> 1) How do I check that?
>
> Ther
On 25/01/2022 at 14:48, Casey Bodley wrote:
On Tue, Jan 25, 2022 at 4:49 AM Frédéric Nass
wrote:
Hello,
I've just heard about storage classes and imagined how we could use them
to migrate all S3 objects within a placement pool from an ec pool to a
replicated pool (or vice-versa) for data re
On Tue, Jan 25, 2022 at 11:59 AM Frédéric Nass
wrote:
>
>
> > On 25/01/2022 at 14:48, Casey Bodley wrote:
> > On Tue, Jan 25, 2022 at 4:49 AM Frédéric Nass
> > wrote:
> >> Hello,
> >>
> >> I've just heard about storage classes and imagined how we could use them
> >> to migrate all S3 objects with
I would like to know that as well.
I have the same setup - cephadm, Pacific, CentOS 8, and a host with a number of
HDDs which are all connected by two paths.
There is no way to use these without multipath:
> ceph orch daemon add osd serverX:/dev/sdax
> Cannot update volume group ceph-51f8b9b0-2917-431d-8a6d-8f
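Not a definitive answer, but one workaround sometimes suggested is to build LVM on the multipath device by hand and pass the logical volume to the orchestrator instead of a raw /dev/sdX path. A sketch only: device, VG and LV names are made up, and it assumes ceph-volume accepts a pre-built LV here:
# on the host: PV/VG/LV on top of the multipath device
pvcreate /dev/mapper/mpatha
vgcreate ceph-mpatha /dev/mapper/mpatha
lvcreate -l 100%FREE -n osd-block ceph-mpatha
# then hand the LV to the orchestrator
ceph orch daemon add osd serverX:ceph-mpatha/osd-block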
On 25/01/2022 at 18:28, Casey Bodley wrote:
On Tue, Jan 25, 2022 at 11:59 AM Frédéric Nass
wrote:
On 25/01/2022 at 14:48, Casey Bodley wrote:
On Tue, Jan 25, 2022 at 4:49 AM Frédéric Nass
wrote:
Hello,
I've just heard about storage classes and imagined how we could use them
to migrate
Hi Jake,
Many thanks for contributing the data.
Indeed, our data scientists use the data from Backblaze too.
Have you found strong correlations between device health metrics (such as
reallocated sector count, or any combination of attributes) and read/write
errors in /var/log/messages from what
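In case it is useful for this kind of analysis: Ceph can collect SMART data per device itself, so those attributes can also be pulled straight from the cluster. A sketch, with <devid> as a placeholder:
# list devices known to the cluster and the daemons using them
ceph device ls
# dump the collected SMART/health metrics for one device
ceph device get-health-metrics <devid>
# make sure periodic scraping is enabled
ceph config get mgr mgr/devicehealth/enable_monitoring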
Thank you for your responses!
Since yesterday we found that several OSD pods still had memory limits set,
and in fact some of them (but far from all) were getting OOM killed, so we
have fully removed those limits again. Unfortunately this hasn't helped
much and there are still 50ish OSDs down. W
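One thing that may be worth cross-checking alongside the pod limits is the OSDs' own memory target; a sketch, where the 4 GiB value is just the usual default rather than a recommendation:
# what the OSDs currently aim for
ceph config get osd osd_memory_target
# optionally adjust it cluster-wide, e.g. to 4 GiB
ceph config set osd osd_memory_target 4294967296
# list only the OSDs that are currently down
ceph osd tree down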
Is there also (going to be) something available that works 'offline'?
Hey Igor,
thank you for your response!
>>
>> Do you suggest disabling the HDD write caching and/or the
>> bluefs_buffered_io for production clusters?
>>
> Generally the upstream recommendation is to disable disk write caching; there
> were multiple complaints that it might negatively impact the perf
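For reference, a sketch of how both settings mentioned above can be inspected and changed (sdX is a placeholder; whether doing so is a good idea is exactly what is being discussed here):
# check / disable the volatile write cache on a HDD
smartctl -g wcache /dev/sdX
hdparm -W 0 /dev/sdX
# check / change bluefs_buffered_io on the OSDs
ceph config get osd bluefs_buffered_io
ceph config set osd bluefs_buffered_io false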
Hello Ceph users,
I have a problem with scheduled snapshots on ceph 16.2.7 (in a Proxmox install).
While trying to understand how snap schedules work, I created more schedules
than I needed to:
root@vis-mgmt:~# ceph fs snap-schedule list /backups/nassie/NAS
/backups/nassie/NAS 1h 24h7d8w12m
/b
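For anyone in the same situation, surplus schedules can be listed and removed per path and interval. A sketch using the path from above; the 1h argument is just an example of one schedule to drop:
# show what is currently scheduled for the path
ceph fs snap-schedule list /backups/nassie/NAS
ceph fs snap-schedule status /backups/nassie/NAS
# remove one specific schedule, giving its repeat interval explicitly
ceph fs snap-schedule remove /backups/nassie/NAS 1h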
Thank you for your email, Szabo; these can be helpful. Can you provide
links so that I can start working on it?
Michel.
On Tue, 25 Jan 2022, 18:51 Szabo, Istvan (Agoda),
wrote:
> Which monitoring tool? Like prometheus or nagios style thing?
> We use sensu for keepalive and ceph health reporting + prom
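As a concrete starting point for the Prometheus route mentioned above, the built-in mgr module can be enabled and scraped directly; a sketch, where 9283 is the module's default port and the host name is a placeholder:
# enable the prometheus exporter in the manager
ceph mgr module enable prometheus
# the active mgr then serves metrics, by default on port 9283
curl http://<active-mgr-host>:9283/metrics | head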