[ceph-users] Re: Ceph remote disaster recovery at PB scale

2022-04-06 Thread Arthur Outhenin-Chalandre
Hi,

On 4/1/22 10:56, huxia...@horebdata.cn wrote:
> 1) Is RBD mirroring with petabytes of data doable or not? Are there any
> practical limits on the size of the total data?

The first thing that matters with RBD replication is the amount of
data you write: if you have a PB that mostly doesn't change, your
replication would be mostly idle...

That being said, the limits are mostly the reads that you can afford for
replication on the source cluster and the writes on your target
clusters. After that there is how much rbd-mirror can output, but
theoretically you can scale the number of rbd-mirror daemons if this is
a bottleneck.
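
If you want a rough idea of the write rate you would actually have to
replicate, something like this can help (a minimal sketch; if I remember
correctly the per-image stats rely on the rbd_support mgr module, and the pool
name below is only a placeholder):

```
# overall client I/O on the source cluster (see the "io:" section of the output)
ceph status

# per-image write rates for one pool ("volumes" is a placeholder name)
rbd perf image iostat volumes
```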

> 2) Should I use parallel rbd-mirror daemons to speed up the sync process?
> Or would a single daemon be sufficient?

That depends on the number of writes you have :). But yes, rbd-mirror
daemons essentially talk among themselves and distribute the work. Note that
this is not very smart work sharing: it only tries to balance the number of
RBD images each daemon handles. So you could technically end up with all
your really busy images on one daemon, for example.

Even if one would be sufficient for you, I would put at least two for
redundancy.
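
For reference, if the cluster is managed by cephadm (an assumption on my part,
your deployment tooling may differ), adding a second daemon is roughly a
one-liner:

```
# one rbd-mirror daemon on each of the two listed hosts (hostnames are placeholders)
ceph orch apply rbd-mirror --placement="host1 host2"
```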

> 3) What could be the lag time at the remote site? At most 1 minute or 10
> minutes?

It depends on the mode: with journaling, it's how many entries in the journal
you lag behind. With snapshots, it essentially depends on your interval between
snapshots and the time it takes to write the diff to the target cluster.
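
For reference, with snapshot-based mirroring that interval is driven by the
mirror snapshot schedule. A minimal sketch (Octopus or later; pool and image
names are placeholders):

```
rbd mirror pool enable mypool image                  # per-image mirroring for pool "mypool"
rbd mirror image enable mypool/myimage snapshot      # snapshot mode for one image
rbd mirror snapshot schedule add --pool mypool 30m   # take a mirror snapshot every 30 minutes
rbd mirror snapshot schedule ls --pool mypool --recursive
```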

In my setup, when I tested the journal mode I noticed significantly
slower replication than what the snapshot mode achieves. I would encourage
you to read this set of slides that I presented last year at Ceph Month
June: https://codimd.web.cern.ch/p/-qWD2Y0S9#/. Feel free to test the
journal mode in your setup and report back to the list though, it could
be very interesting!

Cheers,

-- 
Arthur Outhenin-Chalandre
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph remote disaster recovery at PB scale

2022-04-06 Thread Eugen Block

Hi,

I don't mean to hijack this thread, I'm just curious about the  
multiple mirror daemons statement. Last year you mentioned that  
multiple daemons only make sense if you have different pools to mirror  
[1], at least that's how I read it. You wrote:


[...] but actually you can have multiple rbd-mirror daemons per  
cluster. It's the number of peers that are limited to one remote  
peer per pool. So technically if using different pools, you should  
be able to have three clusters connected as long as you have only  
one remote peer per pool. I never tested it though...
[...] For further multi-peer support, I am currently working on  
adding support for it!


What's the current status on this? In the docs I find only a general  
statement that pre-Luminous you could only have one rbd-mirror daemon  
per cluster.


2) Should I use parallel rbd-mirror daemons to speed up the sync  
process? Or would a single daemon be sufficient?


That depends on the number of writes you have :). But yes, rbd-mirror
daemons essentially talk among themselves and distribute the work. Note that
this is not very smart work sharing: it only tries to balance the number of
RBD images each daemon handles. So you could technically end up with all
your really busy images on one daemon, for example.


This seems to contradict my previous understanding, so apparently you  
can have multiple daemons per cluster and they spread the load among  
themselves independently of the pools? I'd appreciate any clarification.


Thanks!
Eugen

[1] https://www.spinics.net/lists/ceph-users/msg68736.html

Zitat von Arthur Outhenin-Chalandre :


Hi,

On 4/1/22 10:56, huxia...@horebdata.cn wrote:
1) Rbd mirroring with Peta  bytes data is doable or not? are there  
any practical limits on the size of the total data?


The first thing that matters with RBD replication is the amount of
data you write: if you have a PB that mostly doesn't change, your
replication would be mostly idle...

That being said, the limits are mostly the reads that you can afford for
replication on the source cluster and the writes on your target
clusters. After that there is how much rbd-mirror can output, but
theoretically you can scale the number of rbd-mirror daemons if this is
a bottleneck.

2) Should I use parallel rbd-mirror daemons to speed up the sync  
process? Or would a single daemon be sufficient?


That depends on the number of writes you have :). But yes, rbd-mirror
daemons essentially talk among themselves and distribute the work. Note that
this is not very smart work sharing: it only tries to balance the number of
RBD images each daemon handles. So you could technically end up with all
your really busy images on one daemon, for example.

Even if one would be sufficient for you, I would put at least two for
redundancy.

3) What could be the lag time at the remote site? At most 1  
minute or 10 minutes?


It depends on the mode: with journaling, it's how many entries in the journal
you lag behind. With snapshots, it essentially depends on your interval between
snapshots and the time it takes to write the diff to the target cluster.

In my setup, when I tested the journal mode I noticed significantly
slower replication than what the snapshot mode achieves. I would encourage
you to read this set of slides that I presented last year at Ceph Month
June: https://codimd.web.cern.ch/p/-qWD2Y0S9#/. Feel free to test the
journal mode in your setup and report back to the list though, it could
be very interesting!

Cheers,

--
Arthur Outhenin-Chalandre
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RuntimeError on activate lvm

2022-04-06 Thread Eugen Block

Hi,

is there any specific reason why you do it manually instead of letting  
cephadm handle it? I might misremember, but I believe for the manual  
lvm activation to work you need to pass the '--no-systemd' flag.
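
For the example from the original message that would look roughly like this (a
sketch only; I'm assuming the command is run inside 'cephadm shell' on the OSD
host, where systemd isn't reachable anyway):

```
# inside 'cephadm shell' on the OSD node
ceph-volume lvm activate --no-systemd 0 25bfe96a-4f7a-47e1-8644-b74a4d104dbc
```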


Regards,
Eugen

Zitat von Dominique Ramaekers :


Hi,


I've set up a Ceph cluster using cephadm on three Ubuntu servers.  
Everything went great until I tried to activate an OSD prepared on an  
LVM volume.



I have prepared 4 volumes with the command:

ceph-volume lvm prepare --data vg/lv


Now I try to activate one of them with the command (followed by the output):

root@hvs001:/# ceph-volume lvm activate 0  
25bfe96a-4f7a-47e1-8644-b74a4d104dbc

Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph  
prime-osd-dir --dev /dev/hvs001_sda2/lvol0 --path  
/var/lib/ceph/osd/ceph-0 --no-mon-config
Running command: /usr/bin/ln -snf /dev/hvs001_sda2/lvol0  
/var/lib/ceph/osd/ceph-0/block

Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
Running command: /usr/bin/systemctl enable  
ceph-volume@lvm-0-25bfe96a-4f7a-47e1-8644-b74a4d104dbc
 stderr: Created symlink  
/etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-0-25bfe96a-4f7a-47e1-8644-b74a4d104dbc.service ->  
/usr/lib/systemd/system/ceph-volume@.service.

Running command: /usr/bin/systemctl enable --runtime ceph-osd@0
 stderr: Created symlink  
/run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service ->  
/usr/lib/systemd/system/ceph-osd@.service.

Running command: /usr/bin/systemctl start ceph-osd@0
 stderr: Failed to connect to bus: No such file or directory
-->  RuntimeError: command returned non-zero exit status: 1


Seems systemd isn't playing along?


Please advise.


Some additional background info:

root@hvs001:/# ceph status
  cluster:
id: dd4b0610-b4d2-11ec-bb58-d1b32ae31585
health: HEALTH_OK

  services:
mon: 3 daemons, quorum hvs001,hvs002,hvs003 (age 23m)
mgr: hvs001.baejuo(active, since 23m), standbys: hvs002.etijdk
osd: 4 osds: 0 up, 2 in (since 36m)

  data:
pools:   0 pools, 0 pgs
objects: 0 objects, 0 B
usage:   0 B used, 0 B / 0 B avail
pgs:


root@hvs001:/# ceph-volume lvm list


== osd.0 ===

  [block]   /dev/hvs001_sda2/lvol0

  block device  /dev/hvs001_sda2/lvol0
  block uuid6cEw8v-5xIA-K76l-7zIN-V2BK-RNWD-yGwfqp
  cephx lockbox secret
  cluster fsid  dd4b0610-b4d2-11ec-bb58-d1b32ae31585
  cluster name  ceph
  crush device class
  encrypted 0
  osd fsid  25bfe96a-4f7a-47e1-8644-b74a4d104dbc
  osd id0
  osdspec affinity
  type  block
  vdo   0
  devices   /dev/sda2

== osd.1 ===

  [block]   /dev/hvs001_sdb3/lvol1




Greetings,


Dominique.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RuntimeError on activate lvm

2022-04-06 Thread Dominique Ramaekers
Hi Eugen,

Thanks for the quick response! I'm probably doing things the more difficult 
(wrong) way ;-)

This is my first installation of a Ceph cluster. I'm setting up three servers 
for non-critical data and low I/O load.
I don't want to lose storage capacity by giving up the entire disk on 
which the OS is installed. The OS disk is about 900 GB and I've partitioned 50 GB 
for the OS. I want to use the remaining 850 GB as an OSD.

First I created a new partition of 850 GB and changed the type to 95 (Ceph 
OSD). Then I tried to add it to the cluster with 'ceph orch daemon add osd 
hvs002:/dev/sda3', but I got an error.

That's why I tried the lvm manual way.

I know using a partition next to the OS isn't best practice, but pointers to 
a 'better practice' than what I describe above would be greatly appreciated.
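
For reference, the manual LVM route I ended up trying looks roughly like this
(the VG/LV names are just the ones I picked):

```
# wrap the spare partition in LVM
pvcreate /dev/sda2
vgcreate hvs001_sda2 /dev/sda2
lvcreate -l 100%FREE -n lvol0 hvs001_sda2

# then hand the LV to ceph-volume (or to the orchestrator)
ceph-volume lvm prepare --data hvs001_sda2/lvol0
```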

Greetings,

Dominique.

> -Oorspronkelijk bericht-
> Van: Eugen Block 
> Verzonden: woensdag 6 april 2022 9:53
> Aan: ceph-users@ceph.io
> Onderwerp: [ceph-users] Re: RuntimeError on activate lvm
> 
> Hi,
> 
> is there any specific reason why you do it manually instead of letting
> cephadm handle it? I might misremember but I believe for the manual lvm
> activation to work you need to pass the '--no-systemd' flag.
> 
> Regards,
> Eugen
> 
> Zitat von Dominique Ramaekers :
> 
> > Hi,
> >
> >
> > I've setup a ceph cluster using cephadmin on three ubuntu servers.
> > Everything went great until I tried to activate a osd prepared on a
> > lvm.
> >
> >
> > I have prepared 4 volumes with the command:
> >
> > ceph-volume lvm prepare --data vg/lv
> >
> >
> > Now I try to activate one of them with the command (followed by the
> output):
> >
> > root@hvs001:/# ceph-volume lvm activate 0
> > 25bfe96a-4f7a-47e1-8644-b74a4d104dbc
> > Running command: /usr/bin/mount -t tmpfs tmpfs
> > /var/lib/ceph/osd/ceph-0 Running command: /usr/bin/chown -R
> ceph:ceph
> > /var/lib/ceph/osd/ceph-0 Running command: /usr/bin/ceph-bluestore-
> tool
> > --cluster=ceph prime-osd-dir --dev /dev/hvs001_sda2/lvol0 --path
> > /var/lib/ceph/osd/ceph-0 --no-mon-config Running command: /usr/bin/ln
> > -snf /dev/hvs001_sda2/lvol0 /var/lib/ceph/osd/ceph-0/block Running
> > command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block
> > Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1 Running
> > command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
> Running
> > command: /usr/bin/systemctl enable
> > ceph-volume@lvm-0-25bfe96a-4f7a-47e1-8644-b74a4d104dbc
> >  stderr: Created symlink
> > /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-0-
> 25bfe96a
> > -4f7a-47e1-8644-b74a4d104dbc.service -> /usr/lib/systemd/system/ceph-
> volume@.service.
> > Running command: /usr/bin/systemctl enable --runtime ceph-osd@0
> >  stderr: Created symlink
> > /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service ->
> > /usr/lib/systemd/system/ceph-osd@.service.
> > Running command: /usr/bin/systemctl start ceph-osd@0
> >  stderr: Failed to connect to bus: No such file or directory
> > -->  RuntimeError: command returned non-zero exit status: 1
> >
> >
> > Seems systemd isn't playing along?
> >
> >
> > Please advice.
> >
> >
> > Some additional backround info:
> >
> > root@hvs001:/# ceph status
> >   cluster:
> > id: dd4b0610-b4d2-11ec-bb58-d1b32ae31585
> > health: HEALTH_OK
> >
> >   services:
> > mon: 3 daemons, quorum hvs001,hvs002,hvs003 (age 23m)
> > mgr: hvs001.baejuo(active, since 23m), standbys: hvs002.etijdk
> > osd: 4 osds: 0 up, 2 in (since 36m)
> >
> >   data:
> > pools:   0 pools, 0 pgs
> > objects: 0 objects, 0 B
> > usage:   0 B used, 0 B / 0 B avail
> > pgs:
> >
> >
> > root@hvs001:/# ceph-volume lvm list
> >
> >
> > == osd.0 ===
> >
> >   [block]   /dev/hvs001_sda2/lvol0
> >
> >   block device  /dev/hvs001_sda2/lvol0
> >   block uuid6cEw8v-5xIA-K76l-7zIN-V2BK-RNWD-yGwfqp
> >   cephx lockbox secret
> >   cluster fsid  dd4b0610-b4d2-11ec-bb58-d1b32ae31585
> >   cluster name  ceph
> >   crush device class
> >   encrypted 0
> >   osd fsid  25bfe96a-4f7a-47e1-8644-b74a4d104dbc
> >   osd id0
> >   osdspec affinity
> >   type  block
> >   vdo   0
> >   devices   /dev/sda2
> >
> > == osd.1 ===
> >
> >   [block]   /dev/hvs001_sdb3/lvol1
> >
> > 
> >
> >
> > Greetings,
> >
> >
> > Dominique.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> 
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email
> to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: Ceph remote disaster recovery at PB scale

2022-04-06 Thread Arthur Outhenin-Chalandre
Hi Eugen,

On 4/6/22 09:47, Eugen Block wrote:
> I don't mean to hijack this thread, I'm just curious about the  
> multiple mirror daemons statement. Last year you mentioned that  
> multiple daemons only make sense if you have different pools to mirror  
> [1], at least that's how I read it, you wrote:
> 
>> [...] but actually you can have multiple rbd-mirror daemons per  
>> cluster. It's the number of peers that are limited to one remote  
>> peer per pool. So technically if using different pools, you should  
>> be able to have three clusters connected as long as you have only  
>> one remote peer per pool. I never tested it though...
>> [...] For further multi-peer support, I am currently working on  
>> adding support for it!
> 
> What's the current status on this? In the docs I find only a general  
> statement that pre-Luminous you only could have one rbd mirror daemon  
> per cluster.

Sorry if my message last year was confusing... I was talking about
adding multiple clusters as peers, so essentially doing `rbd mirror pool
peer add [...]` (or similar) on the same pool and cluster multiple times,
which is still not possible in any stable version right now (I am still
progressing on the matter in an upstream PR; I still have a few bugs but
it mostly works).

But indeed, yes, you can launch multiple rbd-mirror daemons quite easily.
If you launch multiple rbd-mirror daemons on the same cluster, they will
elect a leader among themselves, and the leader will then try to keep the
number of images each daemon handles equal. And there is no special
trick about distributing the work across multiple pools: each daemon should
handle images on all the pools where rbd replication is enabled.

You can see who the leader is, etc., with `rbd mirror pool status
--verbose`; for instance, on one of our clusters:

```
$ rbd mirror pool status --verbose barn-mirror
[...]
DAEMONS
service 149710145:
  instance_id: 149710151
  client_id: barn-rbd-mirror-b
  hostname: barn-rbd-mirror-b.cern.ch
  version: 15.2.xx
  leader: false
  health: OK

service 149710160:
  instance_id: 149710166
  client_id: barn-rbd-mirror-c
  hostname: barn-rbd-mirror-c.cern.ch
  version: 15.2.xx
  leader: false
  health: OK

service 149781483:
  instance_id: 149710136
  client_id: barn-rbd-mirror-a
  hostname: barn-rbd-mirror-a.cern.ch
  version: 15.2.xx
  leader: true
  health: OK
[...]
```

Cheers,

-- 
Arthur Outhenin-Chalandre
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RuntimeError on activate lvm

2022-04-06 Thread Dominique Ramaekers
Additionally, when I try to add the volume automatically (I zapped the LVMs and 
removed the OSD entries with ceph osd rm, then recreated the LVMs), I now get 
this...
Command: 'ceph orch daemon add osd hvs001:/dev/hvs001_sda2/lvol0'

Errors:
RuntimeError: cephadm exited with an error code: 1, stderr:Inferring config 
/var/lib/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585/mon.hvs001/config
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host 
--stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume 
--privileged --group-add=disk --init -e 
CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:3cf8e17ae80444cda3aa8872a36938b3e2b62fa564f29794773762406f9420d7
 -e NODE_NAME=hvs001 -e CEPH_USE_RANDOM_NONCE=1 -e 
CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e 
CEPH_VOLUME_DEBUG=1 -v 
/var/run/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585:/var/run/ceph:z -v 
/var/log/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585:/var/log/ceph:z -v 
/var/lib/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585/crash:/var/lib/ceph/crash:z 
-v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v 
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v 
/tmp/ceph-tmpcv2el0nk:/etc/ceph/ceph.conf:z -v 
/tmp/ceph-tmpmc7njw96:/var/lib/ceph/bootstrap-osd/ceph.keyring:z 
quay.ceph.io/ceph-ci/ceph@sha256:3cf8e17ae80444cda3aa8872a36938b3e2b62fa564f29794773762406f9420d7
 lvm batch --no-auto /dev/hvs001_sda2/lvol0 --yes --no-systemd
/usr/bin/docker: stderr --> passed data devices: 0 physical, 1 LVM
/usr/bin/docker: stderr --> relative data size: 1.0
/usr/bin/docker: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/usr/bin/docker: stderr Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - 
osd new a6a26aa1-894c-467b-bcae-1445213d6f91
/usr/bin/docker: stderr  stderr: Error EEXIST: entity osd.0 exists but key does 
not match
...



> -Oorspronkelijk bericht-
> Van: Dominique Ramaekers
> Verzonden: woensdag 6 april 2022 10:13
> Aan: 'Eugen Block' ; ceph-users@ceph.io
> Onderwerp: RE: [ceph-users] Re: RuntimeError on activate lvm
> 
> Hi Eugen,
> 
> Thanks for the quick response! I'm probably doing things the more difficult
> (wrong) way ;-)
> 
> This is my first installation of a Ceph-cluster. I'm setting op three servers 
> for
> non-critical data and low i/o-load.
> I don't want to lose capacity in storage space by losing the entire disk on
> which the os is installed. The os disk is about 900Gb and I've partitioned 
> 50Gb
> for the os. I want to use the remaining 850Gb as OSD.
> 
> First I've created a new partition of 850Gb and changed the type to 95 (Ceph
> OSD). Then I tried to add it to the cluster with 'ceph orch daemon add osd
> hvs002:/dev/sda3', but I got an error.
> 
> That's why I tried the lvm manual way.
> 
> I know using a partition next to the os isn't best practice. But pointers to
> 'better practice' than what I describe above would be greatly appreciated.
> 
> Greetings,
> 
> Dominique.
> 
> > -Oorspronkelijk bericht-
> > Van: Eugen Block 
> > Verzonden: woensdag 6 april 2022 9:53
> > Aan: ceph-users@ceph.io
> > Onderwerp: [ceph-users] Re: RuntimeError on activate lvm
> >
> > Hi,
> >
> > is there any specific reason why you do it manually instead of letting
> > cephadm handle it? I might misremember but I believe for the manual
> > lvm activation to work you need to pass the '--no-systemd' flag.
> >
> > Regards,
> > Eugen
> >
> > Zitat von Dominique Ramaekers :
> >
> > > Hi,
> > >
> > >
> > > I've setup a ceph cluster using cephadmin on three ubuntu servers.
> > > Everything went great until I tried to activate a osd prepared on a
> > > lvm.
> > >
> > >
> > > I have prepared 4 volumes with the command:
> > >
> > > ceph-volume lvm prepare --data vg/lv
> > >
> > >
> > > Now I try to activate one of them with the command (followed by the
> > output):
> > >
> > > root@hvs001:/# ceph-volume lvm activate 0
> > > 25bfe96a-4f7a-47e1-8644-b74a4d104dbc
> > > Running command: /usr/bin/mount -t tmpfs tmpfs
> > > /var/lib/ceph/osd/ceph-0 Running command: /usr/bin/chown -R
> > ceph:ceph
> > > /var/lib/ceph/osd/ceph-0 Running command: /usr/bin/ceph-bluestore-
> > tool
> > > --cluster=ceph prime-osd-dir --dev /dev/hvs001_sda2/lvol0 --path
> > > /var/lib/ceph/osd/ceph-0 --no-mon-config Running command:
> > > /usr/bin/ln -snf /dev/hvs001_sda2/lvol0
> > > /var/lib/ceph/osd/ceph-0/block Running
> > > command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block
> > > Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1 Running
> > > command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
> > Running
> > > command: /usr/bin/systemctl enable
> > > ceph-volume@lvm-0-25bfe96a-4f7a-47e1-8644-b74a4d104dbc
> > >  stderr: Created symlink
> > > /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-0-
> > 25bfe96a
> > > -4f7a-47e1-8644-b74a4d104dbc.service ->
> > > /usr/lib/systemd/system/ceph-
> > volume@.servic

[ceph-users] Re: RuntimeError on activate lvm

2022-04-06 Thread Janne Johansson
Den ons 6 apr. 2022 kl 10:44 skrev Dominique Ramaekers
:
>
> Additionaly, if I try to add the volume automatically (I zapped the lvm and 
> removed de osd entries with ceph osd rm, then recreated the lvm's). Now I get 
> this...
> Command: 'ceph orch daemon add osd hvs001:/dev/hvs001_sda2/lvol0'

You should have used "ceph osd purge", the "ceph auth" for osd.0 is
still there from before.

> /usr/bin/docker: stderr  stderr: Error EEXIST: entity osd.0 exists but key 
> does not match
> ...
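
Something like this is what I mean (a sketch; it assumes osd.0 is already
stopped and its LV zapped):

```
# preferred: removes the OSD from the CRUSH map, the OSD map and the auth database in one go
ceph osd purge 0 --yes-i-really-mean-it

# if 'ceph osd rm 0' was already used, only the stale cephx key is left behind:
ceph auth rm osd.0
```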


-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] latest octopus radosgw missing cors header

2022-04-06 Thread Boris Behrens
Hi,

I'm just trying to get CORS headers to work (we've always set them on a
front-facing HAProxy, but a customer wants their own).

I've set CORS policy via aws cli (please don't mind the test header):
$ cat cors.json
{"CORSRules": [{"AllowedOrigins": ["https://example.com"],"AllowedHeaders":
["test"],"AllowedMethods": ["HEAD", "GET", "PUT", "POST"],"MaxAgeSeconds":
3000}]}

$ aws s3api put-bucket-cors --bucket bb-test-bucket --cors-configuration
file://cors.json

$ s3cmd --config info s3://bb-test-bucket
s3://bb-test-bucket/ (bucket):
   Location:  eu
   Payer: BucketOwner
   Expiration Rule: none
   Policy:none
   CORS:  <CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><CORSRule><AllowedMethod>GET</AllowedMethod><AllowedMethod>PUT</AllowedMethod><AllowedMethod>HEAD</AllowedMethod><AllowedMethod>POST</AllowedMethod><AllowedOrigin>https://example.com</AllowedOrigin><AllowedHeader>test</AllowedHeader><MaxAgeSeconds>3000</MaxAgeSeconds></CORSRule></CORSConfiguration>
   ACL:   *anon*: READ
   ACL:   4a60852b-9e03-4346-9c26-19a2b3913f63: FULL_CONTROL
   URL:   http://bb-test-bucket.kervyn.de/

But I don't get any CORS header back when I query the radosgw:
root@rgw-1:~# curl -s -D - -o /dev/null --header "Host: kervyn.de" 'http://
[fd00:2380:0:24::51]:7480/bb-test-bucket/hello.txt'
HTTP/1.1 200 OK
Content-Length: 12
Accept-Ranges: bytes
Last-Modified: Wed, 06 Apr 2022 08:33:55 GMT
x-rgw-object-type: Normal
ETag: "ed076287532e86365e841e92bfc50d8c"
x-amz-request-id: tx0bf7cf6cbcc6c8c93-00624d55e7-895784b1-eu-central-1
Content-Type: application/octet-stream
Date: Wed, 06 Apr 2022 08:57:11 GMT

Am I missing a config option?
I couldn't find anything helpful in the documentation.

Cheers
 Boris
-- 
The "UTF-8 Problems" self-help group will exceptionally meet in the large hall
this time.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RuntimeError on activate lvm

2022-04-06 Thread Dominique Ramaekers
Thanks, Janne, for the tip! I removed the keys with 'ceph auth rm'. The LVMs 
are now added automatically!

> -Oorspronkelijk bericht-
> Van: Janne Johansson 
> Verzonden: woensdag 6 april 2022 10:48
> Aan: Dominique Ramaekers 
> CC: Eugen Block ; ceph-users@ceph.io
> Onderwerp: Re: [ceph-users] Re: RuntimeError on activate lvm
> 
> Den ons 6 apr. 2022 kl 10:44 skrev Dominique Ramaekers
> :
> >
> > Additionaly, if I try to add the volume automatically (I zapped the lvm and
> removed de osd entries with ceph osd rm, then recreated the lvm's). Now I
> get this...
> > Command: 'ceph orch daemon add osd hvs001:/dev/hvs001_sda2/lvol0'
> 
> You should have used "ceph osd purge", the "ceph auth" for osd.0 is still
> there from before.
> 
> > /usr/bin/docker: stderr  stderr: Error EEXIST: entity osd.0 exists but
> > key does not match ...
> 
> 
> --
> May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph remote disaster recovery at PB scale

2022-04-06 Thread Eugen Block
Thanks for the clarification, I get it now. This would be quite  
helpful to have in the docs, I believe. ;-)



Zitat von Arthur Outhenin-Chalandre :


Hi Eugen,

On 4/6/22 09:47, Eugen Block wrote:

I don't mean to hijack this thread, I'm just curious about the
multiple mirror daemons statement. Last year you mentioned that
multiple daemons only make sense if you have different pools to mirror
[1], at least that's how I read it. You wrote:


[...] but actually you can have multiple rbd-mirror daemons per
cluster. It's the number of peers that are limited to one remote
peer per pool. So technically if using different pools, you should
be able to have three clusters connected as long as you have only
one remote peer per pool. I never tested it though...
[...] For further multi-peer support, I am currently working on
adding support for it!


What's the current status on this? In the docs I find only a general
statement that pre-Luminous you could only have one rbd-mirror daemon
per cluster.


Sorry if my message last year was confusing... I was talking about
adding multiple clusters as peers, so essentially doing `rbd mirror pool
peer add [...]` (or similar) on the same pool and cluster multiple times,
which is still not possible in any stable version right now (I am still
progressing on the matter in an upstream PR; I still have a few bugs but
it mostly works).

But indeed, yes, you can launch multiple rbd-mirror daemons quite easily.
If you launch multiple rbd-mirror daemons on the same cluster, they will
elect a leader among themselves, and the leader will then try to keep the
number of images each daemon handles equal. And there is no special
trick about distributing the work across multiple pools: each daemon should
handle images on all the pools where rbd replication is enabled.

You can see who the leader is, etc., with `rbd mirror pool status
--verbose`; for instance, on one of our clusters:

```
$ rbd mirror pool status --verbose barn-mirror
[...]
DAEMONS
service 149710145:
  instance_id: 149710151
  client_id: barn-rbd-mirror-b
  hostname: barn-rbd-mirror-b.cern.ch
  version: 15.2.xx
  leader: false
  health: OK

service 149710160:
  instance_id: 149710166
  client_id: barn-rbd-mirror-c
  hostname: barn-rbd-mirror-c.cern.ch
  version: 15.2.xx
  leader: false
  health: OK

service 149781483:
  instance_id: 149710136
  client_id: barn-rbd-mirror-a
  hostname: barn-rbd-mirror-a.cern.ch
  version: 15.2.xx
  leader: true
  health: OK
[...]
```

Cheers,

--
Arthur Outhenin-Chalandre
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] mons on osd nodes with replication

2022-04-06 Thread Ali Akil

Hello everyone,

I am planning a Ceph cluster on 3 storage nodes (12 OSDs per cluster
with BlueStore). Each node has 192 GB of memory and 24 CPU cores.
I know it's recommended to have separate MON and OSD hosts, in order to
minimize disruption, so that monitor and OSD daemons are not unavailable at
the same time.

But the servers are really pimped up and it's expensive to buy another 3
nodes for the MONs.

I thought that if the OSDs are replicated at the host level and a
node goes down, this should not be a problem even if the MONs run on the same
OSD nodes, as I have another two nodes with the replicated OSDs and
other MONs running there.

Is that right, or would I see some performance impact if one node
(containing a MON and OSDs) out of 3 goes down?

Thanks a lot,
Ali

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mons on osd nodes with replication

2022-04-06 Thread Eugen Block

Hi Ali,

it's very common to have MONs and OSDs colocated on the same host.

Zitat von Ali Akil :


Hello everyone,

I am planning a Ceph cluster on 3 storage nodes (12 OSDs per cluster
with BlueStore). Each node has 192 GB of memory and 24 CPU cores.
I know it's recommended to have separate MON and OSD hosts, in order to
minimize disruption, so that monitor and OSD daemons are not unavailable at
the same time.

But the servers are really pimped up and it's expensive to buy another 3
nodes for the MONs.

I thought that if the OSDs are replicated at the host level and a
node goes down, this should not be a problem even if the MONs run on the same
OSD nodes, as I have another two nodes with the replicated OSDs and
other MONs running there.

Is that right, or would I see some performance impact if one node
(containing a MON and OSDs) out of 3 goes down?

Thanks a lot,
Ali

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: latest octopus radosgw missing cors header

2022-04-06 Thread Boris Behrens
Ok, apparently I was "holding it wrong".
(kudos to dwfreed from IRC for helping)

When I send the Origin header correctly, I also receive the correct CORS
headers:

root@rgw-1:~# curl -s -D - -o /dev/null --header "Host: kervyn.de" -H
"Origin: https://example.com" 'http://
[fd00:2380:0:24::51]:7480/bb-test-bucket/hello.txt'
HTTP/1.1 200 OK
Content-Length: 12
Accept-Ranges: bytes
Last-Modified: Wed, 06 Apr 2022 08:33:55 GMT
x-rgw-object-type: Normal
ETag: "ed076287532e86365e841e92bfc50d8c"
x-amz-request-id: tx0a43b06742d5fef2d-00624d6247-8ac9b571-eu-central-1
Access-Control-Allow-Origin: https://example.com
Vary: Origin
Access-Control-Allow-Methods: GET
Access-Control-Max-Age: 3000
Content-Type: application/octet-stream
Date: Wed, 06 Apr 2022 09:49:59 GMT

Thank you all :)

Am Mi., 6. Apr. 2022 um 11:01 Uhr schrieb Boris Behrens :

> Hi,
>
> I just try to get CORS header to work (we've set them always on a front
> facing HAproxy, but a customer wants their own).
>
> I've set CORS policy via aws cli (please don't mind the test header):
> $ cat cors.json
> {"CORSRules": [{"AllowedOrigins": ["https://example.com"],"AllowedHeaders":
> ["test"],"AllowedMethods": ["HEAD", "GET", "PUT", "POST"],"MaxAgeSeconds":
> 3000}]}
>
> $ aws s3api put-bucket-cors --bucket bb-test-bucket --cors-configuration
> file://cors.json
>
> $ s3cmd --config info s3://bb-test-bucket
> s3://bb-test-bucket/ (bucket):
>Location:  eu
>Payer: BucketOwner
>Expiration Rule: none
>Policy:none
>CORS:  <CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><CORSRule><AllowedMethod>GET</AllowedMethod><AllowedMethod>PUT</AllowedMethod><AllowedMethod>HEAD</AllowedMethod><AllowedMethod>POST</AllowedMethod><AllowedOrigin>https://example.com</AllowedOrigin><AllowedHeader>test</AllowedHeader><MaxAgeSeconds>3000</MaxAgeSeconds></CORSRule></CORSConfiguration>
>ACL:   *anon*: READ
>ACL:   4a60852b-9e03-4346-9c26-19a2b3913f63: FULL_CONTROL
>URL:   http://bb-test-bucket.kervyn.de/
>
> But I don't get any CORS header back when I query the radosgw:
> root@rgw-1:~# curl -s -D - -o /dev/null --header "Host: kervyn.de"
> 'http://[fd00:2380:0:24::51]:7480/bb-test-bucket/hello.txt'
> HTTP/1.1 200 OK
> Content-Length: 12
> Accept-Ranges: bytes
> Last-Modified: Wed, 06 Apr 2022 08:33:55 GMT
> x-rgw-object-type: Normal
> ETag: "ed076287532e86365e841e92bfc50d8c"
> x-amz-request-id: tx0bf7cf6cbcc6c8c93-00624d55e7-895784b1-eu-central-1
> Content-Type: application/octet-stream
> Date: Wed, 06 Apr 2022 08:57:11 GMT
>
> Am I missing a config option?
> I couldn't find anything helpful in the documentation.
>
> Cheers
>  Boris
> --
> The "UTF-8 Problems" self-help group will exceptionally meet in the large
> hall this time.
>


-- 
The "UTF-8 Problems" self-help group will exceptionally meet in the large hall
this time.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph status HEALT_WARN - pgs problems

2022-04-06 Thread Dominique Ramaekers
Hi,

My cluster is up and running. I saw a note in ceph status that 1 PG was 
undersized. I read about the number of PGs and the recommended value 
(OSDs*100/poolsize => 6*100/3 = 200). The pg_num should be raised carefully, so 
I raised it to 2 and ceph status was fine again. So I left it like that.

Than I created a new pool: libvirt-pool.

Now ceph status is again in warning regarding PGs. I raised pg_num_max of the 
libvirt-pool to 265 and pg_num to 128.

Ceph status stays in warning.
root@hvs001:/# ceph status
...
health: HEALTH_WARN
Reduced data availability: 64 pgs inactive
Degraded data redundancy: 68 pgs undersized
...
   pgs: 94.118% pgs not active
 4/6 objects misplaced (66.667%) -This is there from the beginning 
of the creation of the cluster-
 64 undersized+peered
 4  active+undersized+remapped

I also get a progress: global Recovery Event (0s), which only goes away with 
'ceph progress clear'.

My autoscale-status is the following:
root@hvs001:/# ceph osd pool autoscale-status
POOLSIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  
EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr  576.5k3.0 1743G  0.   
   1.0   1  on False
libvirt-pool  0 3.0 1743G  0.   
   1.0  64  on False

(It's a 3 node cluster with 2 OSD's per node.)

The documentation doesn't help me much here. What should I do?
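
In case more output is needed, these are probably the relevant read-only
commands (a sketch, nothing here changes the cluster):

```
ceph osd pool ls detail          # pool size/min_size, pg_num and autoscale settings
ceph pg dump_stuck undersized    # which PGs are undersized, and their acting sets
ceph osd tree                    # confirm all 6 OSDs are up and under the expected hosts
```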

Greetings,

Dominique.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Quincy: mClock config propagation does not work properly

2022-04-06 Thread Sridhar Seshasayee
Hi Luis,

While I work on the fix, I thought a workaround could be useful. I am adding
it here and also in the tracker I mentioned in my previous update.

--

*Workaround*

Until a fix is available, the following workaround may be used
to override the parameters:

1. Run the injectargs command as shown in the following example
to override the mclock settings.

$ ceph tell osd.0 injectargs '--osd_mclock_scheduler_background_recovery_res=10'

OR

2. Another alternative command you could use is:

$ ceph daemon osd.0 config set osd_mclock_scheduler_background_recovery_res 10

3. The above settings are ephemeral and are lost in case the
OSD restarts. To ensure that the above values are retained
after an OSD restarts, run the following additional command,

$ ceph config set osd.0 osd_mclock_scheduler_background_recovery_res 10

The above steps must be followed for any subsequent change to
the mclock config parameters when using the 'custom' profile.

--

Thanks,

-Sridhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Eugen Block

Hi all,

I have a strange situation here: a Nautilus cluster with two DCs, whose  
main pool is an EC pool with k=7, m=11, min_size = 8 (failure domain  
host). We confirmed failure resiliency multiple times for this  
cluster; today we rebuilt one node, resulting in currently 34 inactive  
PGs. I'm wondering why they are inactive though. It's quite urgent and  
I'd like to get the PGs active again. Before rebuilding we didn't  
drain the node, but this procedure has worked multiple times in the  
past.
I haven't done too much damage yet, except for trying to force the  
backfill of one PG (ceph pg force-backfill ) to no avail yet. Any  
pointers are highly appreciated!


Regards,
Eugen

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Eugen Block
Update: Restarting the primary PG helped to bring the PGs back to  
active state. Consider this thread closed.



Zitat von Eugen Block :


Hi all,

I have a strange situation here: a Nautilus cluster with two DCs,  
whose main pool is an EC pool with k=7, m=11, min_size = 8 (failure  
domain host). We confirmed failure resiliency multiple times for  
this cluster; today we rebuilt one node, resulting in currently 34  
inactive PGs. I'm wondering why they are inactive though. It's quite  
urgent and I'd like to get the PGs active again. Before rebuilding  
we didn't drain the node, but this procedure has worked multiple  
times in the past.
I haven't done too much damage yet, except for trying to force the  
backfill of one PG (ceph pg force-backfill ) to no avail yet.  
Any pointers are highly appreciated!


Regards,
Eugen




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Zakhar Kirpichenko
Hi Eugen,

Can you please elaborate on what you mean by "restarting the primary PG"?

Best regards,
Zakhar

On Wed, Apr 6, 2022 at 5:15 PM Eugen Block  wrote:

> Update: Restarting the primary PG helped to bring the PGs back to
> active state. Consider this thread closed.
>
>
> Zitat von Eugen Block :
>
> > Hi all,
> >
> > I have a strange situation here, a Nautilus cluster with two DCs,
> > the main pool is an EC pool with k7 m11, min_size = 8 (failure
> > domain host). We confirmed failure resiliency multiple times for
> > this cluster, today we rebuilt one node resulting in currently 34
> > inactive PGs. I'm wondering why they are inactive though. It's quite
> > urgent and I'd like to get the PGs active again. Before rebuilding
> > we didn't drain it though, but this procedure has worked multiple
> > times in the past.
> > I haven't done too much damage yet, except for trying to force the
> > backfill of one PG (ceph pg force-backfill ) to no avail yet.
> > Any pointers are highly appreciated!
> >
> > Regards,
> > Eugen
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Eugen Block
Sure, from the output of 'ceph pg map ' you get the acting set,  
for example:


cephadmin:~ # ceph pg map 32.18
osdmap e7198 pg 32.18 (32.18) -> up [9,2,1] acting [9,2,1]

Then I restarted OSD.9 and the inactive PG became active again.
I remember this has been discussed a couple of times in the past on  
this list, but I'm wondering if this still happens in newer releases.  
I assume there's no way of preventing that, so we'll probably go with  
the safe approach on the next node. It's a production cluster and this  
incident was not expected, of course. At least we got it back online.



Zitat von Zakhar Kirpichenko :


Hi Eugen,

Can you please elaborate on what you mean by "restarting the primary PG"?

Best regards,
Zakhar

On Wed, Apr 6, 2022 at 5:15 PM Eugen Block  wrote:


Update: Restarting the primary PG helped to bring the PGs back to
active state. Consider this thread closed.


Zitat von Eugen Block :

> Hi all,
>
> I have a strange situation here, a Nautilus cluster with two DCs,
> the main pool is an EC pool with k7 m11, min_size = 8 (failure
> domain host). We confirmed failure resiliency multiple times for
> this cluster, today we rebuilt one node resulting in currently 34
> inactive PGs. I'm wondering why they are inactive though. It's quite
> urgent and I'd like to get the PGs active again. Before rebuilding
> we didn't drain it though, but this procedure has worked multiple
> times in the past.
> I haven't done too much damage yet, except for trying to force the
> backfill of one PG (ceph pg force-backfill ) to no avail yet.
> Any pointers are highly appreciated!
>
> Regards,
> Eugen



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Zakhar Kirpichenko
Thanks everyone!

/Zakhar

On Wed, Apr 6, 2022 at 6:24 PM Josh Baergen 
wrote:

> For future reference, "ceph pg repeer " might have helped here.
>
> Was the PG stuck in the "activating" state? If so, I wonder if you
> temporarily exceeded mon_max_pg_per_osd on some OSDs when rebuilding
> your host. At least on Nautilus I've seen cases where Ceph doesn't
> gracefully recover from this temporary limit violation and the PGs
> need some nudges to become active.
>
> Josh
>
> On Wed, Apr 6, 2022 at 9:02 AM Eugen Block  wrote:
> >
> > Sure, from the output of 'ceph pg map ' you get the acting set,
> > for example:
> >
> > cephadmin:~ # ceph pg map 32.18
> > osdmap e7198 pg 32.18 (32.18) -> up [9,2,1] acting [9,2,1]
> >
> > Then I restarted OSD.9 and the inactive PG became active again.
> > I remember this has been discussed a couple of times in the past on
> > this list, but I'm wondering if this still happens in newer releases.
> > I assume there's no way of preventing that, so we'll probably go with
> > the safe approach on the next node. It's a production cluster and this
> > incident was not expected, of course. At least we got it back online.
> >
> >
> > Zitat von Zakhar Kirpichenko :
> >
> > > Hi Eugen,
> > >
> > > Can you please elaborate on what you mean by "restarting the primary
> PG"?
> > >
> > > Best regards,
> > > Zakhar
> > >
> > > On Wed, Apr 6, 2022 at 5:15 PM Eugen Block  wrote:
> > >
> > >> Update: Restarting the primary PG helped to bring the PGs back to
> > >> active state. Consider this thread closed.
> > >>
> > >>
> > >> Zitat von Eugen Block :
> > >>
> > >> > Hi all,
> > >> >
> > >> > I have a strange situation here, a Nautilus cluster with two DCs,
> > >> > the main pool is an EC pool with k7 m11, min_size = 8 (failure
> > >> > domain host). We confirmed failure resiliency multiple times for
> > >> > this cluster, today we rebuilt one node resulting in currently 34
> > >> > inactive PGs. I'm wondering why they are inactive though. It's quite
> > >> > urgent and I'd like to get the PGs active again. Before rebuilding
> > >> > we didn't drain it though, but this procedure has worked multiple
> > >> > times in the past.
> > >> > I haven't done too much damage yet, except for trying to force the
> > >> > backfill of one PG (ceph pg force-backfill ) to no avail yet.
> > >> > Any pointers are highly appreciated!
> > >> >
> > >> > Regards,
> > >> > Eugen
> > >>
> > >>
> > >>
> > >> ___
> > >> ceph-users mailing list -- ceph-users@ceph.io
> > >> To unsubscribe send an email to ceph-users-le...@ceph.io
> > >>
> >
> >
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Anthony D'Atri
Something worth a try before restarting an OSD in situations like this:

ceph osd down 9

This marks the OSD down in the osdmap, but doesn't touch the daemon.

Typically the subject OSD will see this and tell the mons "I'm not dead yet!" 
and repeer, which sometimes suffices to clear glitches.



> Then I restarted OSD.9 and the inactive PG became active again.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Eugen Block
Thanks for the comments, I'll get the log files to see if there's any  
hint. Getting the PGs into an active state is one thing, and I'm sure  
multiple approaches would have worked. The main question is why this  
happens; we have 19 hosts to rebuild and can't risk an application  
outage every time.


Was the PG stuck in the "activating" state? If so, I wonder if you  
temporarily exceeded mon_max_pg_per_osd on some OSDs when rebuilding  
your host. At least on Nautilus I've seen cases where Ceph doesn't  
gracefully recover from this temporary limit violation and the PGs  
need some nudges to become active.


I'm pretty sure that their cluster isn't anywhere near the limit for  
mon_max_pg_per_osd, they currently have around 100 PGs per OSD and the  
configs have not been touched, it's pretty basic. This cluster was  
upgraded from Luminous to Nautilus a few months ago.


Zitat von Anthony D'Atri :


Something worth a try before restarting an OSD in situations like this:

ceph osd down 9

This marks the OSD down in the osdmap, but doesn’t touch the daemon.

Typically the subject OSD will see this and tell the mons "I'm not  
dead yet!" and repeer, which sometimes suffices to clear glitches.





Then I restarted OSD.9 and the inactive PG became active again.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph PGs stuck inactive after rebuild node

2022-04-06 Thread Eugen Block
Basically, these are the steps to remove all OSDs from that host (OSDs  
are not "replaced" so they aren't marked "destroyed") [1]:


1) Call 'ceph osd out $id'
2) Call systemctl stop ceph-osd@$id
3) ceph osd purge $id --yes-i-really-mean-it
4) call ceph-volume lvm zap --osd-id $id --destroy

After all disks have been wiped, there's a salt runner to deploy all  
available OSDs on that host again [2]. All OSDs are created with a  
normal weight. All the OSD restarts I did were on different hosts, not on  
the rebuilt host. The only difference I can think of that may have an  
impact is that this cluster consists of two datacenters, while the other  
clusters were not divided into several buckets. Could that be an issue?


[1]  
https://github.com/SUSE/DeepSea/blob/master/srv/modules/runners/osd.py#L179

[2] https://github.com/SUSE/DeepSea/blob/master/srv/salt/_modules/dg.py#L1396

Zitat von Josh Baergen :


On Wed, Apr 6, 2022 at 11:20 AM Eugen Block  wrote:

I'm pretty sure that their cluster isn't anywhere near the limit for
mon_max_pg_per_osd, they currently have around 100 PGs per OSD and the
configs have not been touched, it's pretty basic.


How is the host being "rebuilt"? Depending on the CRUSH rule, if the
host's OSDs are all marked destroyed and then re-created one at a time
with normal weight, CRUSH may decide to put a large number of PGs on
the first OSD that is created, and so on, until the rest of the host's
OSDs are available to take those PGs.

Josh




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph DB mon increasing constantly + large osd_snap keys (nautilus)

2022-04-06 Thread Konstantin Shalygin
Hi

I suggest upgrading to the last Nautilus release!
Also, the last Nautilus release doesn't have the fix for trimming osdmaps after PG 
merge [1] (and it seems the PRs for Nautilus will never be merged). But we push the 
trimming by restarting the mon leader.
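
In practice that is something like the following (a sketch; this assumes
package-based mons managed by systemd, while with cephadm you would use
'ceph orch daemon restart mon.<name>' instead):

```
# find the current leader
ceph mon stat

# restart it to trigger trimming ("monitor1" is taken from the message below)
ssh monitor1 systemctl restart ceph-mon@monitor1
```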


[1] https://github.com/ceph/ceph/pull/43204
k

Sent from my iPhone

> On 6 Apr 2022, at 21:01, J-P Methot  wrote:
> Hi,
> 
> 
> On a cluster running Nautilus 14.2.11, the store.db data space usage keeps 
> increasing. It went from 5GB to 20GB in a year.
> 
> We even had the following warning and adjust 'mon_data_size_warn' to 20Gi => 
> WARNING: MON_DISK_BIG( mon monitor1 is using a lot of disk space )
> 
> 
> But the disk space increase is constant about 1.5G per month.
> 
> 
> We did a 'ceph-monstore-tool /var/lib/ceph/mon/ceph-monitor1/ dump-keys | awk 
> '{print $1}'| uniq -c' :
> 285 auth
>   2 config
>  10 health
>1435 logm
>   3 mdsmap
> 153 mgr
>   1 mgr_command_descs
>   3 mgr_metadata
>  51 mgrstat
>  13 mon_config_key
>   1 mon_sync
>   7 monitor
>   1 monitor_store
>   5 monmap
> 234 osd_metadata
>   1 osd_pg_creating
> 1152444 osd_snap
>  965071 osdmap
> 622 paxos
> 
> It appears that the osd_snap is eating up all the space. We have about 1100 
> snapshots total (they rotate every 72h).
> 
> I took a look at https://tracker.ceph.com/issues/42012 and it might be 
> related. However, from the bug report, that particular issue doesn't seem 
> fixed in Nautilus, but my 14.2.16 cluster that has similar usage doesn't have 
> this issue.
> 
> 
> Did anyone face the same issue and do you have a workaround/solution to avoid 
> mon's db size increasing constantly? Could a simple minor version upgrade 
> fix it or would I need to upgrade to Octopus?
> 
> -- 
> Jean-Philippe Méthot
> Senior Openstack system administrator
> Administrateur systÚme Openstack sénior
> PlanetHoster inc.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io