[ceph-users] diskprediction_local module and trained models

2023-11-02 Thread Can Özyurt
Hi everyone,

We have recently noticed that the diskprediction_local module only works for a
set of manufacturers. Hence we have the following questions:

Are there any plans to support more manufacturers in the near future?
Can we contribute to the process of training new models and how?
Can the existing models be used with disks of other vendors (with
hackish methods)? Is it that it just doesn't work, or that it would not be
reliable? Basically, how does the relation between the trained model and the
vendor work?
Say we have a disk for an existing model, what is the best way to test
the trained model and the module itself?
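For reference, the rough workflow we had in mind for exercising the module is
something like the following (the device id is a placeholder, and exact
command availability may vary by release):

ceph mgr module enable diskprediction_local
ceph config set global device_failure_prediction_mode local
ceph device ls                                  # list devices and the daemons using them
ceph device get-health-metrics <devid>          # dump the SMART data the module sees
ceph device predict-life-expectancy <devid>     # if available, query the prediction directly

Is that roughly the intended way to test it?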

Thanks in advance
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph OSD reported Slow operations

2023-11-02 Thread Zakhar Kirpichenko
>1. The calculated IOPS is for the rw operation, right?

Total drive IOPS, read or write. Depending on the exact drive models, it
may be lower or higher than 200. I took the average for a smaller sized
7.2k rpm SAS drive. Modern drives usually deliver lower read IOPS and
higher write IOPS.

>2. Cluster is very busy? Is there any misconfiguration or missing tuning
parameter that makes the cluster busy?

You have almost 3k IOPS and your OSDs report slow ops. I'd say the cluster
is busy, as in loaded with I/O, perhaps more I/O than it can handle well.
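A quick way to see whether individual OSDs are the bottleneck is to look at
their latencies and at the disk utilization on the hosts, for example:

ceph osd perf          # per-OSD commit/apply latencies
iostat -x 5            # per-disk utilization and await on the OSD nodes (sysstat package)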

>3. Nodes are not balanced? You mean that the count of OSDs in each
server differs. But we have enabled autoscale and optimal distribution, so
you can see from the output of ceph osd df tree that the PG count (45/OSD)
and use% (65 to 67%) are even across OSDs. Is that not significant?

Yes, the OSD count differs. This means that the CPU, memory usage, network
load and latency differ per node and may cause performance variations,
depending on your workload.

/Z

On Thu, 2 Nov 2023 at 08:18, V A Prabha  wrote:

> Thanks for your prompt reply ..
> But the query is
> 1. The calculated IOPS is for the rw operation, right?
> 2. Cluster is very busy? Is there any misconfiguration or missing tuning
> parameter that makes the cluster busy?
> 3. Nodes are not balanced? You mean that the count of OSDs in each
> server differs. But we have enabled autoscale and optimal distribution, so
> you can see from the output of ceph osd df tree that the PG count (45/OSD)
> and use% (65 to 67%) are even across OSDs. Is that not significant?
> Correct me if my queries are irrelevant
>
>
>
> On November 2, 2023 at 11:36 AM Zakhar Kirpichenko 
> wrote:
>
> Sure, it's 36 OSDs at 200 IOPS each (tops, likely lower), I assume size=3
> replication so 1/3 of the total performance, and some 30%-ish OSD
> overhead.
>
> (36 x 200) * 1/3 * 0.7 = 1680. That's how many IOPS you can realistically
> expect from your cluster. You get more than that, but the cluster is very
> busy and OSDs aren't coping.
>
> Also your nodes are not balanced.
>
> /Z
>
> On Thu, 2 Nov 2023 at 07:33, V A Prabha < prab...@cdac.in> wrote:
>
> Can you please elaborate your identifications and the statement .
>
>
> On November 2, 2023 at 9:40 AM Zakhar Kirpichenko < zak...@gmail.com>
> wrote:
>
> I'm afraid you're simply hitting the I/O limits of your disks.
>
> /Z
>
> On Thu, 2 Nov 2023 at 03:40, V A Prabha < prab...@cdac.in> wrote:
>
>  Hi Eugen
>  Please find the details below
>
>
> root@meghdootctr1:/var/log/ceph# ceph -s
> cluster:
> id: c59da971-57d1-43bd-b2b7-865d392412a5
> health: HEALTH_WARN
> nodeep-scrub flag(s) set
> 544 pgs not deep-scrubbed in time
>
> services:
> mon: 3 daemons, quorum meghdootctr1,meghdootctr2,meghdootctr3 (age 5d)
> mgr: meghdootctr1(active, since 5d), standbys: meghdootctr2, meghdootctr3
> mds: 3 up:standby
> osd: 36 osds: 36 up (since 34h), 36 in (since 34h)
> flags nodeep-scrub
>
> data:
> pools: 2 pools, 544 pgs
> objects: 10.14M objects, 39 TiB
> usage: 116 TiB used, 63 TiB / 179 TiB avail
> pgs: 544 active+clean
>
> io:
> client: 24 MiB/s rd, 16 MiB/s wr, 2.02k op/s rd, 907 op/s wr
>
>
> Ceph Versions:
>
> root@meghdootctr1:/var/log/ceph# ceph --version
> ceph version 14.2.16 (762032d6f509d5e7ee7dc008d80fe9c87086603c) nautilus
> (stable)
>
> Ceph df -h
> https://pastebin.com/1ffucyJg
>
> Ceph OSD performance dump
> https://pastebin.com/1R6YQksE
>
> Ceph tell osd.XX bench (out of 36 OSDs only 8 give a high IOPS value of 250+.
> Of those, 4 OSDs are from HP 3PAR and 4 OSDs from DELL EMC. We are using
> only 4 OSDs from HP 3PAR and they have worked fine without any latency or
> IOPS issues from the beginning, but the remaining 32 OSDs are from DELL EMC,
> of which 4 OSDs perform much better than the remaining 28 OSDs)
>
> https://pastebin.com/CixaQmBi
>
> Please help me to identify whether the issue is with the DELL EMC storage,
> the Ceph configuration parameter tuning, or overload in the cloud setup
>
>
>
> On November 1, 2023 at 9:48 PM Eugen Block < ebl...@nde.ag> wrote:
> > Hi,
> >
> > for starters please add more cluster details like 'ceph status', 'ceph
> > versions', 'ceph osd df tree'. Increasing the network to 10G was the right
> > thing to do, you don't get far with 1G under real cluster load. How are
> > the OSDs configured (HDD only, SSD only or HDD with rocksdb on SSD)?
> > How is the disk utilization?
> >
> > Regards,
> > Eugen
> >
> > Zitat von prab...@cdac.in:
> >
> > > In a production setup of 36 OSDs (SAS disks) totalling 180 TB
> > > allocated to a single Ceph cluster with 3 monitors and 3 managers,
> > > there were 830 volumes and VMs created in OpenStack with Ceph as a
> > > backend. On Sep 21, users reported slowness in accessing the VMs.
> > > Analysing the logs led us to problems with SAS, network congestion
> > > and Ceph configuration (as all default values were used). We updated
> > > the network from 1Gbps to 10Gbps for public and cluster networking.
> > > There was no change.
> > >

[ceph-users] Re: "cephadm version" in reef returns "AttributeError: 'CephadmContext' object has no attribute 'fsid'"

2023-11-02 Thread Eugen Block
There are a couple of examples in the docs [2], so in your case it  
probably would be something rather simple like:


service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0

You can apply that config to specific hosts or all of them; it really
depends on your actual setup. You can also dry-run the config before
applying it with the --dry-run flag:


ceph orch apply -i my-osd-specs.yaml --dry-run

I'd recommend creating a test cluster if possible so you have room to
practice and get familiar with all that stuff.


Ideally I would use the commands to simply move the DB of my
existing orchestrator-deployed OSDs to the SSD, but when I tried
that command it broke my OSD and I had to delete it and leave the
cluster in a degraded state until it had recovered.


Do you still have the commands and the output somewhere showing what exactly
went wrong? I haven't migrated DBs in quite some time, especially not
in a newer version like Reef. I assume you tried it with  
bluefs-bdev-migrate?
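For reference, the path I had in mind is ceph-volume's migrate helpers rather
than calling ceph-bluestore-tool directly. A rough sketch, untested here (OSD
id, fsid and VG/LV names are placeholders, and with cephadm the ceph-volume
calls need to run inside the container, e.g. via cephadm shell or
cephadm ceph-volume):

ceph orch daemon stop osd.5
ceph-volume lvm new-db --osd-id 5 --osd-fsid <osd-fsid> --target <ssd-vg>/<db-lv>
ceph-volume lvm migrate --osd-id 5 --osd-fsid <osd-fsid> --from data --target <ssd-vg>/<db-lv>
ceph orch daemon start osd.5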


I remember having seen these repeating scrub starts messages in this  
list, but I can't seem to find the right thread. I can't recall if  
there was a solution to that...
When was the last time you failed the mgr service? That still does  
help sometimes...
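(Failing the mgr is just:

ceph mgr fail

which hands the active role over to a standby; on older releases you may need
to pass the active mgr's name.)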


[2] https://docs.ceph.com/en/reef/cephadm/services/osd/#examples

Zitat von Martin Conway :

first of all, I'd still recommend using the orchestrator to deploy OSDs.
Building OSDs manually and then adopting them is redundant. Or do you have
issues with the drivegroups?


I am having to do it this way because I couldn't find any doco on  
how to specify a separate DB/WAL device when deploying OSDs using  
the orchestrator. If there is such a command I agree it would be a  
better choice.


Ideally I would use the commands to simply move the DB of my
existing orchestrator-deployed OSDs to the SSD, but when I tried
that command it broke my OSD and I had to delete it and leave the
cluster in a degraded state until it had recovered. I find it very
stressful when I get out of my depth with problems like that, so I
gave up on that idea and am doing the remove, redeploy, adopt method,
which is working, but VERY slow.


I don't have *the* solution but you could try to disable the mclock
scheduler [1] which is the default since Quincy. Maybe that will speed up
things? There have been reports in the list about some unwanted or at least
unexpected behavior.


I did try this to try and speed up my rebalances, but it didn't seem  
to make much difference. I haven't tried it to see what difference  
it makes to scrubbing.



As for the "not (deep-)scrubbed in time" messages, there seems to be
progress (in your ceph status), but depending on the drive utilization you
could increase the number of scrubs per OSD (osd_max_scrubs).


There are a lot of scrubs running; this morning, after my rebalance
finally completed, there are 22 scrubbing and 6 deep scrubbing (across 28
OSDs). This has fallen from the numbers yesterday while the rebalance was
still happening (38/9).


I believe if I kick the cluster by taking a host into maintenance  
and back the numbers will jump up again. The trouble is I don't know  
how to tell if a scrub is actually achieving something, is stuck, or is
restarting over and over.
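Presumably one way to check whether a given PG ever finishes would be to
watch its scrub stamps, e.g. for one of the PGs from the log below (just a
sketch; the field names come from the pg query JSON):

ceph pg 6.2d query | grep -E 'last_(deep_)?scrub_stamp'

If those timestamps never move forward while "scrub starts" keeps repeating,
the scrub is presumably restarting without completing.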


My current ceph pg dump is:
https://pastebin.com/AQhNKSBN

and if I run it again a few minutes later:
https://pastebin.com/yfREzJ4s

I see evidence of scrubs not working because some of my OSD logs  
look like this:
2023-11-01T20:51:08.668+ 7f3be1328700  0 log_channel(cluster)  
log [DBG] : 6.2d scrub starts
2023-11-01T20:51:11.658+ 7f3bdfb25700  0 log_channel(cluster)  
log [DBG] : 5.1ac scrub starts
2023-11-01T20:51:19.565+ 7f3bdfb25700  0 log_channel(cluster)  
log [DBG] : 5.17 scrub starts
2023-11-01T20:51:20.516+ 7f3bdfb25700  0 log_channel(cluster)  
log [DBG] : 5.1d9 scrub starts
2023-11-01T20:51:22.463+ 7f3be1328700  0 log_channel(cluster)  
log [DBG] : 5.9b scrub starts
2023-11-01T20:51:24.488+ 7f3be0b27700  0 log_channel(cluster)  
log [DBG] : 5.65 scrub starts
2023-11-01T20:51:29.474+ 7f3be1328700  0 log_channel(cluster)  
log [DBG] : 6.2d scrub starts
2023-11-01T20:51:31.484+ 7f3bdfb25700  0 log_channel(cluster)  
log [DBG] : 5.1ac scrub starts
2023-11-01T20:51:34.455+ 7f3bdfb25700  0 log_channel(cluster)  
log [DBG] : 5.17 scrub starts
2023-11-01T20:51:39.444+ 7f3bdfb25700  0 log_channel(cluster)  
log [DBG] : 5.1d9 deep-scrub starts
2023-11-01T20:51:42.473+ 7f3be1328700  0 log_channel(cluster)  
log [DBG] : 5.9b scrub starts
2023-11-01T20:51:44.510+ 7f3be0b27700  0 log_channel(cluster)  
log [DBG] : 5.65 scrub starts
2023-11-01T20:51:46.491+ 7f3be1328700  0 log_channel(cluster)  
log [DBG] : 6.2d scrub starts
2023-11-01T20:51:47.465+ 7f3bdfb25700  0 log_channel(cluster)  
log [DBG] : 5.1ac scrub star

[ceph-users] Re: Setting S3 bucket policies with multi-tenants

2023-11-02 Thread Janne Johansson
Den ons 1 nov. 2023 kl 17:51 skrev Thomas Bennett :
>
> To update my own question, it would seem that Principal should be
> defined like this:
>
>- "Principal": {"AWS": ["arn:aws:iam::Tenant1:user/readwrite"]}
>
> And resource should:
> "Resource": [ "arn:aws:s3:::backups"]
>
> Is it worth having the docs updated -
> https://docs.ceph.com/en/quincy/radosgw/bucketpolicy/
> to indicate that usfolks in the example is the tenant name?


A good idea.

Generally, docs should be much clearer about which parts are chosen
by you, and which ones are inherited from some predefined role,
context, your setup, your domain or whatever.

It's hard enough to get all the finer points of rgw right, both from the
admin side and as a power-user talking over the S3 APIs. If examples
"hide" things like the above, as if "usfolks" were some weird
predefined thing AWS has brought along, then it gets a lot harder to
grasp which parts I am supposed to replace and which must be there.

Personally I would prefer colors, bold, underlines or something to
distinguish the things I should replace (like endpoint URL domains and
hostnames) from the things which are not supposed to change (like the
whole Resource prefix up until the bucket name).

Looking at the example given in the docs:

"Principal": {"AWS": ["arn:aws:iam::usfolks:user/fred:subuser"]},
"Resource": [
  "arn:aws:s3:::happybucket/*"

the arn:aws:s3::: seems to indicate you can/should change only the
last part after the last : char, and then fill in the bucket name
there.

The arn:aws:iam on the other hand in this example is not solely the
last part after the last :, but also the next-to-last one. While this
probably is very obvious if you understand the AWS docs written
somewhere 35 links away, it would be nice IMHO if the ceph-rgw example
showed or at least hinted to me that it needs me to change two parts
in the iam entry and not only the last, because then the example would
not require me to also double-check the AWS reference manual to know
if I should edit one or two or all of the other  sections there.

Not saying ceph-rgw needs to fully replicate all of AWS S3 docs, but
at least help us out a bit here, please.
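To make that concrete, here is a minimal sketch of such a policy with the
parts the user is expected to replace spelled out (tenant, user and bucket
names are taken from the earlier message, they are not special values):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam::Tenant1:user/readwrite"]},
    "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
    "Resource": [
      "arn:aws:s3:::backups",
      "arn:aws:s3:::backups/*"
    ]
  }]
}

Here Tenant1, readwrite and backups are the admin's own choices; everything
else in the arn: strings stays exactly as written.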

--
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Mohamed LAMDAOUAR
Hello,

  I have 7 machines in a Ceph cluster; the ceph service runs in a docker
container.
  Each machine has 4 HDDs of data (available) and 2 NVMe SSDs (bricked).
  During a reboot, the SSDs bricked on 4 machines; the data is available on
the HDD disks but the NVMe is bricked and the system is not available. Is it
possible to recover the data of the cluster (the data disks are all
available)?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Boris Behrens
Hi Mohamed,
are all mons down, or do you still have at least one that is running?

AFAIK: the mons save their DB on the normal OS disks, and not within the
ceph cluster.
So if all mons are dead, meaning the disks which contained the mon data
are unrecoverably dead, you might need to bootstrap a new cluster and add
the OSDs to the new cluster. This will likely include tinkering with cephx
authentication so that you don't wipe the old OSD data.

If you still have at least ONE mon alive, you can shut it down, and remove
all the other mons from the monmap and start it again. You CAN have
clusters with only one mon.

Or did your hosts just lose the boot disk and you just need to bring them
up somehow? Losing 4x2 NVMe disks at the same time sounds a bit strange.

Am Do., 2. Nov. 2023 um 11:34 Uhr schrieb Mohamed LAMDAOUAR <
mohamed.lamdao...@enyx.fr>:

> Hello,
>
>   I have 7 machines on CEPH cluster, the service ceph runs on a docker
> container.
>  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
>   During a reboot, the ssd bricked on 4 machines, the data are available on
> the HDD disk but the nvme is bricked and the system is not available. is it
> possible to recover the data of the cluster (the data disk are all
> available)
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Robert Sander

Hi,

On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:


   I have 7 machines on CEPH cluster, the service ceph runs on a docker
container.
  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
   During a reboot, the ssd bricked on 4 machines, the data are available on
the HDD disk but the nvme is bricked and the system is not available. is it
possible to recover the data of the cluster (the data disk are all
available)


You can try to recover the MON db from the OSDs, as they keep a copy of it:

https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-02 Thread Frank Schilder
Hi all,

the problem re-appeared in the following way. After moving the problematic 
folder out and copying it back, all files showed the correct sizes. Today, we 
observe that the issue is back in the copy that was fine yesterday:

[user1@host11 h2lib]$ ls -l
total 37198
-rw-rw 1 user1 user1 11641 Nov  2 11:32 dll_wrapper.py

[user2@host2 h2lib]# ls -l
total 44
-rw-rw. 1 user1 user1 0 Nov  2 11:28 dll_wrapper.py

It is correct in the snapshot though:

[user2@host2 h2lib]# cd .snap
[user2@host2 .snap]# ls
_2023-11-02_000611+0100_daily_1
[user2@host2 .snap]# cd _2023-11-02_000611+0100_daily_1/
[user2@host2 _2023-11-02_000611+0100_daily_1]# ls -l
total 37188
-rw-rw. 1 user1 user1 11641 Nov  1 13:30 dll_wrapper.py

It seems related to the path, not the inode number. Could the re-appearance of 
the 0 length have been triggered by taking the snapshot?

We plan to reboot the server where the file was written later today. Until then 
we can do diagnostics while the issue is visible. Please let us know what 
information we can provide.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Gregory Farnum 
Sent: Wednesday, November 1, 2023 4:57 PM
To: Frank Schilder; Xiubo Li
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] ceph fs (meta) data inconsistent

We have seen issues like this a few times and they have all been kernel client 
bugs with CephFS’ internal “capability” file locking protocol. I’m not aware of 
any extant bugs like this in our code base, but kernel patches can take a long 
and winding path before they end up on deployed systems.

Most likely, if you were to restart some combination of the client which wrote 
the file and the client(s) reading it, the size would propagate correctly. As 
long as you’ve synced the data, it’s definitely present in the cluster.

Adding Xiubo, who has worked on these and may have other comments.
-Greg

On Wed, Nov 1, 2023 at 7:16 AM Frank Schilder 
mailto:fr...@dtu.dk>> wrote:
Dear fellow cephers,

today we observed a somewhat worrisome inconsistency on our ceph fs. A file 
created on one host showed up as 0 length on all other hosts:

[user1@host1 h2lib]$ ls -lh
total 37M
-rw-rw 1 user1 user1  12K Nov  1 11:59 dll_wrapper.py

[user2@host2 h2lib]# ls -l
total 34
-rw-rw. 1 user1 user1 0 Nov  1 11:59 dll_wrapper.py

[user1@host1 h2lib]$ cp dll_wrapper.py dll_wrapper.py.test
[user1@host1 h2lib]$ ls -l
total 37199
-rw-rw 1 user1 user1 11641 Nov  1 11:59 dll_wrapper.py
-rw-rw 1 user1 user1 11641 Nov  1 13:10 dll_wrapper.py.test

[user2@host2 h2lib]# ls -l
total 45
-rw-rw. 1 user1 user1 0 Nov  1 11:59 dll_wrapper.py
-rw-rw. 1 user1 user1 11641 Nov  1 13:10 dll_wrapper.py.test

Executing a sync on all these hosts did not help. However, deleting the 
problematic file and replacing it with a copy seemed to work around the issue. 
We saw this with ceph kclients of different versions; it seems to be on the MDS 
side.

How can this happen and how dangerous is it?

ceph fs status (showing ceph version):

# ceph fs status
con-fs2 - 1662 clients
===
RANK  STATE MDS   ACTIVITY DNSINOS
 0active  ceph-15  Reqs:   14 /s  2307k  2278k
 1active  ceph-11  Reqs:  159 /s  4208k  4203k
 2active  ceph-17  Reqs:3 /s  4533k  4501k
 3active  ceph-24  Reqs:3 /s  4593k  4300k
 4active  ceph-14  Reqs:1 /s  4228k  4226k
 5active  ceph-13  Reqs:5 /s  1994k  1782k
 6active  ceph-16  Reqs:8 /s  5022k  4841k
 7active  ceph-23  Reqs:9 /s  4140k  4116k
POOL   TYPE USED  AVAIL
   con-fs2-meta1 metadata  2177G  7085G
   con-fs2-meta2   data   0   7085G
con-fs2-data   data1242T  4233T
con-fs2-data-ec-ssddata 706G  22.1T
   con-fs2-data2   data3409T  3848T
STANDBY MDS
  ceph-10
  ceph-08
  ceph-09
  ceph-12
MDS version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) 
octopus (stable)

There is no health issue:

# ceph status
  cluster:
id: abc
health: HEALTH_WARN
3 pgs not deep-scrubbed in time

  services:
mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 9w)
mgr: ceph-25(active, since 7w), standbys: ceph-26, ceph-01, ceph-03, ceph-02
mds: con-fs2:8 4 up:standby 8 up:active
osd: 1284 osds: 1279 up (since 2d), 1279 in (since 5d)

  task status:

  data:
pools:   14 pools, 25065 pgs
objects: 2.20G objects, 3.9 PiB
usage:   4.9 PiB used, 8.2 PiB / 13 PiB avail
pgs: 25039 active+clean
 26active+clean+scrubbing+deep

  io:
client:   799 MiB/s rd, 55 MiB/s wr, 3.12k op/s rd, 1.82k op/s wr

The inconsistency seems undiagnosed, I couldn't find anything interesting in 
the cluster log. What should I look for and where?

I moved the folder to another location for diagnosis. Unfortunately, I don't 
have 2 clients any more showing differen

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Joachim Kraftmayer - ceph ambassador

Hi,

another short note regarding the documentation: the paths are designed 
for a package installation.


the paths for container installation look a bit different e.g.: 
/var/lib/ceph//osd.y/


Joachim

___
ceph ambassador DACH
ceph consultant since 2012

Clyso GmbH - Premier Ceph Foundation Member

https://www.clyso.com/

Am 02.11.23 um 12:02 schrieb Robert Sander:

Hi,

On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:


   I have 7 machines on CEPH cluster, the service ceph runs on a docker
container.
  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
   During a reboot, the ssd bricked on 4 machines, the data are 
available on
the HDD disk but the nvme is bricked and the system is not available. 
is it

possible to recover the data of the cluster (the data disk are all
available)


You can try to recover the MON db from the OSDs, as they keep a copy 
of it:


https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures 



Regards

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Mohamed LAMDAOUAR
Hello Boris,

I have one monitor server up, and two other servers of the cluster are also
up (these two servers are not monitors).
I have four other servers down (the boot disk is dead) but the OSD data
disks are safe.
I reinstalled the OS on a new SSD disk. How can I rebuild my cluster with
only one mon?
If you would like, you can join me for a meeting. I will give you more
information about the cluster.

Thanks for your help. I'm very stuck because the data is present but I
don't know how to add the old OSDs back into the cluster to recover the data.



Le jeu. 2 nov. 2023 à 11:55, Boris Behrens  a écrit :

> Hi Mohamed,
> are all mons down, or do you still have at least one that is running?
>
> AFAIK: the mons save their DB on the normal OS disks, and not within the
> ceph cluster.
> So if all mons are dead, which mean the disks which contained the mon data
> are unrecoverable dead, you might need to bootstrap a new cluster and add
> the OSDs to the new cluster. This will likely include tinkering with cephx
> authentication, so you don't wipe the old OSD data.
>
> If you still have at least ONE mon alive, you can shut it down, and remove
> all the other mons from the monmap and start it again. You CAN have
> clusters with only one mon.
>
> Or is did your host just lost the boot disk and you just need to bring it
> up somehow? losing 4x2 NVME disks at the same time, sounds a bit strange.
>
> Am Do., 2. Nov. 2023 um 11:34 Uhr schrieb Mohamed LAMDAOUAR <
> mohamed.lamdao...@enyx.fr>:
>
> > Hello,
> >
> >   I have 7 machines on CEPH cluster, the service ceph runs on a docker
> > container.
> >  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
> >   During a reboot, the ssd bricked on 4 machines, the data are available
> on
> > the HDD disk but the nvme is bricked and the system is not available. is
> it
> > possible to recover the data of the cluster (the data disk are all
> > available)
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Mohamed LAMDAOUAR
Thanks Robert,

I tried this but I'm stuck. If you have some time, do help me with that I
will be very happy because I'm lost :(



8 rue greneta, 75003, Paris, FRANCE
enyx.com | exegy.com

Mohamed Lamdaouar
Infrastructure Engineer

mohamed.lamdao...@enyx.fr



Le jeu. 2 nov. 2023 à 12:08, Robert Sander  a
écrit :

> Hi,
>
> On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:
>
> >I have 7 machines on CEPH cluster, the service ceph runs on a docker
> > container.
> >   Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
> >During a reboot, the ssd bricked on 4 machines, the data are
> available on
> > the HDD disk but the nvme is bricked and the system is not available. is
> it
> > possible to recover the data of the cluster (the data disk are all
> > available)
>
> You can try to recover the MON db from the OSDs, as they keep a copy of it:
>
>
> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures
>
> Regards
> --
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
>
> https://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Amtsgericht Berlin-Charlottenburg - HRB 220009 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Mohamed LAMDAOUAR
Thanks Joachim for the clarification ;)


8 rue greneta, 75003, Paris, FRANCE
enyx.com | exegy.com

Mohamed Lamdaouar
Infrastructure Engineer

mohamed.lamdao...@enyx.fr



Le jeu. 2 nov. 2023 à 12:32, Joachim Kraftmayer - ceph ambassador <
joachim.kraftma...@clyso.com> a écrit :

> Hi,
>
> another short note regarding the documentation, the paths are designed
> for a package installation.
>
> the paths for container installation look a bit different e.g.:
> /var/lib/ceph//osd.y/
>
> Joachim
>
> ___
> ceph ambassador DACH
> ceph consultant since 2012
>
> Clyso GmbH - Premier Ceph Foundation Member
>
>
> https://www.clyso.com/
>
> Am 02.11.23 um 12:02 schrieb Robert Sander:
> > Hi,
> >
> > On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:
> >
> >>I have 7 machines on CEPH cluster, the service ceph runs on a docker
> >> container.
> >>   Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
> >>During a reboot, the ssd bricked on 4 machines, the data are
> >> available on
> >> the HDD disk but the nvme is bricked and the system is not available.
> >> is it
> >> possible to recover the data of the cluster (the data disk are all
> >> available)
> >
> > You can try to recover the MON db from the OSDs, as they keep a copy
> > of it:
> >
> >
> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures
> >
> >
> > Regards
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Robert Sander

On 11/2/23 12:48, Mohamed LAMDAOUAR wrote:


I reinstalled the OS on a  new SSD disk. How can I rebuild my cluster with
only one mons.


If there is one MON still operating you can try to extract its monmap 
and remove all the other MONs from it with the monmaptool:


https://docs.ceph.com/en/latest/man/8/monmaptool/
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovering-a-monitor-s-broken-monmap

This way the remaining MON will be the only one in the map and will have 
quorum and the cluster will work again.
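A rough sketch of those steps (mon id and names are placeholders; with a
cephadm/container deployment the commands have to run inside the mon
container or a cephadm shell, and the surviving mon must be stopped first):

ceph-mon -i <mon-id> --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap
monmaptool --rm <dead-mon-a> --rm <dead-mon-b> /tmp/monmap
ceph-mon -i <mon-id> --inject-monmap /tmp/monmap

Then start the surviving mon again.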


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Mohamed LAMDAOUAR
Hi robert,

when I ran this command, I got this error (because the database of the osd
was on the boot disk)

ceph-objectstore-tool \
> --type bluestore \
> --data-path /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9 \
> --op update-mon-db \
> --mon-store-path /home/enyx-admin/backup-osd-9 \
> --no-mon-config --debug
2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42400
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open path
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42400
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open size
1827154432 (0x9187fc0, 9.1 TiB) block_size 4096 (4 KiB) rotational
device, discard not supported

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluestore(/var/lib/ceph/
c80891ba-55f3-11ed-9389-919f4368965c/osd.9) _set_cache_sizes cache_size
1073741824 meta 0.45 kv 0.45 data 0.06

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42c00
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open path
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42c00
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open size
1827154432 (0x9187fc0, 9.1 TiB) block_size 4096 (4 KiB) rotational
device, discard not supported

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs add_block_device bdev 1
path /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block size
9.1 TiB

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs mount

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs _init_alloc shared, id
1, capacity 0x9187fc0, block size 0x1

2023-11-02T10:59:33.441+ 7f6724da71c0 -1 bluefs _replay 0x0: stop: uuid
369c96dd-2df1-8d88-2722-3f8334920e83 != super.uuid
ba94c6e8-394b-4a78-84d6-9afe1cbc280b,
block dump:
  42 9c 61 78 69 ec 36 9c  96 dd 2d f1 8d 88 27 22
 |B.axi.6...-...'"|
0010  3f 83 34 92 0e 83 e6 0f  8f 17 fc 3e ec 86 c5 15
 |?.4>|
0020  39 91 13 2e b0 14 92 86  65 75 5c 8e c1 ee fc 18
 |9...eu\.|
0030  f1 7b b2 37 f7 75 70 e2  5e da 79 cd e6 ad 27 40
 |.{.7.up.^.y...'@|
0040  d6 b8 3b da 81 1f 9b ba  c6 e8 b7 68 bc a1 77 ac
 |..;h..w.|
0050  7b a9 a3 cd 9d da b6 57  aa 40 bd ab d0 89 ec e6  |{..W.@
..|
0060  71 a2 2b 4d 87 74 2f ff  0a bf 3b da 3d da 93 52
 |q.+M.t/...;.=..R|
0070  1c ea f2 fb 8d e0 a1 e6  ef b5 42 5e 85 87 27 df
 |..B^..'.|
0080  ac f1 ae 08 9d c5 71 6f  0f f7 68 ce 28 3d 3e 6e
 |..qo..h.(=>n|
0090  94 b2 1a dc 3b f0 9e e9  6e 77 dd 95 b6 9e 94 56
 |;...nw.V|
00a0  f2 dd 9a 35 a0 65 78 05  bb a9 5f a1 99 6a 5c a1
 |...5.ex..._..j\.|
00b0  5d e9 6d 02 83 be 9d 60  d1 82 fc 6c 66 40 11 17  |].m`...lf@
..|
00c0  3a 4d 9d 73 f6 ec fb ed  41 db e2 39 15 e1 5f 28
 |:M.sA..9.._(|
00d0  c4 ce cf eb 93 f2 88 d5  af ae 11 14 d6 97 74 ff
 |..t.|
00e0  4b 7e 73 fe 97 4c 06 2a  3a bc b3 7f 04 94 6c 1d
 |K~s..L.*:.l.|
00f0  60 bf b1 42 fa 76 b0 df  33 ff bf 84 36 b1 b5 b3
 |`..B.v..3...6...|
0100  17 36 d6 b7 7d 4c d4 37  fa 7f 8e 59 1f 72 53 d5
 |.6..}L.7...Y.rS.|
0110  c4 d0 de d8 4e 13 ca c6  0a 60 87 3c e4 21 2b 1b
 |N`.<.!+.|
0120  00 f2 67 cf 0a 02 01 20  ec ec 7f c1 8f e3 df f8  |..g
|
0130  3f db 7f 60 28 14 8a fa  48 cb c6 f6 c7 9a 3f 71
 |?..`(...H.?q|
0140  bf 61 36 30 08 c0 f1 e7  f8 af b5 7f d2 fc ad a1
 |.a60|
0150  72 b2 40 ff 82 ff a3 c7  5f f0 a3 0e 8f b2 fe b6  |r.@
._...|
0160  ee 2f 5d fe 90 8b fa 28  8f 95 03 fa 5b ee e3 9c
 |./]([...|
0170  36 ea 3f 6a 1e c0 fe bb  c2 80 4a 56 ca 96 26 8f
 |6.?j..JV..&.|
0180  85 03 e0 f8 67 c9 3d a8  fa 97 af c5 c0 00 ce 7f
 |g.=.|
0190  cd 83 ff 36 ff c0 1c f0  7b c1 03 cf b7 b6 56 06
 |...6{.V.|
01a0  8a 30 7b 4d e0 5b 11 31  a0 12 cc d9 5e fb 7f 2d
 |.0{M.[.1^..-|
01b0  fb 47 04 df ea 1b 3d 3e  6c 1f f7 07 96 df 97 cf
 |.G=>l...|
01c0  15 60 76 56 0e b6 06 30  3b c0 6f 11 0a 40 19 98  |.`vV...0;.o..@
..|
01d0  a1 89 fe e3 b6 f3 28 80  83 e5 c1 1d 9c ac da 40
 |..(@|
01e0  71 5b 2b 07 eb 07 2e 8a  0f 71 7f 88 aa f5 23 0b
 |q[+..q#.|
01f0  03 17 a0 b0 e2 c3 76 e3  68 62 00 53 10 17 02 4a
 |..v.hb.S...J|
0200  02 ec 1f 72 82 8f 0f 28  fc a0 e0 83 04 3b c0 e3
 |...r...(.;..|
0210  6b 25 15 fe ae ce df de  33 29 6c e5 f0 68 c6 1f
 |k%..3)l..h..|
0220  e9 01 00 ff f1 be c9 c7  f4 f8 73 fc bf cc 80 fe
 |..s.|
0230  73 1d 08 a8 64 62 6f 0e  e3 11 13 15 13 03 81 58
 |s...dboX|
0240  a1 20 10 9b f0 43 e3 7c  68 2c 0f ed 21 34 10 10  |.
...C.|h,..!4..|
0250  08 04 05 f3 fd de 0f ed  35 ff b0 4d 4d 5d e3 a1
 |5..MM]..|
0260  7f 30 08 f0 b0 85 fe e9  06 f0 3f 95 64 f9 2f 3e
 |.0

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Boris Behrens
Hi,
follow these instructions:
https://docs.ceph.com/en/quincy/rados/operations/add-or-rm-mons/#removing-monitors-from-an-unhealthy-cluster
As you are using containers, you might need to specify the --mon-data
directory (/var/lib/CLUSTER_UUID/mon.MONNAME) (actually I never did this in
an orchestrator environment)

Good luck.


Am Do., 2. Nov. 2023 um 12:48 Uhr schrieb Mohamed LAMDAOUAR <
mohamed.lamdao...@enyx.fr>:

> Hello Boris,
>
> I have one server monitor up and two other servers of the cluster are also
> up (These two servers are not monitors ) .
> I have four other servers down (the boot disk is out) but the osd data
> disks are safe.
> I reinstalled the OS on a  new SSD disk. How can I rebuild my cluster with
> only one mons.
> If you would like, you can join me for a meeting. I will give you more
> information about the cluster.
>
> Thanks for your help, I'm very stuck because the data is present but I
> don't know how to add the old osd in the cluster to recover the data.
>
>
>
> Le jeu. 2 nov. 2023 à 11:55, Boris Behrens  a écrit :
>
>> Hi Mohamed,
>> are all mons down, or do you still have at least one that is running?
>>
>> AFAIK: the mons save their DB on the normal OS disks, and not within the
>> ceph cluster.
>> So if all mons are dead, which mean the disks which contained the mon data
>> are unrecoverable dead, you might need to bootstrap a new cluster and add
>> the OSDs to the new cluster. This will likely include tinkering with cephx
>> authentication, so you don't wipe the old OSD data.
>>
>> If you still have at least ONE mon alive, you can shut it down, and remove
>> all the other mons from the monmap and start it again. You CAN have
>> clusters with only one mon.
>>
>> Or is did your host just lost the boot disk and you just need to bring it
>> up somehow? losing 4x2 NVME disks at the same time, sounds a bit strange.
>>
>> Am Do., 2. Nov. 2023 um 11:34 Uhr schrieb Mohamed LAMDAOUAR <
>> mohamed.lamdao...@enyx.fr>:
>>
>> > Hello,
>> >
>> >   I have 7 machines on CEPH cluster, the service ceph runs on a docker
>> > container.
>> >  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
>> >   During a reboot, the ssd bricked on 4 machines, the data are
>> available on
>> > the HDD disk but the nvme is bricked and the system is not available.
>> is it
>> > possible to recover the data of the cluster (the data disk are all
>> > available)
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >
>>
>>
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> groüen Saal.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Malte Stroem

Hey Mohamed,

just send us the output of

ceph -s

and

ceph mon dump

please.

Best,
Malte

On 02.11.23 13:05, Mohamed LAMDAOUAR wrote:

Hi robert,

when I ran this command, I got this error (because the database of the osd
was on the boot disk)

ceph-objectstore-tool \

--type bluestore \
--data-path /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9 \
--op update-mon-db \
--mon-store-path /home/enyx-admin/backup-osd-9 \
--no-mon-config --debug

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42400
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open path
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42400
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open size
1827154432 (0x9187fc0, 9.1 TiB) block_size 4096 (4 KiB) rotational
device, discard not supported

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluestore(/var/lib/ceph/
c80891ba-55f3-11ed-9389-919f4368965c/osd.9) _set_cache_sizes cache_size
1073741824 meta 0.45 kv 0.45 data 0.06

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42c00
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open path
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42c00
/var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open size
1827154432 (0x9187fc0, 9.1 TiB) block_size 4096 (4 KiB) rotational
device, discard not supported

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs add_block_device bdev 1
path /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block size
9.1 TiB

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs mount

2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs _init_alloc shared, id
1, capacity 0x9187fc0, block size 0x1

2023-11-02T10:59:33.441+ 7f6724da71c0 -1 bluefs _replay 0x0: stop: uuid
369c96dd-2df1-8d88-2722-3f8334920e83 != super.uuid
ba94c6e8-394b-4a78-84d6-9afe1cbc280b,
block dump:
  42 9c 61 78 69 ec 36 9c  96 dd 2d f1 8d 88 27 22
  |B.axi.6...-...'"|
0010  3f 83 34 92 0e 83 e6 0f  8f 17 fc 3e ec 86 c5 15
  |?.4>|
0020  39 91 13 2e b0 14 92 86  65 75 5c 8e c1 ee fc 18
  |9...eu\.|
0030  f1 7b b2 37 f7 75 70 e2  5e da 79 cd e6 ad 27 40
  |.{.7.up.^.y...'@|
0040  d6 b8 3b da 81 1f 9b ba  c6 e8 b7 68 bc a1 77 ac
  |..;h..w.|
0050  7b a9 a3 cd 9d da b6 57  aa 40 bd ab d0 89 ec e6  |{..W.@
..|
0060  71 a2 2b 4d 87 74 2f ff  0a bf 3b da 3d da 93 52
  |q.+M.t/...;.=..R|
0070  1c ea f2 fb 8d e0 a1 e6  ef b5 42 5e 85 87 27 df
  |..B^..'.|
0080  ac f1 ae 08 9d c5 71 6f  0f f7 68 ce 28 3d 3e 6e
  |..qo..h.(=>n|
0090  94 b2 1a dc 3b f0 9e e9  6e 77 dd 95 b6 9e 94 56
  |;...nw.V|
00a0  f2 dd 9a 35 a0 65 78 05  bb a9 5f a1 99 6a 5c a1
  |...5.ex..._..j\.|
00b0  5d e9 6d 02 83 be 9d 60  d1 82 fc 6c 66 40 11 17  |].m`...lf@
..|
00c0  3a 4d 9d 73 f6 ec fb ed  41 db e2 39 15 e1 5f 28
  |:M.sA..9.._(|
00d0  c4 ce cf eb 93 f2 88 d5  af ae 11 14 d6 97 74 ff
  |..t.|
00e0  4b 7e 73 fe 97 4c 06 2a  3a bc b3 7f 04 94 6c 1d
  |K~s..L.*:.l.|
00f0  60 bf b1 42 fa 76 b0 df  33 ff bf 84 36 b1 b5 b3
  |`..B.v..3...6...|
0100  17 36 d6 b7 7d 4c d4 37  fa 7f 8e 59 1f 72 53 d5
  |.6..}L.7...Y.rS.|
0110  c4 d0 de d8 4e 13 ca c6  0a 60 87 3c e4 21 2b 1b
  |N`.<.!+.|
0120  00 f2 67 cf 0a 02 01 20  ec ec 7f c1 8f e3 df f8  |..g
|
0130  3f db 7f 60 28 14 8a fa  48 cb c6 f6 c7 9a 3f 71
  |?..`(...H.?q|
0140  bf 61 36 30 08 c0 f1 e7  f8 af b5 7f d2 fc ad a1
  |.a60|
0150  72 b2 40 ff 82 ff a3 c7  5f f0 a3 0e 8f b2 fe b6  |r.@
._...|
0160  ee 2f 5d fe 90 8b fa 28  8f 95 03 fa 5b ee e3 9c
  |./]([...|
0170  36 ea 3f 6a 1e c0 fe bb  c2 80 4a 56 ca 96 26 8f
  |6.?j..JV..&.|
0180  85 03 e0 f8 67 c9 3d a8  fa 97 af c5 c0 00 ce 7f
  |g.=.|
0190  cd 83 ff 36 ff c0 1c f0  7b c1 03 cf b7 b6 56 06
  |...6{.V.|
01a0  8a 30 7b 4d e0 5b 11 31  a0 12 cc d9 5e fb 7f 2d
  |.0{M.[.1^..-|
01b0  fb 47 04 df ea 1b 3d 3e  6c 1f f7 07 96 df 97 cf
  |.G=>l...|
01c0  15 60 76 56 0e b6 06 30  3b c0 6f 11 0a 40 19 98  |.`vV...0;.o..@
..|
01d0  a1 89 fe e3 b6 f3 28 80  83 e5 c1 1d 9c ac da 40
  |..(@|
01e0  71 5b 2b 07 eb 07 2e 8a  0f 71 7f 88 aa f5 23 0b
  |q[+..q#.|
01f0  03 17 a0 b0 e2 c3 76 e3  68 62 00 53 10 17 02 4a
  |..v.hb.S...J|
0200  02 ec 1f 72 82 8f 0f 28  fc a0 e0 83 04 3b c0 e3
  |...r...(.;..|
0210  6b 25 15 fe ae ce df de  33 29 6c e5 f0 68 c6 1f
  |k%..3)l..h..|
0220  e9 01 00 ff f1 be c9 c7  f4 f8 73 fc bf cc 80 fe
  |..s.|
0230  73 1d 08 a8 64 62 6f 0e  e3 11 13 15 13 03 81 58
  |s...dboX|
0240  a1 20 10 9b f0 43 e3 7c  68 2c 0f ed 21 34 10 10  |.
...C

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Robert Sander

Hi,

On 11/2/23 13:05, Mohamed LAMDAOUAR wrote:


when I ran this command, I got this error (because the database of the 
osd was on the boot disk)


The RocksDB part of the OSD was on the failed SSD?

Then the OSD is lost and cannot be recovered.
The RocksDB contains the information about where each object is stored on the 
OSD data partition, and without it nobody knows where each object is. The 
data is lost.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread David C.
Hi Mohamed,

I understand there's one operational monitor, isn't there?
If so, you need to reprovision the other monitors with an empty store so that
they synchronize with the only remaining monitor.
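With cephadm that would essentially be: remove the dead mons from the monmap
as discussed earlier in the thread, then let the orchestrator deploy fresh
ones, for example (hostnames are placeholders):

ceph orch apply mon --placement="<host1>,<host2>,<host3>"

The new mons start with an empty store and sync from the surviving one.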


Cordialement,

*David CASIER*





Le jeu. 2 nov. 2023 à 13:42, Mohamed LAMDAOUAR 
a écrit :

> Hi robert,
>
> when I ran this command, I got this error (because the database of the osd
> was on the boot disk)
>
> ceph-objectstore-tool \
> > --type bluestore \
> > --data-path /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9 \
> > --op update-mon-db \
> > --mon-store-path /home/enyx-admin/backup-osd-9 \
> > --no-mon-config --debug
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42400
> /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open path
> /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block
>
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42400
> /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open size
> 1827154432 (0x9187fc0, 9.1 TiB) block_size 4096 (4 KiB) rotational
> device, discard not supported
>
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluestore(/var/lib/ceph/
> c80891ba-55f3-11ed-9389-919f4368965c/osd.9) _set_cache_sizes cache_size
> 1073741824 meta 0.45 kv 0.45 data 0.06
>
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42c00
> /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open path
> /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block
>
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bdev(0x560257b42c00
> /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block) open size
> 1827154432 (0x9187fc0, 9.1 TiB) block_size 4096 (4 KiB) rotational
> device, discard not supported
>
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs add_block_device bdev 1
> path /var/lib/ceph/c80891ba-55f3-11ed-9389-919f4368965c/osd.9/block size
> 9.1 TiB
>
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs mount
>
> 2023-11-02T10:59:33.381+ 7f6724da71c0  1 bluefs _init_alloc shared, id
> 1, capacity 0x9187fc0, block size 0x1
>
> 2023-11-02T10:59:33.441+ 7f6724da71c0 -1 bluefs _replay 0x0: stop: uuid
> 369c96dd-2df1-8d88-2722-3f8334920e83 != super.uuid
> ba94c6e8-394b-4a78-84d6-9afe1cbc280b,
> block dump:
>   42 9c 61 78 69 ec 36 9c  96 dd 2d f1 8d 88 27 22
>  |B.axi.6...-...'"|
> 0010  3f 83 34 92 0e 83 e6 0f  8f 17 fc 3e ec 86 c5 15
>  |?.4>|
> 0020  39 91 13 2e b0 14 92 86  65 75 5c 8e c1 ee fc 18
>  |9...eu\.|
> 0030  f1 7b b2 37 f7 75 70 e2  5e da 79 cd e6 ad 27 40
>  |.{.7.up.^.y...'@|
> 0040  d6 b8 3b da 81 1f 9b ba  c6 e8 b7 68 bc a1 77 ac
>  |..;h..w.|
> 0050  7b a9 a3 cd 9d da b6 57  aa 40 bd ab d0 89 ec e6  |{..W.@
> ..|
> 0060  71 a2 2b 4d 87 74 2f ff  0a bf 3b da 3d da 93 52
>  |q.+M.t/...;.=..R|
> 0070  1c ea f2 fb 8d e0 a1 e6  ef b5 42 5e 85 87 27 df
>  |..B^..'.|
> 0080  ac f1 ae 08 9d c5 71 6f  0f f7 68 ce 28 3d 3e 6e
>  |..qo..h.(=>n|
> 0090  94 b2 1a dc 3b f0 9e e9  6e 77 dd 95 b6 9e 94 56
>  |;...nw.V|
> 00a0  f2 dd 9a 35 a0 65 78 05  bb a9 5f a1 99 6a 5c a1
>  |...5.ex..._..j\.|
> 00b0  5d e9 6d 02 83 be 9d 60  d1 82 fc 6c 66 40 11 17  |].m`...lf@
> ..|
> 00c0  3a 4d 9d 73 f6 ec fb ed  41 db e2 39 15 e1 5f 28
>  |:M.sA..9.._(|
> 00d0  c4 ce cf eb 93 f2 88 d5  af ae 11 14 d6 97 74 ff
>  |..t.|
> 00e0  4b 7e 73 fe 97 4c 06 2a  3a bc b3 7f 04 94 6c 1d
>  |K~s..L.*:.l.|
> 00f0  60 bf b1 42 fa 76 b0 df  33 ff bf 84 36 b1 b5 b3
>  |`..B.v..3...6...|
> 0100  17 36 d6 b7 7d 4c d4 37  fa 7f 8e 59 1f 72 53 d5
>  |.6..}L.7...Y.rS.|
> 0110  c4 d0 de d8 4e 13 ca c6  0a 60 87 3c e4 21 2b 1b
>  |N`.<.!+.|
> 0120  00 f2 67 cf 0a 02 01 20  ec ec 7f c1 8f e3 df f8  |..g
> |
> 0130  3f db 7f 60 28 14 8a fa  48 cb c6 f6 c7 9a 3f 71
>  |?..`(...H.?q|
> 0140  bf 61 36 30 08 c0 f1 e7  f8 af b5 7f d2 fc ad a1
>  |.a60|
> 0150  72 b2 40 ff 82 ff a3 c7  5f f0 a3 0e 8f b2 fe b6  |r.@
> ._...|
> 0160  ee 2f 5d fe 90 8b fa 28  8f 95 03 fa 5b ee e3 9c
>  |./]([...|
> 0170  36 ea 3f 6a 1e c0 fe bb  c2 80 4a 56 ca 96 26 8f
>  |6.?j..JV..&.|
> 0180  85 03 e0 f8 67 c9 3d a8  fa 97 af c5 c0 00 ce 7f
>  |g.=.|
> 0190  cd 83 ff 36 ff c0 1c f0  7b c1 03 cf b7 b6 56 06
>  |...6{.V.|
> 01a0  8a 30 7b 4d e0 5b 11 31  a0 12 cc d9 5e fb 7f 2d
>  |.0{M.[.1^..-|
> 01b0  fb 47 04 df ea 1b 3d 3e  6c 1f f7 07 96 df 97 cf
>  |.G=>l...|
> 01c0  15 60 76 56 0e b6 06 30  3b c0 6f 11 0a 40 19 98  |.`vV...0;.o..@
> ..|
> 01d0  a1 89 fe e3 b6 f3 28 80  83 e5 c1 1d 9c ac da 40
>  |..(@|
> 01e0  71 5b 2b 07 eb 07 2e 8a  0f 71 7f 88 aa f5 23 0b
>  |q[+..q#.|
> 01f0  0

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Anthony D'Atri
This admittedly is the case throughout the docs.

> On Nov 2, 2023, at 07:27, Joachim Kraftmayer - ceph ambassador 
>  wrote:
> 
> Hi,
> 
> another short note regarding the documentation, the paths are designed for a 
> package installation.
> 
> the paths for container installation look a bit different e.g.: 
> /var/lib/ceph//osd.y/
> 
> Joachim
> 
> ___
> ceph ambassador DACH
> ceph consultant since 2012
> 
> Clyso GmbH - Premier Ceph Foundation Member
> 
> https://www.clyso.com/
> 
> Am 02.11.23 um 12:02 schrieb Robert Sander:
>> Hi,
>> 
>> On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:
>> 
>>>I have 7 machines on CEPH cluster, the service ceph runs on a docker
>>> container.
>>>   Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
>>>During a reboot, the ssd bricked on 4 machines, the data are available on
>>> the HDD disk but the nvme is bricked and the system is not available. is it
>>> possible to recover the data of the cluster (the data disk are all
>>> available)
>> 
>> You can try to recover the MON db from the OSDs, as they keep a copy of it:
>> 
>> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures
>>  
>> 
>> Regards
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread David C.
Hi,

I've just checked with the team and the situation is much more serious than
it seems: the lost disks contained the MONs AND OSDs databases (5 servers
down out of 8, replica 3).

It seems that the team fell victim to a bad batch of Samsung 980 Pros (I'm
not a big fan of this "Pro" range, but that's not the point), which have
never been able to restart since the incident.

Someone please correct me, but as far as I'm concerned, the cluster is lost.


Cordialement,

*David CASIER*




*Ligne directe: +33(0) 9 72 61 98 29*




Le jeu. 2 nov. 2023 à 15:49, Anthony D'Atri  a écrit :

> This admittedly is the case throughout the docs.
>
> > On Nov 2, 2023, at 07:27, Joachim Kraftmayer - ceph ambassador <
> joachim.kraftma...@clyso.com> wrote:
> >
> > Hi,
> >
> > another short note regarding the documentation, the paths are designed
> for a package installation.
> >
> > the paths for container installation look a bit different e.g.:
> /var/lib/ceph//osd.y/
> >
> > Joachim
> >
> > ___
> > ceph ambassador DACH
> > ceph consultant since 2012
> >
> > Clyso GmbH - Premier Ceph Foundation Member
> >
> > https://www.clyso.com/
> >
> > Am 02.11.23 um 12:02 schrieb Robert Sander:
> >> Hi,
> >>
> >> On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:
> >>
> >>>I have 7 machines on CEPH cluster, the service ceph runs on a docker
> >>> container.
> >>>   Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
> >>>During a reboot, the ssd bricked on 4 machines, the data are
> available on
> >>> the HDD disk but the nvme is bricked and the system is not available.
> is it
> >>> possible to recover the data of the cluster (the data disk are all
> >>> available)
> >>
> >> You can try to recover the MON db from the OSDs, as they keep a copy of
> it:
> >>
> >>
> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures
> >>
> >> Regards
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] resharding RocksDB after upgrade to Pacific breaks OSDs

2023-11-02 Thread Denis Polom

Hi

we upgraded our Ceph cluster from latest Octopus to Pacific 16.2.14 and 
then we followed the docs 
(https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/#rocksdb-sharding 
) 
to reshard RocksDB on our OSDs.


Although resharding reports the operation as successful, the OSD fails to start.

# ceph-bluestore-tool  --path /var/lib/ceph/osd/ceph-5/ --sharding="m(3) 
p(3,0-12) o(3,0-13)=block_cache={type=binned_lru} l p" reshard

reshard success
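For completeness, the sharding that actually got written can be inspected
(with the OSD stopped, which it is here since it does not start) with
something like:

ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-5/ show-sharding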

Oct 30 12:44:17 octopus2 ceph-osd[4521]: 
/build/ceph-16.2.14/src/kv/RocksDBStore.cc: 1223: FAILED 
ceph_assert(recreate_mode)
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  ceph version 16.2.14 
(238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  1: 
(ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x14b) [0x564047cb92b2]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  2: 
/usr/bin/ceph-osd(+0xaa948a) [0x564047cb948a]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  3: 
(RocksDBStore::do_open(std::ostream&, bool, bool, 
std::__cxx11::basic_string, 
std::allocator > const&)+0x1609) [0x564048794829]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  4: (BlueStore::_open_db(bool, 
bool, bool)+0x601) [0x564048240421]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  5: 
(BlueStore::_open_db_and_around(bool, bool)+0x26b) [0x5640482a5f8b]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  6: (BlueStore::_mount()+0x9c) 
[0x5640482a896c]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  7: (OSD::init()+0x38a) 
[0x564047daacea]

Oct 30 12:44:17 octopus2 ceph-osd[4521]:  8: main()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  9: __libc_start_main()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  10: _start()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  0> 
2023-10-30T12:44:17.088+ 7f4971ed2100 -1 *** Caught signal (Aborted) **
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  in thread 7f4971ed2100 
thread_name:ceph-osd
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  ceph version 16.2.14 
(238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  1: 
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730) [0x7f4972921730]

Oct 30 12:44:17 octopus2 ceph-osd[4521]:  2: gsignal()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  3: abort()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  4: 
(ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x19c) [0x564047cb9303]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  5: 
/usr/bin/ceph-osd(+0xaa948a) [0x564047cb948a]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  6: 
(RocksDBStore::do_open(std::ostream&, bool, bool, 
std::__cxx11::basic_string, 
std::allocator > const&)+0x1609) [0x564048794829]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  7: (BlueStore::_open_db(bool, 
bool, bool)+0x601) [0x564048240421]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  8: 
(BlueStore::_open_db_and_around(bool, bool)+0x26b) [0x5640482a5f8b]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  9: (BlueStore::_mount()+0x9c) 
[0x5640482a896c]
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  10: (OSD::init()+0x38a) 
[0x564047daacea]

Oct 30 12:44:17 octopus2 ceph-osd[4521]:  11: main()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  12: __libc_start_main()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  13: _start()
Oct 30 12:44:17 octopus2 ceph-osd[4521]:  NOTE: a copy of the 
executable, or `objdump -rdS ` is needed to interpret this.
Oct 30 12:44:17 octopus2 ceph-osd[4521]: -1> 
2023-10-30T12:44:17.084+ 7f4971ed2100 -1 
/build/ceph-16.2.14/src/kv/RocksDBStore.cc: In function 'int 
RocksDBStore::do_open(std::ostream&, bool, bool, const string&)' thread 
7f4971ed2100 time 2023-10-30T12:44:17.087172+


I've submitted a bug report here, https://tracker.ceph.com/issues/63353, but 
maybe the community here has some ideas on how to fix it, unless it's really 
a bug.


Thanks
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW access logs with bucket name

2023-11-02 Thread Dan van der Ster
Using the ops log is a good option -- I had missed that it can now log
to a file. In Quincy:

# ceph config set global rgw_ops_log_rados false
# ceph config set global rgw_ops_log_file_path
'/var/log/ceph/ops-log-$cluster-$name.log'
# ceph config set global rgw_enable_ops_log true

Then restart all RGWs.
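
To double-check that the settings landed and to restart the gateways on a
cephadm deployment, something like this works ("rgw.default" is just a
placeholder service name; non-cephadm clusters will restart radosgw
differently):

# ceph config dump | grep ops_log
# ceph orch restart rgw.default

Note that with containerized RGWs the file path above is inside the daemon's
container; cephadm normally bind-mounts /var/log/ceph/<fsid> from the host
there, so that is where I would expect the file to show up.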

Thanks!

Dan

--
Dan van der Ster
CTO

Clyso GmbH
p: +49 89 215252722 | a: Vancouver, Canada
w: https://clyso.com | e: dan.vanders...@clyso.com

We are hiring: https://www.clyso.com/jobs/

On Mon, Oct 30, 2023 at 7:19 AM Casey Bodley  wrote:
>
> another option is to enable the rgw ops log, which includes the bucket
> name for each request
>
> the http access log line that's visible at log level 1 follows a known
> apache format that users can scrape, so i've resisted adding extra
> s3-specific stuff like bucket/object names there. there was some
> recent discussion around this in
> https://github.com/ceph/ceph/pull/50350, which had originally extended
> that access log line
>
> On Mon, Oct 30, 2023 at 6:03 AM Boris Behrens  wrote:
> >
> > Hi Dan,
> >
> > we are currently moving all the logging into lua scripts, so it is not an
> > issue anymore for us.
> >
> > Thanks
> >
> > ps: the ceph analyzer is really cool. plusplus
> >
> > On Sat, 28 Oct 2023 at 22:03, Dan van der Ster <
> > dan.vanders...@clyso.com> wrote:
> >
> > > Hi Boris,
> > >
> > > I found that you need to use debug_rgw=10 to see the bucket name :-/
> > >
> > > e.g.
> > > 2023-10-28T19:55:42.288+ 7f34dde06700 10 req 3268931155513085118
> > > 0.0s s->object=... s->bucket=xyz-bucket-123
> > >
> > > Did you find a more convenient way in the meantime? I think we should
> > > log bucket name at level 1.
> > >
> > > Cheers, Dan
> > >
> > > --
> > > Dan van der Ster
> > > CTO
> > >
> > > Clyso GmbH
> > > p: +49 89 215252722 | a: Vancouver, Canada
> > > w: https://clyso.com | e: dan.vanders...@clyso.com
> > >
> > > Try our Ceph Analyzer: https://analyzer.clyso.com
> > >
> > > On Thu, Mar 30, 2023 at 4:15 AM Boris Behrens  wrote:
> > > >
> > > > Sadly not.
> > > > I only see the the path/query of a request, but not the hostname.
> > > > So when a bucket is accessed via hostname (
> > > https://bucket.TLD/object?query)
> > > > I only see the object and the query (GET /object?query).
> > > > When a bucket is accessed bia path (https://TLD/bucket/object?query) I
> > > can
> > > > see also the bucket in the log (GET bucket/object?query)
> > > >
> > > > On Thu, 30 Mar 2023 at 12:58, Szabo, Istvan (Agoda) <
> > > > istvan.sz...@agoda.com> wrote:
> > > >
> > > > > It has the full url begins with the bucket name in the beast logs http
> > > > > requests, hasn’t it?
> > > > >
> > > > > Istvan Szabo
> > > > > Staff Infrastructure Engineer
> > > > > ---
> > > > > Agoda Services Co., Ltd.
> > > > > e: istvan.sz...@agoda.com
> > > > > ---
> > > > >
> > > > > On 2023. Mar 30., at 17:44, Boris Behrens  wrote:
> > > > >
> > > > > Email received from the internet. If in doubt, don't click any link
> > > nor
> > > > > open any attachment !
> > > > > 
> > > > >
> > > > > Bringing up that topic again:
> > > > > is it possible to log the bucket name in the rgw client logs?
> > > > >
> > > > > currently I am only to know the bucket name when someone access the
> > > bucket
> > > > > via https://TLD/bucket/object instead of https://bucket.TLD/object.
> > > > >
> > > > > On Tue, 3 Jan 2023 at 10:25, Boris Behrens wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I am looking forward to move our logs from
> > > > >
> > > > > /var/log/ceph/ceph-client...log to our logaggregator.
> > > > >
> > > > >
> > > > > Is there a way to have the bucket name in the log file?
> > > > >
> > > > >
> > > > > Or can I write the rgw_enable_ops_log into a file? Maybe I could work
> > > with
> > > > >
> > > > > this.
> > > > >
> > > > >
> > > > > Cheers and happy new year
> > > > >
> > > > > Boris
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > The "UTF-8 problems" self-help group will meet this time, as an
> > > > > exception, in the big hall.
> > > > > ___
> > > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > > > >
> > > > >
> > > > > --
> > > > > This message is confidential and is for the sole use of the intended
> > > > > recipient(s). It may also be privileged or otherwise protected by
> > > copyright
> > > > > or other legal rules. If you have received it by mistake please let us
> > > know
> > > > > by reply email and delete it from your system. It is prohibited to 
> > > > > copy
> > > > > this message or disclose its content to anyone. Any confidentiality or
> > > > > privilege is not waived or lost by any mistaken delivery or
> > > unauthorized
> > 

[ceph-users] Re: Ceph OSD reported Slow operations

2023-11-02 Thread V A Prabha
Is it possible to move the OSDs safely (mark the OSDs out, let their content
migrate to other OSDs, remove them, and redeploy them fresh on less loaded
nodes)? I have very critical production workloads (government applications).
Please advise on the safest way to stabilize the environment without data
loss.
Moreover, we have 64 TB of available storage in the Ceph cluster. Is it
possible to release at least 50% of that storage, or would that make the
cluster very busy again?
The client feels that by using 3 replicas and holding this much spare
storage, we are not using the storage in an optimal way.
Please guide us
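
For context, what I had in mind is roughly the standard one-OSD-at-a-time
drain below (just a sketch, with osd.12 as a placeholder id); please correct
me if this is not safe on a busy cluster:

# ceph osd out 12                          (data starts migrating off osd.12)
# ceph -s                                  (wait until all PGs are active+clean again)
# ceph osd safe-to-destroy osd.12
# systemctl stop ceph-osd@12               (or: ceph orch daemon stop osd.12)
# ceph osd purge 12 --yes-i-really-mean-it

The disk would then be redeployed as a new OSD on a less loaded node.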


On November 2, 2023 at 12:47 PM Zakhar Kirpichenko  wrote:

>  >1. The calculated IOPS is for the rw operation right ?
> 
>  Total drive IOPS, read or write. Depending on the exact drive models, it may
> be lower or higher than 200. I took the average for a smaller sized 7.2k rpm
> SAS drive. Modern drives usually deliver lower read IOPS and higher write
> IOPS.
> 
>  >2. Cluster is very busy? Is there any misconfiguration or missing tuning
>  >paramater that makes the cluster busy?
> 
>  You have almost 3k IOPS and your OSDs report slow ops. I'd say the cluster is
> busy, as in loaded with I/O, perhaps more I/O than it can handle well.
> 
>  >3. Nodes are not balanced?  you mean to say that the count of OSDs in each
>  >server differs. But we have enabled autoscale and optimal distribution so
>  >that you can see from the output of ceph osd df tree that is count of
>  >pgs(45/OSD) and use% (65 to 67%). Is that not significant?
> 
>  Yes, the OSD count differs. This means that the CPU, memory usage, network
> load and latency differ per node and may cause performance variations,
> depending on your workload.
> 
>  /Z
> 
>  On Thu, 2 Nov 2023 at 08:18, V A Prabha <prab...@cdac.in> wrote:
> >Thanks for your prompt reply ..
> >But the query is
> >1.The calculated IOPS is for the rw operation right ?
> >2. Cluster is very busy? Is there any misconfiguration or missing tuning
> > paramater that makes the cluster busy?
> >3. Nodes are not balanced?  you mean to say that the count of OSDs in
> > each server differs. But we have enabled autoscale and optimal distribution
> > so that you can see from the output of ceph osd df tree that is count of
> > pgs(45/OSD) and use% (65 to 67%). Is that not significant?
> >Correct me if my queries are irrelevant
> > 
> > 
> > 
> >On November 2, 2023 at 11:36 AM Zakhar Kirpichenko <zak...@gmail.com> wrote:
> > 
> > > Sure, it's 36 OSDs at 200 IOPS each (tops, likely lower), I assume
> > > size=3 replication so 1/3 of the total performance, and some 30%-ish
> > > OSD overhead.
> > > 
> > > (36 x 200) * 1/3 * 0.7 = 1680. That's how many IOPS you can
> > > realistically expect from your cluster. You get more than that, but the
> > > cluster is very busy and OSDs aren't coping.
> > > 
> > > Also your nodes are not balanced.
> > > 
> > > /Z
> > > 
> > > On Thu, 2 Nov 2023 at 07:33, V A Prabha <prab...@cdac.in> wrote:
> > > > Can you please elaborate your identifications and the statement.
> > > > 
> > > > On November 2, 2023 at 9:40 AM Zakhar Kirpichenko <zak...@gmail.com> wrote:
> > > > 
> > > > > I'm afraid you're simply hitting the I/O limits of your disks.
> > > > > 
> > > > > /Z
> > > > > 
> > > > > On Thu, 2 Nov 2023 at 03:40, V A Prabha <prab...@cdac.in> wrote:
> > > > > > Hi Eugen
> > > > > > Please find the details below
> > > > > > 
> > > > > > 
> > > > > >  root@meghdootctr1:/var/log/ceph# ceph -s
> > > > > >  cluster:
> > > > > >  id: c59da971-57d1-43bd-b2b7-865d392412a5
> > > > > >  health: HEALTH_WARN
> > > > > >  nodeep-scrub flag(s) set
> > > > > >  544 pgs not deep-scrubbed in time
> > > > > > 
> > > > > >  services:
> > > > > >  mon: 3 daemons, quorum
> > > > > > meghdootctr1,meghdootctr2,meghdootctr3 (age 5d)
> > > > > >  mgr: meghdootctr1(active, since 5d), standbys:
> > > > > > meghdootctr2, meghdootctr3
> > > > > >  mds: 3 up:standby
> > > > > >  osd: 36 osds: 36 up (since 34h), 36 in (since 34h)
> > > > > >  flags nodeep-scrub
> > > > > > 
> > > > > >  data:
> > > > > >  pools: 2 pools, 544 pgs
> > > > > >  objects: 10.14M objects, 39 TiB
> > > > > >  usage: 116 TiB used, 63 TiB / 179 TiB avail
> > > > > >  pgs: 544 active+clean
> > > > > > 
> > > > > >  io:
> > > > > >  client: 24 MiB/s rd, 16 MiB/s wr, 2.02k op/s rd, 907 op/s
> > > > > > wr
> > > > > > 
> > > > > > 
> > > > > >  Ceph Versions:
> > > > >

[ceph-users] Re: 17.2.7 quincy dashboard issues

2023-11-02 Thread Matthew Darwin
In my case I'm adding a label that is unique to each ceph cluster and 
then can filter on that.  In my ceph dashboard in grafana I've added a 
pull-down list to check each different ceph cluster.


You need a way for me to configure what labels to filter on so I can 
match it up with how I configured the prometheus. Alternately publish 
all metrics to prometheus with fsid label then you can auto-filter 
based on the fsid of the ceph cluster since fsid is unique.
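
As a concrete illustration of what I mean (the label name "cluster" and the
value "prod" are just examples; how the label gets attached depends on your
scrape configuration): each cluster's scrape job carries a static label, set
via static_configs labels or external_labels in prometheus.yml, and every
dashboard query is then restricted to that label, e.g.

  sum(rate(ceph_pool_wr{cluster="prod"}[1m]))
  ceph_cluster_total_used_bytes{cluster="prod"}

The grafana pull-down is just a template variable built from
label_values(ceph_health_status, cluster).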


On 2023-11-02 01:03, Nizamudeen A wrote:


We have 4 ceph clusters going into the same prometheus instance.

Just curious, In the prometheus, if you want to see the details for 
a single cluster, how's it done through query?


For reference, these are the queries that we are currently using now.

USEDCAPACITY = 'ceph_cluster_total_used_bytes',
  WRITEIOPS = 'sum(rate(ceph_pool_wr[1m]))',
  READIOPS = 'sum(rate(ceph_pool_rd[1m]))',
  READLATENCY = 'avg_over_time(ceph_osd_apply_latency_ms[1m])',
  WRITELATENCY = 'avg_over_time(ceph_osd_commit_latency_ms[1m])',
  READCLIENTTHROUGHPUT = 'sum(rate(ceph_pool_rd_bytes[1m]))',
  WRITECLIENTTHROUGHPUT = 'sum(rate(ceph_pool_wr_bytes[1m]))',
  RECOVERYBYTES = 'sum(rate(ceph_osd_recovery_bytes[1m]))'

We might not have considered the possibility of multiple 
ceph-clusters pointing to a single prometheus instance.
In that case there should be some filtering done with cluster id or 
something to properly identify it.


FYI @Pedro Gonzalez Gomez  @Ankush Behl 
 @Aashish Sharma 


Regards,
Nizam

On Mon, Oct 30, 2023 at 11:05 PM Matthew Darwin  wrote:

Ok, so I tried the new ceph dashboard by "set-prometheus-api-host"
(note "host" and not "url") and it returns the wrong data.  We
have 4
ceph clusters going into the same prometheus instance.  How does it
know which data to pull? Do I need to pass a promql query?

The capacity widget at the top right (not using prometheus)
shows 35%
of 51 TiB used (test cluster data)... This is correct. The chart
shows
use capacity is 1.7 PiB, which is coming from the production
cluster
(incorrect).

Ideas?


On 2023-10-30 11:30, Nizamudeen A wrote:
> Ah yeah, probably that's why the utilization charts are empty
> because it relies on
> the prometheus info.
>
> And I raised a PR to disable the new dashboard in quincy.
> https://github.com/ceph/ceph/pull/54250
>
> Regards,
> Nizam
>
> On Mon, Oct 30, 2023 at 6:09 PM Matthew Darwin
 wrote:
>
>     Hello,
>
>     We're not using prometheus within ceph (ceph dashboards
show in our
>     grafana which is hosted elsewhere). The old dashboard
showed the
>     metrics fine, so not sure why in a patch release we would need
>     to make
>     configuration changes to get the same metrics Agree it
>     should be
>     off by default.
>
>     "ceph dashboard feature disable dashboard" works to put
the old
>     dashboard back.  Thanks.
>
>     On 2023-10-30 00:09, Nizamudeen A wrote:
>     > Hi Matthew,
>     >
>     > Is the prometheus configured in the cluster? And also the
>     > PROMETHUEUS_API_URL is set? You can set it manually by ceph
>     dashboard
>     > set-prometheus-api-url .
>     >
>     > You can switch to the old Dashboard by switching the feature
>     toggle in the
>     > dashboard. `ceph dashboard feature disable dashboard` and
>     reloading the
>     > page. Probably this should have been disabled by default.
>     >
>     > Regards,
>     > Nizam
>     >
>     > On Sun, Oct 29, 2023, 23:04 Matthew
Darwin wrote:
>     >
>     >> Hi all,
>     >>
>     >> I see17.2.7 quincy is published as debian-bullseye
packages.
>     So I
>     >> tried it on a test cluster.
>     >>
>     >> I must say I was not expecting the big dashboard change
in a
>     patch
>     >> release.  Also all the "cluster utilization" numbers
are all
>     blank now
>     >> (any way to fix it?), so the dashboard is much less
usable now.
>     >>
>     >> Thoughts?
>     >> ___
>     >> ceph-users mailing list --ceph-users@ceph.io
>     >> To unsubscribe send an email toceph-users-le...@ceph.io
>     >>
>     > ___
>     > ceph-users mailing list --ceph-users@ceph.io
>     > To unsubscribe send an email toceph-users-le...@ceph.io
>     ___
>     ceph-users mailing list -- ceph-users@ceph.io
>     To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph

[ceph-users] Re: upgrade 17.2.6 to 17.2.7 , any issues?

2023-11-02 Thread Dmitry Melekhov

On 03.11.2023 04:33, Reto Gysi wrote:

Hi

I had 2 issues:

1. I got hit by https://tracker.ceph.com/issues/63118 which also happened
with multi-arch deployment upgrade from 17.2.5 to 17.2.7.
 The workaround  worked for me.

2. I got some BLUEFS_SPILLOVER warnings after the upgrade.  So I increased
the block.db size and the warnings eventually went away after some time.
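
For anyone hitting the same warning, the usual sequence for growing block.db
looks roughly like this. It is only a sketch: osd.3 and the LVM names are
placeholders, and the stop/start commands depend on how the OSDs are deployed:

# ceph health detail                       (lists the OSDs with BLUEFS_SPILLOVER)
# ceph daemon osd.3 perf dump bluefs       (on the OSD's host: db_used_bytes vs db_total_bytes, slow_used_bytes)
# systemctl stop ceph-osd@3
# lvextend -L +30G /dev/ceph-db-vg/osd-3-db
# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-3
# systemctl start ceph-osd@3

The warning should clear once compaction moves the spilled data back onto the
enlarged DB device.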


Thank you!

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph OSD reported Slow operations

2023-11-02 Thread Janne Johansson
On Thu, 2 Nov 2023 at 23:46, V A Prabha wrote:
>
> Is it possible to move the OSDs safely (mark the OSDs out, let their content
> migrate to other OSDs, remove them, and redeploy them fresh on less loaded
> nodes)?


> The client feels that by using 3 replicas and holding this much spare
> storage, we are not using the storage in an optimal way.

These two sentences don't really add up.

Ceph has replica=3 as a default in order for drives and hosts to be
able to crash, so that you can recover without losing redundancy. As
soon as you lose redundancy, there is no "safe" anything, you are
immediately in danger of losing data so that you can never get it back
from ceph.
If the client thinks you are wasting space, then they must not care
for the data, because ANY random hiccup, any broken sector somewhere
becomes a data-loss event if the other copy is being moved, or that
server is having maintenance or whatever. With only two copies of the
data, you can never reboot a server, and upgrades mean the cluster stops
serving data.

The joke in the 80s (might be older than that of course) was:
"Data is binary, either it is important and backed up, or it is not important"

Ceph chooses to treat your data as important. You can lower the
expectations by reducing replicas, or reduce perf with erasure coding
(but I understand that this whole thread is about poor total
performance of both client traffic and scrubs and so on), but the
defaults are there to protect you from any random bit flip on one of
the disks and this will happen. Not "perhaps", with enough drives
and/or enough time, this is a certainty. Your design of the storage
decides how you will survive such an event.

If you have three copies of all data, you can be doing (planned or
unplanned) maintenance on one OSD box, notice an error somewhere on
another box and still recover from this using the third copy.

With only two replicas, you can't. The maintenance OSD host will have
missed IO that went on while it was away, and the second copy is known
bad. You could hope that you never need to do maintenance, but with
few exceptions, hosts will reboot at times, whether you plan it or
not.
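
If it helps the discussion with the client, the actual protection level is
easy to show; a quick check (the pool name "volumes" is just an example,
substitute your own pools):

# ceph osd pool ls detail
# ceph osd pool get volumes size
# ceph osd pool get volumes min_size

size is the number of copies kept and min_size is how many must be available
for IO to continue; with size=3 and min_size=2 the cluster keeps serving data
through a single failure, which is exactly the headroom that "spare" capacity
pays for.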



-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 17.2.7 quincy dashboard issues

2023-11-02 Thread Nizamudeen A
>
> Alternately publish
> all metrics to prometheus with fsid label then you can auto-filter
> based on the fsid of the ceph cluster since fsid is unique.
>

This is exactly something we are looking into, as part of providing support
for multi-cluster monitoring and management from the Ceph Dashboard, which is
currently an ongoing PoC. Thanks for providing more context here.

As of now I don't see a way to make it configurable in the dashboard, but you
can expect this to be added in the near future.

Regards,
Nizam

On Fri, Nov 3, 2023 at 7:54 AM Matthew Darwin  wrote:

> In my case I'm adding a label that is unique to each ceph cluster and
> then can filter on that.  In my ceph dashboard in grafana I've added a
> pull-down list to check each different ceph cluster.
>
> You need a way for me to configure what labels to filter on so I can
> match it up with how I configured the prometheus. Alternately publish
> all metrics to prometheus with fsid label then you can auto-filter
> based on the fsid of the ceph cluster since fsid is unique.
>
> On 2023-11-02 01:03, Nizamudeen A wrote:
> >
> > We have 4 ceph clusters going into the same prometheus instance.
> >
> > Just curious, In the prometheus, if you want to see the details for
> > a single cluster, how's it done through query?
> >
> > For reference, these are the queries that we are currently using now.
> >
> > USEDCAPACITY = 'ceph_cluster_total_used_bytes',
> >   WRITEIOPS = 'sum(rate(ceph_pool_wr[1m]))',
> >   READIOPS = 'sum(rate(ceph_pool_rd[1m]))',
> >   READLATENCY = 'avg_over_time(ceph_osd_apply_latency_ms[1m])',
> >   WRITELATENCY = 'avg_over_time(ceph_osd_commit_latency_ms[1m])',
> >   READCLIENTTHROUGHPUT = 'sum(rate(ceph_pool_rd_bytes[1m]))',
> >   WRITECLIENTTHROUGHPUT = 'sum(rate(ceph_pool_wr_bytes[1m]))',
> >   RECOVERYBYTES = 'sum(rate(ceph_osd_recovery_bytes[1m]))'
> >
> > We might not have considered the possibility of multiple
> > ceph-clusters pointing to a single prometheus instance.
> > In that case there should be some filtering done with cluster id or
> > something to properly identify it.
> >
> > FYI @Pedro Gonzalez Gomez  @Ankush Behl
> >  @Aashish Sharma 
> >
> > Regards,
> > Nizam
> >
> > On Mon, Oct 30, 2023 at 11:05 PM Matthew Darwin  wrote:
> >
> > Ok, so I tried the new ceph dashboard by "set-prometheus-api-host"
> > (note "host" and not "url") and it returns the wrong data.  We
> > have 4
> > ceph clusters going into the same prometheus instance.  How does it
> > know which data to pull? Do I need to pass a promql query?
> >
> > The capacity widget at the top right (not using prometheus)
> > shows 35%
> > of 51 TiB used (test cluster data)... This is correct. The chart
> > shows
> > use capacity is 1.7 PiB, which is coming from the production
> > cluster
> > (incorrect).
> >
> > Ideas?
> >
> >
> > On 2023-10-30 11:30, Nizamudeen A wrote:
> > > Ah yeah, probably that's why the utilization charts are empty
> > > because it relies on
> > > the prometheus info.
> > >
> > > And I raised a PR to disable the new dashboard in quincy.
> > > https://github.com/ceph/ceph/pull/54250
> > >
> > > Regards,
> > > Nizam
> > >
> > > On Mon, Oct 30, 2023 at 6:09 PM Matthew Darwin
> >  wrote:
> > >
> > > Hello,
> > >
> > > We're not using prometheus within ceph (ceph dashboards
> > show in our
> > > grafana which is hosted elsewhere). The old dashboard
> > showed the
> > > metrics fine, so not sure why in a patch release we would need
> > > to make
> > > configuration changes to get the same metrics Agree it
> > > should be
> > > off by default.
> > >
> > > "ceph dashboard feature disable dashboard" works to put
> > the old
> > > dashboard back.  Thanks.
> > >
> > > On 2023-10-30 00:09, Nizamudeen A wrote:
> > > > Hi Matthew,
> > > >
> > > > Is the prometheus configured in the cluster? And also the
> > > > PROMETHUEUS_API_URL is set? You can set it manually by ceph
> > > dashboard
> > > > set-prometheus-api-url .
> > > >
> > > > You can switch to the old Dashboard by switching the feature
> > > toggle in the
> > > > dashboard. `ceph dashboard feature disable dashboard` and
> > > reloading the
> > > > page. Probably this should have been disabled by default.
> > > >
> > > > Regards,
> > > > Nizam
> > > >
> > > > On Sun, Oct 29, 2023, 23:04 Matthew
> > Darwin wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I see17.2.7 quincy is published as debian-bullseye
> > packages.
> > > So I
> > > >> tried