[ceph-users] Re: use of db_slots in DriveGroup specification?

2024-07-11 Thread Eugen Block

Hi,

apparently, db_slots is still not implemented. I just tried it on a  
test cluster with 18.2.2:


# ceph orch apply -i osd-slots.yaml --dry-run
Error EINVAL: Failed to validate OSD spec "osd-hdd-ssd.db_devices":  
Filtering for `db_slots` is not supported


If it were, I would also be interested in whether it could be used in
combination with block_db_size so as not to consume the entire disk. We
haven't had too wild configuration requirements yet; I usually stick
to block_db_size (without limit) if there's a requirement not to use
the entire DB disk. As you already stated, in that case the
orchestrator creates one DB volume per HDD with the specified size,
leaving free space on the DB device.
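
For illustration, the kind of spec I mean looks roughly like this (the
filters and the size are just placeholders, adjust them to the actual
hardware):

service_type: osd
service_id: osd-hdd-ssd
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
  block_db_size: 64G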

Maybe that's one of the reasons that db_slots still hasn't been implemented?

Regards,
Eugen

Zitat von Robert Sander :


Hi,

what is the purpose of the db_slots attribute in a DriveGroup specification?

My interpretation of the documentation is that I can define how many  
OSDs use one db device.


https://docs.ceph.com/en/reef/cephadm/services/osd/#additional-options

"db_slots - How many OSDs per DB device"

The default for the cephadm orchestrator is to create as many DB  
volumes on a DB device as needed for the number of OSDs.



In a scenario where there are empty slots for HDDs and an existing
DB device should not be used fully, "db_slots" could be used.


But even if db_slots is in an OSD service spec the orchestrator will  
only create as many DB volumes as there are HDDs currently available.
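
For reference, the kind of spec I have in mind looks roughly like this
(the filters are just examples):

service_type: osd
service_id: osd-hdd-ssd
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
  db_slots: 4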



There is a discussion from 2021 where the use of "block_db_size" and  
"limit" is suggested:  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/6EVOYOHS3BTTNLKBRGLPTZ76HPNLP6FC/#6EVOYOHS3BTTNLKBRGLPTZ76HPNLP6FC


Shouldn't db_slots make that easier?

Is this a bug in the orchestrator?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Use of db_slots in DriveGroup specification?

2024-07-11 Thread Robert Sander

Hi,

On 7/11/24 09:01, Eugen Block wrote:

apparently, db_slots is still not implemented. I just tried it on a test 
cluster with 18.2.2:


I am thinking about a PR to correct the documentation.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm for Ubuntu 24.04

2024-07-11 Thread Malte Stroem

Hello Stefan,

have a look:

https://docs.ceph.com/en/latest/cephadm/install/#curl-based-installation

Just download cephadm. It will work on any distro.

You do not need any ceph package like ceph-common for example to run 
cephadm.


Best,
Malte

On 11.07.24 08:17, Stefan Kooman wrote:

Hi,

Is it possible to only build "cephadm", so not the other ceph packages /
daemons? Or can we think about a way to have cephadm packages built for
all supported mainstream Linux releases during the supported lifetime of
a Ceph release, i.e. Debian, Ubuntu LTS, CentOS Stream?


I went ahead and upgraded one of our (pre-prod) Ceph nodes to Ubuntu
24.04 LTS. As it is managed by cephadm and running containers this
should be no problem, right? Right. Except that there is no
"cephadm" package (or any other Ceph package) for Ubuntu 24.04 in the Ceph
repository. So, then just build those packages yourself, right? Right.
After some Python pyyaml 6.0.0 / cython issues in
monitoring/ceph-mixin/requirements-alerts.txt and requirements-lint.txt
(fixed by bumping pyyaml to 6.0.1) and a hard Python 3.10 requirement (I
bumped that to 3.12), I ran into an issue with "arrow_ext" breaking the
build process, which I'm currently trying to figure out ...


I know supporting Ceph packages for a bunch of distros is a lot of work, 
but having _just_ cephadm available on a wider range of platforms, would 
really help here. It helps avoid upgrading both Ceph and the OS at the 
same time. This allows the use of the latest (kernel) OS improvements, 
all while not touching Ceph.


Thanks,

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm for Ubuntu 24.04

2024-07-11 Thread Stefan Kooman

On 11-07-2024 09:55, Malte Stroem wrote:

Hello Stefan,

have a look:

https://docs.ceph.com/en/latest/cephadm/install/#curl-based-installation


Yeah, I have read that part.



Just download cephadm. It will work on any distro.


curl --silent --remote-name --location 
https://download.ceph.com/rpm-18.2.1/el9/noarch/cephadm


./cephadm gather-facts
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "/root/./cephadm/__main__.py", line 10700, in 
  File "/root/./cephadm/__main__.py", line 10688, in main
  File "/root/./cephadm/__main__.py", line 9772, in command_gather_facts
  File "/root/./cephadm/__main__.py", line 9762, in dump
  File "/root/./cephadm/__main__.py", line 9677, in kernel_security
  File "/root/./cephadm/__main__.py", line 9658, in _fetch_apparmor
ValueError: too many values to unpack (expected 2)

The version that works for 22.04 also doesn't work on 24.04 and fails in 
a similar way.


It just isn't that simple I'm afraid. But it _should_ be that simple, 
agreed.


You do not need any ceph package like ceph-common for example to run 
cephadm.


Indeed, which is pretty neat. If cephadm works you can basically do all 
ceph related stuff in a container. Getting cephadm to run on more 
platforms is the last mile.


Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Changing ip addr

2024-07-11 Thread Albert Shih
Hi everyone

I just changed the subnet of my cluster.

The CephFS part seems to be working well.

But I got many errors like:

Jul 11 10:08:35 hostname ceph-*** ts=2024-07-11T08:08:35.364Z 
caller=refresh.go:99 level=error component="discovery manager notify" 
discovery=http config=config-0 msg="Unable to refresh target groups" err="Get 
\"http://OLD_IP:8765/sd/prometheus/sd-config?service=alertmanager\": context 
deadline exceeded (Client.Timeout exceeded while awaiting headers)"

I didn't find which service does the discovery and was unable to change the
URL. 

I tried a

  ceph config-key dump |grep OLD_IP

and didn't find it. 

So where is this information stored?

Regards

JAS
-- 
Albert SHIH 🦫 🐸
Observatoire de Paris
France
Heure locale/Local time:
jeu. 11 juil. 2024 10:22:58 CEST
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Changing ip addr

2024-07-11 Thread Albert Shih
Le 11/07/2024 à 10:27:09+0200, Albert Shih a écrit
> Hi everyone
> 
> I just change the subnet of my cluster. 
> 
> The cephfs part seem to working well. 
> 
> But I got many error with 
> 
> Jul 11 10:08:35 hostname ceph-*** ts=2024-07-11T08:08:35.364Z 
> caller=refresh.go:99 level=error component="discovery manager notify" 
> discovery=http config=config-0 msg="Unable to refresh target groups" err="Get 
> \"http://OLD_IP:8765/sd/prometheus/sd-config?service=alertmanager\": context 
> deadline exceeded (Client.Timeout exceeded while awaiting headers)"
> 
> I didn't find which service does the discovery and was unable to change the
> URL. 
> 
> I try a 
> 
>   ceph config-key dump |grep OLD_IP
> 
> and didn't find it. 
> 
> So where this information are store ? 

Forgot to say: I'm running Reef 18.2.2.

Regards
-- 
Albert SHIH 🦫 🐸
Observatoire de Paris
France
Heure locale/Local time:
jeu. 11 juil. 2024 10:30:41 CEST
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Multi site sync details

2024-07-11 Thread Huseyin Cotuk
Hello Cephers,

I am wondering whether it is possible to get the number of objects that are not
synced yet in a multi-site radosgw configuration. "radosgw-admin sync status"
gives the number of shards that are behind. Similarly, "radosgw-admin bucket
sync status --bucket {bucket_name}" also gives the status of the shards
belonging to the given bucket.

I just want to get the number of objects that are not synced yet. The only way
I found is to get bucket stats from each zone and compare the number of
objects. While one of the zones gives the following output (only the related part shown),

"usage": {
"rgw.main": {
"size": 20896350288253,
"size_actual": 22604324229120,
"size_utilized": 20896350288253,
"size_kb": 20406592079,
"size_kb_actual": 22074535380,
"size_kb_utilized": 20406592079,
"num_objects": 749099512
}
},

the other zone gives the output below. 

"usage": {
"rgw.main": {
"size": 20896350967086,
"size_actual": 22604325040128,
"size_utilized": 20896350967086,
"size_kb": 20406592742,
"size_kb_actual": 22074536172,
"size_kb_utilized": 20406592742,
"num_objects": 749099571
}
},

I look at the difference in the number of objects between the two zones (749099571 -
749099512 = 59 objects). But I am not sure whether these stats report the number
of objects that are already synced and belong to each zone's own copy. The
metadata of the multisite radosgw is held on the master zone. Can anybody
clarify whether the object count in the bucket stats output belongs to each
zone's own bucket or to the multi-site shared metadata? I also suspect that the
difference between the bucket stats output from each zone may simply be a
result of the commands' execution times.
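
For reference, the comparison I do boils down to something like this (a
rough sketch, run against each zone separately; jq assumed):

# on a node in the master zone
radosgw-admin bucket stats --bucket={bucket_name} | jq '.usage."rgw.main".num_objects'
# on a node in the secondary zone
radosgw-admin bucket stats --bucket={bucket_name} | jq '.usage."rgw.main".num_objects'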

Any clue will be appreciated. 

Best regards,
Huseyin Cotuk
hco...@gmail.com




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Changing ip addr

2024-07-11 Thread Eugen Block
Do you see it in 'ceph mgr services'? You might need to change the  
prometheus config as well and redeploy.
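
Something like the following should regenerate the monitoring configs
(off the top of my head, untested):

ceph orch reconfig prometheus
# or, if that is not enough, redeploy the daemon entirely:
ceph orch redeploy prometheus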



Zitat von Albert Shih :


Hi everyone

I just change the subnet of my cluster.

The cephfs part seem to working well.

But I got many error with

Jul 11 10:08:35 hostname ceph-***  
ts=2024-07-11T08:08:35.364Z caller=refresh.go:99 level=error  
component="discovery manager notify" discovery=http config=config-0  
msg="Unable to refresh target groups" err="Get  
\"http://OLD_IP:8765/sd/prometheus/sd-config?service=alertmanager\":  
context deadline exceeded (Client.Timeout exceeded while awaiting  
headers)"


I didn't find which service does the discovery and was unable to change the
URL.

I try a

  ceph config-key dump |grep OLD_IP

and didn't find it.

So where this information are store ?

Regards

JAS
--
Albert SHIH 🦫 🐸
Observatoire de Paris
France
Heure locale/Local time:
jeu. 11 juil. 2024 10:22:58 CEST
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Changing ip addr

2024-07-11 Thread Albert Shih
Le 11/07/2024 à 08:34:21+, Eugen Block a écrit

Hi, 

Sorry, I missed sending the answer to the list.

> Do you see it in 'ceph mgr services'? You might need to change the

Yes I did 

root@cthulhu1:/etc# ceph mgr services
{
"dashboard": "https://NEW_SUBNET.189.35:8443/";,
"prometheus": "http://NEW_SUBNET.189.35:9283/";
}
root@cthulhu1:/etc#

> prometheus config as well and redeploy.

I'm not sure, but I don't think the problem is on the «server side» (server
= prometheus), because every daemon is listening on every IP:


root@cthulhu1:/etc# ceph orch ps|grep pro
prometheus.cthulhu1  cthulhu1  *:9095  running (22h)  9m ago  5M  58.8M  -  2.43.0  a07b618ecd1d  77abd8ebe0c2
prometheus.cthulhu2  cthulhu2  *:9095  running (22h)  9m ago  5M  55.6M  -  2.43.0  a07b618ecd1d  a1346925807a
prometheus.cthulhu3  cthulhu3  *:9095  running (22h)  9m ago  5M  61.5M  -  2.43.0  a07b618ecd1d  2fb558b9d300
prometheus.cthulhu4  cthulhu4  *:9095  running (22h)  9m ago  5M  56.3M  -  2.43.0  a07b618ecd1d  119ada9ba6de
prometheus.cthulhu5  cthulhu5  *:9095  running (22h)  9m ago  5M  60.6M  -  2.43.0  a07b618ecd1d  c355ed607fae

If I check with ss, same result.

I'm guessing the old IP address is still stored somewhere on the client
side... but I can't find where, and of course I'm unable to change it.

Regards
-- 
Albert SHIH 🦫 🐸
Observatoire de Paris
France
Heure locale/Local time:
jeu. 11 juil. 2024 10:51:25 CEST
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Changing ip addr

2024-07-11 Thread Eugen Block

And how about the prometheus.yml?

/var/lib/ceph/{fsid}/prometheus.{node}/etc/prometheus/prometheus.yml

It contains an IP address as well:

alerting:
  alertmanagers:
- scheme: http
  http_sd_configs:
- url: http://{IP}:8765/sd/prometheus/sd-config?service=alertmanager

I misread the line, maybe you need to update alertmanager instead of  
prometheus.



Zitat von Albert Shih :


Le 11/07/2024 à 08:34:21+, Eugen Block a écrit

Hi,

Sorry I miss the answer to the list.


Do you see it in 'ceph mgr services'? You might need to change the


Yes I did

root@cthulhu1:/etc# ceph mgr services
{
"dashboard": "https://NEW_SUBNET.189.35:8443/";,
"prometheus": "http://NEW_SUBNET.189.35:9283/";
}
root@cthulhu1:/etc#


prometheus config as well and redeploy.


I'm not sure, but I don't think the problem is on the «server side» (server
= prometheus) because I got every daemon listen on every IP


root@cthulhu1:/etc# ceph orch ps|grep pro
prometheus.cthulhu1  cthulhu1  *:9095  running (22h)  9m ago  5M  58.8M  -  2.43.0  a07b618ecd1d  77abd8ebe0c2
prometheus.cthulhu2  cthulhu2  *:9095  running (22h)  9m ago  5M  55.6M  -  2.43.0  a07b618ecd1d  a1346925807a
prometheus.cthulhu3  cthulhu3  *:9095  running (22h)  9m ago  5M  61.5M  -  2.43.0  a07b618ecd1d  2fb558b9d300
prometheus.cthulhu4  cthulhu4  *:9095  running (22h)  9m ago  5M  56.3M  -  2.43.0  a07b618ecd1d  119ada9ba6de
prometheus.cthulhu5  cthulhu5  *:9095  running (22h)  9m ago  5M  60.6M  -  2.43.0  a07b618ecd1d  c355ed607fae


If I check with ss same result.

I'm guessing is on the client side somewhere they are the old IP
address...but I can't find where and of course I'm unable to change it.

Regards
--
Albert SHIH 🦫 🐸
Observatoire de Paris
France
Heure locale/Local time:
jeu. 11 juil. 2024 10:51:25 CEST



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Changing ip addr

2024-07-11 Thread Albert Shih
Le 11/07/2024 à 09:00:02+, Eugen Block a écrit
Hi, 

Thanks, but ... nope.

> And how about the prometheus.yml?
> 
> /var/lib/ceph/{fsid}/prometheus.{node}/etc/prometheus/prometheus.yml
> 
> It contains an IP address as well:
> 
> alerting:
>   alertmanagers:
> - scheme: http
>   http_sd_configs:
> - url: http://{IP}:8765/sd/prometheus/sd-config?service=alertmanager

Nothing... even a

  cd /var/lib/ceph
  rg -l OLD_SUBNET

got nothing.

But no worries... the win98 method (reboot the cluster) solved the issue. I'm
guessing something/some service needed a restart.

> 
> I misread the line, maybe you need to update alertmanager instead of
> prometheus.

Nope, that didn't help either.

Thanks. 

Regards
-- 
Albert SHIH 🦫 🐸
Observatoire de Paris
France
Heure locale/Local time:
jeu. 11 juil. 2024 11:23:38 CEST
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Large omap in index pool even if properly sharded and not "OVER"

2024-07-11 Thread Szabo, Istvan (Agoda)
Hi Casey,

Regarding the multisite part: before we resharded the bucket I had completely
disabled sync and removed the bucket from it (disabled the sync, removed the
pipe and everything, step by step, and finally updated the period), so I'm
fairly sure this bucket is not syncing and I think we can focus on the master
zone only. After I check the bilogs on the secondary site, I can try to trim there.
The 2nd zone shouldn't cause large omap in the master zone if bucket sync is
already disabled, I guess.

The 2nd point you suggested is more interesting. When you say shard id, taking
this entry as an example
(.dir.9213182a-14ba-48ad-bde9-289a1c0c0de8.2479481907.1.151), the 151 is the
shard you mentioned, which in the CLI means, I guess:

radosgw-admin bilog list --bucket=bucketname --shard-id=151

(We use Octopus.)

And then we would have to go through this huge list with the app owner, object
by object, to identify what is no longer there and no longer used? Not easy for
sure 😄 Is there any proper way to do this? We've run bucket check with --fix
but that one didn't help.

Thank you


From: Casey Bodley 
Sent: Wednesday, July 10, 2024 8:24 PM
To: Szabo, Istvan (Agoda) 
Cc: Eugen Block ; Ceph Users 
Subject: Re: [ceph-users] Re: Large omap in index pool even if properly sharded 
and not "OVER"



On Tue, Jul 9, 2024 at 12:41 PM Szabo, Istvan (Agoda)
 wrote:
>
> Hi Casey,
>
> 1.
> Regarding versioning, the user doesn't use verisoning it if I'm not mistaken:
> https://gist.githubusercontent.com/Badb0yBadb0y/d80c1bdb8609088970413969826d2b7d/raw/baee46865178fff454c224040525b55b54e27218/gistfile1.txt
>
> 2.
> Regarding multiparts, if it would have multipart thrash, it would be listed 
> here:
> https://gist.githubusercontent.com/Badb0yBadb0y/d80c1bdb8609088970413969826d2b7d/raw/baee46865178fff454c224040525b55b54e27218/gistfile1.txt
> as a rgw.multimeta under the usage, right?
>
> 3.
> Regarding the multisite idea, this bucket has been a multisite bucket last 
> year but we had to reshard (accepting to loose the replica on the 2nd site 
> and just keep it in the master site) and that time as expected it has 
> disappeared completely from the 2nd site (I guess the 40TB thrash still there 
> but can't really find it how to clean 🙁 ). Now it is a single site bucket.
> Also it is the index pool, multisite logs should go to the rgw.log pool 
> shouldn't it?

some replication logs are in the log pool, but the per-object logs are
stored in the bucket index objects. you can inspect these with
`radosgw-admin bilog list --bucket=X`. by default, that will only list
--max-entries=1000. you can add --shard-id=Y to look at specific
'large omap' objects

even if your single-site bucket doesn't exist on the secondary zone,
changes on the primary zone are probably still generating these bilog
entries. you would need to do something like `radosgw-admin bucket
sync disable --bucket=X` to make it stop. because you don't expect
these changes to replicate, it's safe to delete any of this bucket's
bilog entries with `radosgw-admin bilog trim --end-marker 9
--bucket=X`. depending on ceph version, you may need to run this trim
command in a loop until the `bilog list` output is empty
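
Something along these lines, for example (a rough, untested sketch; it
assumes jq is available and that you substitute the real bucket name):

BUCKET=bucketname
until radosgw-admin bilog list --bucket="$BUCKET" --max-entries=1 | jq -e 'length == 0' >/dev/null; do
  radosgw-admin bilog trim --bucket="$BUCKET" --end-marker 9
done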

radosgw does eventually trim bilogs in the background after they're
processed, but the secondary zone isn't processing them in this case

>
> Thank you
>
>
> 
> From: Casey Bodley 
> Sent: Tuesday, July 9, 2024 10:39 PM
> To: Szabo, Istvan (Agoda) 
> Cc: Eugen Block ; ceph-users@ceph.io 
> Subject: Re: [ceph-users] Re: Large omap in index pool even if properly 
> sharded and not "OVER"
>
>
> in general, these omap entries should be evenly spread over the
> bucket's index shard objects. but there are two features that may
> cause entries to clump on a single shard:
>
> 1. for versioned buckets, multiple versions of the same object name
> map to the same index shard. this can become an issue if an
> application is repeatedly overwriting an object without cleaning up
> old versions. lifecycle rules can help to manage these noncurrent
> versions
>
> 2. during a multipart upload, all of the parts are tracked on the same
> index shard as the final object name. if applications are leaving a
> lot of incomplete multipart uploads behind (especially if they target
> the same object name) this can lead to similar clumping. the S3 api
> has operations to list and abort incomplete multipart uploads, along
> with lifecycle rules to automate their cleanup
>
> separately, multisite clusters use these same index shards to store
> replication logs. if sync gets far enough behind, these log entries
> can also lead to large omap warnings
>
> On Tue, Jul 9, 2024 at 10:25 AM Szabo, Istvan (Agoda)
>  w

[ceph-users] Re: cephadm for Ubuntu 24.04

2024-07-11 Thread John Mulligan
On Thursday, July 11, 2024 4:22:28 AM EDT Stefan Kooman wrote:
> On 11-07-2024 09:55, Malte Stroem wrote:
> > Hello Stefan,
> > 
> > have a look:
> > 
> > https://docs.ceph.com/en/latest/cephadm/install/#curl-based-installation
> 
> Yeah, I have read that part.
> 
> > Just download cephadm. It will work on any distro.
> 
> curl --silent --remote-name --location
> https://download.ceph.com/rpm-18.2.1/el9/noarch/cephadm
> 
> ./cephadm gather-facts
> Traceback (most recent call last):
>File "", line 198, in _run_module_as_main
>File "", line 88, in _run_code
>File "/root/./cephadm/__main__.py", line 10700, in 
>File "/root/./cephadm/__main__.py", line 10688, in main
>File "/root/./cephadm/__main__.py", line 9772, in command_gather_facts
>File "/root/./cephadm/__main__.py", line 9762, in dump
>File "/root/./cephadm/__main__.py", line 9677, in kernel_security
>File "/root/./cephadm/__main__.py", line 9658, in _fetch_apparmor
> ValueError: too many values to unpack (expected 2)
> 
> The version that works for 22.04 also doesn't work on 24.04 and fails in
> a similar way.
> 
> It just isn't that simple I'm afraid. But it _should_ be that simple,
> agreed.
> 
> > You do not need any ceph package like ceph-common for example to run
> > cephadm.
> 
> Indeed, which is pretty neat. If cephadm works you can basically do all
> ceph related stuff in a container. Getting cephadm to run on more
> platforms is the last mile.
> 
> Gr. Stefan


Hi there,
The traceback you are hitting is a bug - there's a fix already applied to main: 
 
https://github.com/ceph/ceph/pull/57955
I'll ask to have backport PRs get generated. I'm personally pretty clueless as 
to how to process backports.

The bug is independent of how cephadm is packaged FWIW. Even if you had a 
package of just cephadm built for ubuntu 24.04 it would have still hit the 
problem. The code simply didn't understand all the possible syntax that can 
appear in the apparmor profiles and newer versions of ubuntu appear to use 
apparmor profiles with spaces in the name more commonly than older versions.


The current cephadm build process creates a "zipapp" out of a few select 
python packages and the cephadm source code. If you really want to you could 
wrap that, and just that, in a system package that would not need many 
dependencies. However, this would need to be a bespoke package as the packages 
created by the ceph project include "everything" ceph builds. But the build 
script for cephadm (./src/cephadm/build.py) doesn't need any of those other 
binaries to be built to work. - In case you were still curious about that and 
want to tinker.
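
For example, a very rough sketch of that kind of wrapping, using fpm purely
as an illustration (package name and version string are just examples, not
an official recipe):

# build the zipapp, then wrap just that one file in a .deb
./src/cephadm/build.py /tmp/cephadm
fpm -s dir -t deb -n cephadm-standalone -v 18.2.1 /tmp/cephadm=/usr/sbin/cephadm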


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [RFC][UADK integration][Acceleration of zlib compressor]

2024-07-11 Thread Rongqi Sun
Hi Ceph community,

UADK is an open source accelerator framework. Its kernel support part is
UACCE, which has been merged in the kernel for several years, targeting to
provide Shared Virtual Addressing (SVA) between accelerators and processes.
UADK provides users with a unified programming interface to efficiently
harness the capabilities of hardware accelerators, and it furnishes users
with fundamental library and driver support. UADK can currently drive
hardware accelerator engines (e.g. Kunpeng KAE) as well as Arm SVE and the
CPU Crypto Extension. UADK is hosted by Linaro.

UADK already supports compression and encryption in other communities such
as OpenSSL, DPDK and SPDK, so now we would like to bring it to Ceph for
acceleration of the zlib compressor. As a first step, we can see that:
   1. save almost 50% cpu usage compared with no-isal compression in RBD 4M
   workload
   2. save almost 40% cpu usage compared with no-isal compression in RGW
   put op (4M) workload
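
For anyone who would like to try or benchmark this, the zlib path can be
exercised with the usual BlueStore pool-level compression settings (the
pool name below is just an example; RGW also has a per-placement
compression option):

ceph osd pool set testpool compression_algorithm zlib
ceph osd pool set testpool compression_mode aggressive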

The PR  is under review, welcome
any comments or reviews.

Have a nice day~

Rongqi Sun
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Repurposing some Dell R750s for Ceph

2024-07-11 Thread Drew Weaver
Hello,

We would like to repurpose some Dell PowerEdge R750s for a Ceph cluster.

Currently the servers have one H755N RAID controller for each 8 drives. (2 
total)

I have been asking their technical support what needs to happen in order for us 
to just rip out those raid controllers and cable the backplane directly to the 
motherboard/PCIe lanes and they haven't been super enthusiastic about helping 
me. I get it just buy another 50 servers, right? No big deal.

I have the diagrams that show how each configuration should be connected, I 
think I just need the right cable(s), my question is has anyone done this work 
before and was it worth it?

Also bonus if anyone has an R750 that has the drives directly connected to the 
backplane and can find the part number of the cable that connects the backplane 
to the motherboard I would greatly appreciate that part number. My sales guys 
are "having a hard time locating it".

Thanks,
-Drew

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Frank Schilder
Hi Drew,

as far as I know Dell's drive bays for RAID controllers are not the same as the 
drive bays for CPU attached disks. In particular, I don't think they have that 
config for 3.5" drive bays and your description sounds a lot like that's what 
you have. Are you trying to go from 16x2.5" HDD to something like 24xNVMe?

Maybe you could provide a bit more information here, like (links to) the wiring 
diagrams you mentioned? From the description I cannot entirely deduce what 
exactly you have and where you want to go to.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Drew Weaver 
Sent: Thursday, July 11, 2024 3:16 PM
To: 'ceph-users@ceph.io'
Subject: [ceph-users] Repurposing some Dell R750s for Ceph

Hello,

We would like to repurpose some Dell PowerEdge R750s for a Ceph cluster.

Currently the servers have one H755N RAID controller for each 8 drives. (2 
total)

I have been asking their technical support what needs to happen in order for us 
to just rip out those raid controllers and cable the backplane directly to the 
motherboard/PCIe lanes and they haven't been super enthusiastic about helping 
me. I get it just buy another 50 servers, right? No big deal.

I have the diagrams that show how each configuration should be connected, I 
think I just need the right cable(s), my question is has anyone done this work 
before and was it worth it?

Also bonus if anyone has an R750 that has the drives directly connected to the 
backplane and can find the part number of the cable that connects the backplane 
to the motherboard I would greatly appreciate that part number. My sales guys 
are "having a hard time locating it".

Thanks,
-Drew

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread John Jasen
retrofitting the guts of a Dell PE R7xx server is not straightforward. You
could be looking into replacing the motherboard, the backplane, and so
forth.

You can probably convert the H755N card to present the drives to the OS, so
you can use them for Ceph. This may be AHCI mode, pass-through mode,
non-RAID device, or some other magic words in the raid configuration
utility.

This should be in the raid documentation, somewhere.



On Thu, Jul 11, 2024 at 9:17 AM Drew Weaver  wrote:

> Hello,
>
> We would like to repurpose some Dell PowerEdge R750s for a Ceph cluster.
>
> Currently the servers have one H755N RAID controller for each 8 drives. (2
> total)
>
> I have been asking their technical support what needs to happen in order
> for us to just rip out those raid controllers and cable the backplane
> directly to the motherboard/PCIe lanes and they haven't been super
> enthusiastic about helping me. I get it just buy another 50 servers, right?
> No big deal.
>
> I have the diagrams that show how each configuration should be connected,
> I think I just need the right cable(s), my question is has anyone done this
> work before and was it worth it?
>
> Also bonus if anyone has an R750 that has the drives directly connected to
> the backplane and can find the part number of the cable that connects the
> backplane to the motherboard I would greatly appreciate that part number.
> My sales guys are "having a hard time locating it".
>
> Thanks,
> -Drew
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: AssumeRoleWithWebIdentity in RGW with Azure AD

2024-07-11 Thread Ryan Rempel
Thanks!

I took a crack at it myself, and have some work-in-progress here:

https://github.com/cmu-rgrempel/ceph/pull/1

Feel free to use any of that if you like it. It's working for me, but I've only 
tested it with Azure AD – I haven't tested the cases that it used to work for. 
(I believe it doesn't break them, but haven't tested).

--

Ryan Rempel



From: Pritha Srivastava 
Sent: Monday, July 8, 2024 10:38 PM

Hi Ryan,

This appears to be a known issue and is tracked here: 
https://tracker.ceph.com/issues/54562. There is a workaround mentioned in the 
tracker that has worked and you can try that. Otherwise, I will be working on 
this 'invalid padding' problem very soon.

Thanks,
Pritha

On Tue, Jul 9, 2024 at 1:16 AM Ryan Rempel <rgrem...@cmu.ca> wrote:
I'm trying to setup the OIDC provider for RGW so that I can have roles that can 
be assumed by people logging into their regular Azure AD identities. The client 
I'm planning to use is Cyberduck – it seems like one of the few GUI S3 clients 
that manages the OIDC login process in a way that could work for relatively 
naive users.

I've gotten a fair ways down the road. I've been able to configure Cyberduck so 
that it performs the login with Azure AD, gets an identity token, and then 
sends it to Ceph to engage with the AssumeRoleWithWebIdentity process. However, 
I then get an error, which shows up in the Ceph rgw logs like this:

2024-07-08T17:18:09.749+ 7fb2d7845700  0 req 15967124976712370684 
1.284013867s sts:assume_role_web_identity Signature validation failed: evp 
verify final failed: 0 error:0407008A:rsa 
routines:RSA_padding_check_PKCS1_type_1:invalid padding

I turned the logging for rgw up to 20 to see if I could follow along to see how 
much of the process succeeds and learn more about what fails. I can then see 
logging messages from this file in the source code:

https://github.com/ceph/ceph/blob/08d7ff952d78d1bbda04d5ff7e3db1e733301072/src/rgw/rgw_rest_sts.cc

We get to WebTokenEngine::get_from_jwt, and it logs the JWT payload in a way 
that seems to be as expected. The logs then indicate that a request is sent to 
the /.well-known/openid-configuration endpoint that appears to be appropriate 
for the issuer of the JWT. The logs eventually indicate what looks like a 
successful and appropriate response to that. The logs then show that a request 
is sent to the jwks_uri that is indicated in the openid-configuration document. 
The response to that is logged, and it appears to be appropriate.

We then get some logging starting with "Certificate is", so it looks like we're 
getting as far as WebTokenEngine::validate_signature. So, several things appear 
to have happened successfully – we've loading the OIDC provider that 
corresponds to the iss, and we've found a client ID that corresponds to what I 
registered when I configured things. (This is why I say we appear to be a fair 
ways down the road – a lot of this is working).

It looks as though what's happening in the code now is that it's iterating 
through the certificates given in the jwks_uri content. There are 6 
certificates listed, but the code only gets as far as the first one. Looking at 
the code, what appears to be happening is that, among the various certificates 
in the jwks_uri, it's finding the first one which matches a thumbprint 
registered with Ceph (that is, which I registered with Ceph). This must be 
succeeding (for the first certificate), because the "Signature validation 
failed" logging comes later. So, the code does verify that the thumbprint of 
the first certificate matches one of the thumbprints I registered with Ceph for 
this OIDC provider.

We then get to a part of the code where it tries to verify the JWT using the 
certificate, with jwt::verify. Given what gets logged ("Signature validation 
failed: ..."), this must be throwing an exception.

The thing I find surprising about this is that there really isn't any reason to 
think that the first certificate listed in the jwks_uri content is going to be 
the certificate used to sign the JWT. If I understand JWT correctly, it's 
appropriate to sign the JWT with any of the certificates listed in the jwks_uri 
content. Furthermore, the JWT header includes a reference to the kid, so it's 
possible for Ceph to know exactly which certificate the JWT purports to be 
signed by. And, Ceph knows that there might be multiple thumbprints, because we 
can register 5. So, the logic of trying the first valid certificate in x5c and 
then stopping if it fails seems broken, actually.

I suppose what I could do as a workaround is try to figure out whether Azure AD 
is consistently using the same kid to sign the JWTs for me, and then only 
register that thumbprint with Ceph. Then, Ceph would actually choose the 
correct certificate (as the others wouldn't match a thumbprint I registered). I 
may try this – in part, just to verify what I think is happening. But it would 
be awfully fragile – 

[ceph-users] Re: cephadm for Ubuntu 24.04

2024-07-11 Thread Stefan Kooman

On 11-07-2024 14:20, John Mulligan wrote:

On Thursday, July 11, 2024 4:22:28 AM EDT Stefan Kooman wrote:

On 11-07-2024 09:55, Malte Stroem wrote:

Hello Stefan,

have a look:

https://docs.ceph.com/en/latest/cephadm/install/#curl-based-installation


Yeah, I have read that part.


Just download cephadm. It will work on any distro.


curl --silent --remote-name --location
https://download.ceph.com/rpm-18.2.1/el9/noarch/cephadm

./cephadm gather-facts
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in _run_code
File "/root/./cephadm/__main__.py", line 10700, in 
File "/root/./cephadm/__main__.py", line 10688, in main
File "/root/./cephadm/__main__.py", line 9772, in command_gather_facts
File "/root/./cephadm/__main__.py", line 9762, in dump
File "/root/./cephadm/__main__.py", line 9677, in kernel_security
File "/root/./cephadm/__main__.py", line 9658, in _fetch_apparmor
ValueError: too many values to unpack (expected 2)

The version that works for 22.04 also doesn't work on 24.04 and fails in
a similar way.

It just isn't that simple I'm afraid. But it _should_ be that simple,
agreed.


You do not need any ceph package like ceph-common for example to run
cephadm.


Indeed, which is pretty neat. If cephadm works you can basically do all
ceph related stuff in a container. Getting cephadm to run on more
platforms is the last mile.

Gr. Stefan



Hi there,
The traceback you are hitting is a bug - there's a fix already applied to main:
https://github.com/ceph/ceph/pull/57955


Thanks, I hadn't found that one.


I'll ask to have backport PRs get generated. I'm personally pretty clueless as
to how to process backports.


Ah, yeah, that's on my todo list as well (work on backports). AFAIK you 
need to cherry pick from main branch when processing PRs.




The bug is independent of how cephadm is packaged FWIW. Even if you had a
package of just cephadm built for ubuntu 24.04 it would have still hit the
problem. The code simply didn't understand all the possible syntax that can
appear in the apparmor profiles and newer versions of ubuntu appear to use
apparmor profiles with spaces in the name more commonly than older versions.


Ah, that explains ...




The current cephadm build process creates a "zipapp" out of a few select
python packages and the cephadm source code. If you really want to you could
wrap that, and just that, in a system package what would not need many
dependencies. However, this would need to be a bespoke package as the packages
created by the ceph project include "everything" ceph builds. But the build
script for cephadm (./src/cephadm/build.py) doesn't need any of those other
binaries to be built to work. - In case you were still curious about that and
want to tinker.


You betcha. Okay, that worked perfectly. I backported the fixes from 
#57955 in cephadm.py (apparently there was a code refactor and 
cephadmlib does not exist in 18.2.1). This worked for me:


./src/cephadm/build.py --python /usr/bin/python3 /tmp/cephadm 
-SCEPH_GIT_VER=$(git rev-parse HEAD) -SCEPH_GIT_NICE_VER=$(git describe) 
-SCEPH_RELEASE_NAME=reef -SCEPH_RELEASE_TYPE=stable


This works fine on 24.04.

Getting this into a CI/CD pipeline whenever we need a new cephadm version 
and building a proper cephadm package we can install (instead of 
downloading a binary) would certainly be an option for us. Not sure if 
we are the only ones that want cephadm installed through packages 
though. Otherwise it would warrant an upstream cephadm package IMHO.


Thanks a lot!

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Robin H. Johnson
On Thu, Jul 11, 2024 at 01:16:22PM +, Drew Weaver wrote:
> Hello,
> 
> We would like to repurpose some Dell PowerEdge R750s for a Ceph cluster.
> 
> Currently the servers have one H755N RAID controller for each 8 drives. (2 
> total)
The N variant of H755N specifically? So you have 16 NVME drives in each
server?

> I have been asking their technical support what needs to happen in
> order for us to just rip out those raid controllers and cable the
> backplane directly to the motherboard/PCIe lanes and they haven't been
> super enthusiastic about helping me. I get it just buy another 50
> servers, right? No big deal.
I don't think the motherboard has enough PCIe lanes to natively connect
all the drives: the RAID controller effectively functioned as an
expander, so you needed fewer PCIe lanes on the motherboard.

As the quickest way forward: look for passthrough / single-disk / RAID0
options, in that order, in the controller management tools (perccli etc).

I haven't used the N variant at all, and since it's NVME presented as
SCSI/SAS, I don't want to trust the solution of reflashing the
controller for IT (passthrough) mode.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm for Ubuntu 24.04

2024-07-11 Thread Konstantin Shalygin


> On 11 Jul 2024, at 15:20, John Mulligan  wrote:
> 
> I'll ask to have backport PRs get generated. I'm personally pretty clueless 
> as 
> to how to process backports.

The how-to described in this doc [1]

> Thanks, I hadn't found that one.
Added a backport for the squid release [2]. As far as I understand Ceph in
containers, you need the "latest" cephadm.


Thanks,
k
[1] 
https://github.com/ceph/ceph/blob/main/SubmittingPatches-backports.rst#opening-a-backport-pr
[2] https://github.com/ceph/ceph/pull/58542
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Anthony D'Atri
Agree with everything Robin wrote here.  RAID HBAs FTL.  Even in passthrough 
mode, it’s still an [absurdly expensive] point of failure, but a server in the 
rack is worth two on backorder.

Moreover, I’m told that it is possible to retrofit with cables and possibly an 
AIC mux / expander.

e.g.
https://www.ebay.com/itm/176400760681

Granted, I haven’t done this personally so I can’t speak to the BOM and 
procedure.  For OSD nodes it probably isn’t worth the effort.

Some of the LSI^H^H^H^HPERC HBAs — to my astonishment — don’t have a 
passthrough setting/mode.  This document though implies that this SKU does.



https://www.dell.com/support/manuals/en-ae/poweredge-r7525/perc11_ug/technical-specifications-of-perc-11-cards?guid=guid-aaaf8b59-903f-49c1-8832-f3997d125edf&lang=en-us


You should be able to set individual drives to passthrough:

storcli64 /call /eall /sall set jbod=on

or depending on the SKU and storcli revision, for the whole HBA

storcli64 /call set personality=JBOD

racadm set Storage.Controller.1.RequestedControllerMode HBA
or
racadm set Storage.Controller.1.RequestedControllerMode EnhancedHBA
then
  jobqueue create RAID.Integrated.1-1
  server action power cycle

LSI and Dell have not been particularly consistent with these beasts.

— aad



>> Hello,
>> 
>> We would like to repurpose some Dell PowerEdge R750s for a Ceph cluster.
>> 
>> Currently the servers have one H755N RAID controller for each 8 drives. (2 
>> total)
> The N variant of H755N specifically? So you have 16 NVME drives in each
> server?
> 
>> I have been asking their technical support what needs to happen in
>> order for us to just rip out those raid controllers and cable the
>> backplane directly to the motherboard/PCIe lanes and they haven't been
>> super enthusiastic about helping me. I get it just buy another 50
>> servers, right? No big deal.
> I don't think the motherboard has enough PCIe lanes to natively connect
> all the drives: the RAID controller effectively functioned as a
> expander, so you needed less PCIe lanes on the motherboard.
> 
> As the quickest way forward: look for passthrough / single-disk / RAID0
> options, in that order, in the controller management tools (perccli etc).
> 
> I haven't used the N variant at all, and since it's NVME presented as
> SCSI/SAS, I don't want to trust the solution of reflashing the
> controller for IT (passthrough) mode.
> 
> -- 
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Help with Mirroring

2024-07-11 Thread Dave Hall
Hello.

I would like to use mirroring to facilitate migrating from an existing
Nautilus cluster to a new cluster running Reef.  RIght now I'm looking at
RBD mirroring.  I have studied the RBD Mirroring section of the
documentation, but it is unclear to me which commands need to be issued on
each cluster and, for commands that have both clusters as arguments, when
to specify site-a where vs. site-b.

Another concern:  Both the old and new cluster internally have the default
name 'Ceph' - when I set up the second cluster I saw no obvious reason to
change from the default.  If these will cause a problem with mirroring, is
there a workaround?

In the long run I will also be migrating a bunch of RGW data.  If there are
advantages to using mirroring for this I'd be glad to know.

(BTW, the plan is to gradually decommission the systems from the old
cluster and add them to the new cluster.  In this context, I am looking to
enable and disable mirroring on specific RBD images and RGW buckets as the
client workload is migrated from accessing the old cluster to accessing the
new.
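
My best guess from the docs so far, for per-image one-way mirroring, is
roughly the following; please correct me if the site-a vs. site-b placement
of each step is wrong:

# on both clusters, for each pool to be mirrored:
rbd mirror pool enable <pool> image
# register the other cluster as a peer, at least on the cluster that will
# run the rbd-mirror daemon and receive the replicas:
rbd mirror pool peer add <pool> client.rbd-mirror-peer@<remote-cluster>
# then per image, as workloads are migrated (journal-based mirroring also
# needs the journaling feature):
rbd feature enable <pool>/<image> exclusive-lock journaling
rbd mirror image enable <pool>/<image>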

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] July's User + Developer Monthly Meeting

2024-07-11 Thread Noah Lehman
Hi Ceph users,

July's User + Developer Monthly meeting will be happening Wednesday, July
24th at 10AM EDT. We look forward to seeing you there!

Event details: https://hubs.la/Q02GgVNb0

Best,

Noah
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Drew Weaver
Hi,

I'm a bit confused by your question. The 'drive bays' or backplane are the same
for an NVMe system; it's either a combined SATA/SAS/NVMe backplane or an
NVMe-only backplane.

I don't understand why you believe that my configuration has to be 3.5" as it 
isn't. It's a 16x2.5" chassis with two H755N controllers (one for each set of 8 
drives).

The H755N controller is a hardware raid adapter for NVMe.

I hope this clarifies the confusion.

Thank you,
-Drew


-Original Message-
From: Frank Schilder  
Sent: Thursday, July 11, 2024 9:57 AM
To: Drew Weaver ; 'ceph-users@ceph.io' 

Subject: Re: Repurposing some Dell R750s for Ceph

Hi Drew,

as far as I know Dell's drive bays for RAID controllers are not the same as the 
drive bays for CPU attached disks. In particular, I don't think they have that 
config for 3.5" drive bays and your description sounds a lot like that's what 
you have. Are you trying to go from 16x2.5" HDD to something like 24xNVMe?

Maybe you could provide a bit more information here, like (links to) the wiring 
diagrams you mentioned? From the description I cannot entirely deduce what 
exactly you have and where you want to go to.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Drew Weaver 
Sent: Thursday, July 11, 2024 3:16 PM
To: 'ceph-users@ceph.io'
Subject: [ceph-users] Repurposing some Dell R750s for Ceph

Hello,

We would like to repurpose some Dell PowerEdge R750s for a Ceph cluster.

Currently the servers have one H755N RAID controller for each 8 drives. (2 
total)

I have been asking their technical support what needs to happen in order for us 
to just rip out those raid controllers and cable the backplane directly to the 
motherboard/PCIe lanes and they haven't been super enthusiastic about helping 
me. I get it just buy another 50 servers, right? No big deal.

I have the diagrams that show how each configuration should be connected, I 
think I just need the right cable(s), my question is has anyone done this work 
before and was it worth it?

Also bonus if anyone has an R750 that has the drives directly connected to the 
backplane and can find the part number of the cable that connects the backplane 
to the motherboard I would greatly appreciate that part number. My sales guys 
are "having a hard time locating it".

Thanks,
-Drew

___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Drew Weaver
Hi,

Isn’t the supported/recommended configuration to use an HBA if you have to but 
never use a RAID controller?

The backplane is already NVMe as the drives installed in the system currently 
are already NVMe.

Also, I was looking through some diagrams of the R750, and it appears that if you 
order them with the RAID controller(s), the bandwidth between the backplane and 
the system is hamstrung to some degree because of the cables they are using, so 
even if I could configure them in non-RAID mode it would still be suboptimal.

Thanks for the information.
-Drew


From: John Jasen 
Sent: Thursday, July 11, 2024 10:06 AM
To: Drew Weaver 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Repurposing some Dell R750s for Ceph

retrofitting the guts of a Dell PE R7xx server is not straightforward. You 
could be looking into replacing the motherboard, the backplane, and so forth.

You can probably convert the H755N card to present the drives to the OS, so you 
can use them for Ceph. This may be AHCI mode, pass-through mode, non-RAID 
device, or some other magic words in the raid configuration utility.

This should be in the raid documentation, somewhere.



On Thu, Jul 11, 2024 at 9:17 AM Drew Weaver <drew.wea...@thenap.com> wrote:
Hello,

We would like to repurpose some Dell PowerEdge R750s for a Ceph cluster.

Currently the servers have one H755N RAID controller for each 8 drives. (2 
total)

I have been asking their technical support what needs to happen in order for us 
to just rip out those raid controllers and cable the backplane directly to the 
motherboard/PCIe lanes and they haven't been super enthusiastic about helping 
me. I get it just buy another 50 servers, right? No big deal.

I have the diagrams that show how each configuration should be connected, I 
think I just need the right cable(s), my question is has anyone done this work 
before and was it worth it?

Also bonus if anyone has an R750 that has the drives directly connected to the 
backplane and can find the part number of the cable that connects the backplane 
to the motherboard I would greatly appreciate that part number. My sales guys 
are "having a hard time locating it".

Thanks,
-Drew

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Drew Weaver
Hi,

>I don't think the motherboard has enough PCIe lanes to natively connect all 
>the drives: the RAID controller effectively functioned as a expander, so you 
>needed less PCIe lanes on the motherboard.
>As the quickest way forward: look for passthrough / single-disk / RAID0 
>options, in that order, in the controller management tools (perccli etc).
>I haven't used the N variant at all, and since it's NVME presented as 
>SCSI/SAS, I don't want to trust the solution of reflashing the controller for 
>IT (passthrough) mode.

Reviewing the diagrams of the system with the H755N and without the H755N it 
looks like the PCIe lanes are limited by the cables they use when you order 
them with these RAID controllers.  (which I guess is why in order to put 16x 
drives in a system you need two controllers). It looks like 

As far as passthrough RAID0 disks, etc I was told that is not a valid 
configuration and wouldn't be supported.

Thanks for the information.
-Drew
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread pe...@boku.net
I've replaced R640 drive backplanes (off ebay) to use U.2 NVMe instead of
RAID. Yes, I had to replace the backplane in order to talk to NVMe and in
that work it removes exposure to RAID.

peter

On 7/11/24, 2:25 PM, "Drew Weaver"  wrote:
> Isn't the supported/recommended configuration to use an HBA if you have to
> but never use a RAID controller?
> The backplane is already NVMe as the drives installed in the system
> currently are already NVMe.
> [...]

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm for Ubuntu 24.04

2024-07-11 Thread Tim Holloway


Just my €.02. There is, in fact, a cephadm package for the Raspberry Pi
OS. If I read the synopsis correctly, it's for ceph 16.2.11, which I
think is the same release of Ceph Pacific that I'm presently running my
own farm on. It appears to derive off Debian Bookworm.

Since cephadm is mainly a program to control other programs, I'd not be
surprised if it can at least partly manage newer ceph systems. Then
again, I don't know what it uses for interfaces beyond its ability to
pull down and setup ceph containers.

For my newer systems, cephadm is the only package I install and I use
cephadm shell. For general management, I do like to install the ceph-
common to get the "ceph" command down to the bash shell level without
invoking cephadm. Everything else, I haven't found a use for.

I do like managed packages because it's useful to have them show up in
a software inventory. However, once I have the cephadm package
installed, I prefer to let it manage the rest of the ceph
infrastructure.

  Best Regards,
 Tim
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [RFC][UADK integration][Acceleration of zlib compressor]

2024-07-11 Thread Brad Hubbard
On Thu, Jul 11, 2024 at 10:42 PM Rongqi Sun  wrote:
>
> Hi Ceph community,

Hi Rongqi,

Thanks for proposing this and for attending CDM to discuss it
yesterday. I see we have received some good feedback in the PR and
it's awaiting some suggested changes. I think this will be a useful
and performant addition to the code base and encourage its inclusion.

>
> UADK is an open source accelerator framework, the kernel support part is
> UACCE , which has been
> merged in Kernel for several years, targeting to provide Shared Virtual
> Addressing (SVA) between accelerators and processes. UADK provider users
> with a unified programming interface to efficiently harness the
> capabilities of hardware accelerators. It furnishes users with fundamental
> library and driver support. Now UADK can offer the hardware accelerator
> engine(e.g: Kunpeng KAE), Arm SVE and Crypto Extension CPU. UADK
>  is hosted by Linaro.
>
> UADK now has already supports different communities for compressor and
> encryption, such as OpenSSL, DPDK and SPDK, so now, we would like to bring
> it to Ceph for Acceleration of  zlib compressor. As first step, we can see
> that,
>
>1. save almost 50% cpu usage compared with no-isal compression in RBD 4M
>workload
>2. save almost 40% cpu usage compared with no-isal compression in RGW
>put op (4M) workload
>
> The PR  is under review, welcome
> any comments or reviews.
>
> Have a nice day~
>
> Rongqi Sun
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Cheers,
Brad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] v19.1.0 Squid RC0 released

2024-07-11 Thread Yuri Weinstein
This is the first release candidate for Squid.

Feature highlights:

RGW
The User Accounts feature unlocks several new AWS-compatible IAM APIs
  for the self-service management of users, keys, groups, roles, policy and
  more.

RADOS
BlueStore has been optimized for better performance in snapshot-intensive
workloads. BlueStore RocksDB LZ4 compression is now enabled by default to
improve average performance and "fast device" space usage. Other
improvements include more flexible EC configurations, an OpTracker to help
debug mgr module issues, and better scrub scheduling.

Dashboard
* Rearranged Navigation Layout: The navigation layout has been reorganized
  for improved usability and easier access to key features.

CephFS Improvements
  * Support for managing CephFS snapshots and clones, as well as snapshot
    schedule management
  * Manage authorization capabilities for CephFS resources
  * Helpers on mounting a CephFS volume

RGW Improvements
  * Support for managing bucket policies
  * Add/Remove bucket tags
  * ACL Management
  * Several UI/UX Improvements to the bucket form
* Monitoring: Grafana dashboards are now loaded into the container at
  runtime rather than building a Grafana image with the dashboards baked in.
  Official Ceph Grafana images can be found in quay.io/ceph/grafana.
* Monitoring: RGW S3 Analytics: A new Grafana dashboard is now available,
  enabling you to visualize per-bucket and per-user analytics data,
  including total GETs, PUTs, Deletes, Copies, and list metrics.

Crimson/Seastore
Crimson's first tech preview release!
Supporting RBD workloads on Replicated pools.
For more information please visit: https://ceph.io/en/news/crimson

If any of our community members would like to help us with performance
investigations or regression testing of the Squid release candidate,
please feel free to provide feedback via email or in
https://pad.ceph.com/p/squid_scale_testing. For more active
discussions, please use the #ceph-at-scale slack channel in
ceph-storage.slack.com.

* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-19.1.0.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: 9025b9024baf597d63005552b5ee004013630404
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Anthony D'Atri


> 
> Isn’t the supported/recommended configuration to use an HBA if you have to 
> but never use a RAID controller?

That may be something I added to the docs.  My contempt for RAID HBAs knows no 
bounds ;)

Ceph doesn’t care.  Passthrough should work fine; I’ve done that for tens of
thousands of OSDs, albeit on different LSI HBA SKUs.
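
A quick sanity check that the controller really is presenting plain block
devices before you build OSDs, roughly (assuming a cephadm-managed host):

  lsblk -d -o NAME,MODEL,TRAN,SIZE,ROTA   # drives should appear as plain sd*/nvme* devices
  cephadm ceph-volume inventory           # what ceph-volume considers usable
  ceph orch device ls                     # what the orchestrator will offer for OSDs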
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Help with Mirroring

2024-07-11 Thread Anthony D'Atri
> 
> I would like to use mirroring to facilitate migrating from an existing
> Nautilus cluster to a new cluster running Reef.  Right now I'm looking at
> RBD mirroring.  I have studied the RBD Mirroring section of the
> documentation, but it is unclear to me which commands need to be issued on
> each cluster and, for commands that have both clusters as arguments, when
> to specify site-a where vs. site-b.

I won’t go into the nitty-gritty, but note that you’ll likely run the 
rbd-mirror daemon on the destination cluster, and it will need reachability to 
all of the source cluster’s mons and OSDs.  Maybe mgrs, not sure.

> Another concern:  Both the old and new cluster internally have the default
> name 'Ceph' - when I set up the second cluster I saw no obvious reason to
> change from the default.  If these will cause a problem with mirroring, is
> there a workaround?

The docs used to imply that the clusters need to have distinct vanity names, 
but that was never actually the case — and vanity names are no longer supported 
for clusters.

The ceph.conf files for both clusters need to be distinct and present on the 
system where rbd-mirror runs.  You can do this by putting them in different 
subdirectories or calling them like cephsource.conf and cephdest.conf.  The 
filenames are arbitrary, you’ll just have to specify them when setting up 
rbd-mirror peers.
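
Very roughly, and hedged because with a Nautilus source you're limited to
journal-based mirroring and the exact peer syntax varies by release (pool,
image, and client names below are placeholders), a one-way setup has this
shape:

  # on the source (Nautilus) cluster
  rbd -c /etc/ceph/cephsource.conf mirror pool enable rbd image
  rbd -c /etc/ceph/cephsource.conf feature enable rbd/vm-disk-1 journaling
  rbd -c /etc/ceph/cephsource.conf mirror image enable rbd/vm-disk-1

  # on the destination (Reef) cluster, where rbd-mirror runs
  rbd -c /etc/ceph/cephdest.conf mirror pool enable rbd image
  rbd -c /etc/ceph/cephdest.conf mirror pool peer add rbd client.rbd-mirror-peer@cephsource
  rbd -c /etc/ceph/cephdest.conf mirror pool status rbd --verbose

With that peer spec, the rbd-mirror daemon expects /etc/ceph/cephsource.conf
and a matching cephsource.client.rbd-mirror-peer.keyring on its host; newer
releases can instead store the remote mon_host/key as peer attributes or use
bootstrap tokens.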


> In the long run I will also be migrating a bunch of RGW data.  If there are
> advantages to using mirroring for this I'd be glad to know.

Whole different ballgame.  You can use multisite or rclone or the new Clyso 
“Chorus” tool for that.
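
If you go the rclone route, it's basically this per bucket (the remote names
are placeholders you'd define beforehand with rclone config as S3 remotes
pointing at each RGW endpoint):

  rclone sync oldrgw:my-bucket newrgw:my-bucket --progress --transfers 16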

> (BTW, the plan is to gradually decommission the systems from the old
> cluster and add them to the new cluster.  In this context, I am looking to
> enable and disable mirroring on specific RBD images and RGW buckets as the
> client workload is migrated from accessing the old cluster to accessing the
> new.

I’ve migrated thousands of RBD volumes between clusters this way.  It gets a 
bit tricky if a volume is currently attached.

> 
> Thanks.
> 
> -Dave
> 
> --
> Dave Hall
> Binghamton University
> kdh...@binghamton.edu
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: AssumeRoleWithWebIdentity in RGW with Azure AD

2024-07-11 Thread Pritha Srivastava
This is very helpful, I'll take a look at it.

Thanks,
Pritha

On Thu, Jul 11, 2024 at 8:04 PM Ryan Rempel  wrote:

> Thanks!
>
> I took a crack at it myself, and have some work-in-progress here:
>
> https://github.com/cmu-rgrempel/ceph/pull/1
>
> Feel free to use any of that if you like it. It's working for me, but I've
> only tested it with Azure AD – I haven't tested the cases that it used to
> work for. (I believe it doesn't break them, but haven't tested).
>
> --
>
> Ryan Rempel
>
>
> --
> *From:* Pritha Srivastava 
> *Sent:* Monday, July 8, 2024 10:38 PM
>
> Hi Ryan,
>
> This appears to be a known issue and is tracked here:
> https://tracker.ceph.com/issues/54562. There is a workaround mentioned in
> the tracker that has worked and you can try that. Otherwise, I will be
> working on this 'invalid padding' problem very soon.
>
> Thanks,
> Pritha
>
> On Tue, Jul 9, 2024 at 1:16 AM Ryan Rempel  wrote:
>
> I'm trying to setup the OIDC provider for RGW so that I can have roles
> that can be assumed by people logging into their regular Azure AD
> identities. The client I'm planning to use is Cyberduck – it seems like one
> of the few GUI S3 clients that manages the OIDC login process in a way that
> could work for relatively naive users.
>
> I've gotten a fair ways down the road. I've been able to configure
> Cyberduck so that it performs the login with Azure AD, gets an identity
> token, and then sends it to Ceph to engage with the
> AssumeRoleWithWebIdentity process. However, I then get an error, which
> shows up in the Ceph rgw logs like this:
>
> 2024-07-08T17:18:09.749+ 7fb2d7845700  0 req 15967124976712370684
> 1.284013867s sts:assume_role_web_identity Signature validation failed: evp
> verify final failed: 0 error:0407008A:rsa
> routines:RSA_padding_check_PKCS1_type_1:invalid padding
>
> I turned the logging for rgw up to 20 to see if I could follow along to
> see how much of the process succeeds and learn more about what fails. I can
> then see logging messages from this file in the source code:
>
>
> https://github.com/ceph/ceph/blob/08d7ff952d78d1bbda04d5ff7e3db1e733301072/src/rgw/rgw_rest_sts.cc
>
> We get to WebTokenEngine::get_from_jwt, and it logs the JWT payload in a
> way that seems to be as expected. The logs then indicate that a request is
> sent to the /.well-known/openid-configuration endpoint that appears to be
> appropriate for the issuer of the JWT. The logs eventually indicate what
> looks like a successful and appropriate response to that. The logs then
> show that a request is sent to the jwks_uri that is indicated in the
> openid-configuration document. The response to that is logged, and it
> appears to be appropriate.
>
> We then get some logging starting with "Certificate is", so it looks like
> we're getting as far as WebTokenEngine::validate_signature. So, several
> things appear to have happened successfully – we've loaded the OIDC
> provider that corresponds to the iss, and we've found a client ID that
> corresponds to what I registered when I configured things. (This is why I
> say we appear to be a fair ways down the road – a lot of this is working).
>
> It looks as though what's happening in the code now is that it's iterating
> through the certificates given in the jwks_uri content. There are 6
> certificates listed, but the code only gets as far as the first one.
> Looking at the code, what appears to be happening is that, among the
> various certificates in the jwks_uri, it's finding the first one which
> matches a thumbprint registered with Ceph (that is, which I registered with
> Ceph). This must be succeeding (for the first certificate), because the
> "Signature validation failed" logging comes later. So, the code does verify
> that the thumbprint of the first certificate matches one of the thumbprints
> I registered with Ceph for this OIDC provider.
>
> We then get to a part of the code where it tries to verify the JWT using
> the certificate, with jwt::verify. Given what gets logged ("Signature
> validation failed:"), this must be throwing an exception.
>
> The thing I find surprising about this is that there really isn't any
> reason to think that the first certificate listed in the jwks_uri content
> is going to be the certificate used to sign the JWT. If I understand JWT
> correctly, it's appropriate to sign the JWT with any of the certificates
> listed in the jwks_uri content. Furthermore, the JWT header includes a
> reference to the kid, so it's possible for Ceph to know exactly which
> certificate the JWT purports to be signed by. And, Ceph knows that there
> might be multiple thumbprints, because we can register 5. So, the logic of
> trying the first valid certificate in x5c and then stopping if it fails
> seems broken, actually.
>
> I suppose what I could do as a workaround is try to figure out whether
> Azure AD is consistently using the same kid to sign the JWTs for me, and
> then only register that thumbprint with Ceph.
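
For reference, a rough way to check which kid Azure is actually signing
with, and what its SHA-1 thumbprint is (the tenant ID below is a
placeholder, and base64url padding may need fixing up before decoding):

  # 1) the kid is in the first dot-separated segment of the JWT header:
  #    echo "$JWT" | cut -d. -f1 | tr '_-' '/+' | base64 -d   # add '=' padding if needed

  # 2) SHA-1 thumbprint of each signing cert Azure publishes:
  curl -s "https://login.microsoftonline.com/<tenant-id>/discovery/v2.0/keys" |
    jq -r '.keys[] | [.kid, .x5c[0]] | @tsv' |
    while IFS=$'\t' read -r kid x5c; do
      printf '%s: ' "$kid"
      printf '%s' "$x5c" | base64 -d | openssl x509 -inform DER -noout -fingerprint -sha1
    done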

[ceph-users] Re: reef 18.2.3 QE validation status

2024-07-11 Thread Venky Shankar
Hi Yuri,

On Wed, Jul 10, 2024 at 7:28 PM Yuri Weinstein  wrote:
>
> We built a new branch with all the cherry-picks on top
> (https://pad.ceph.com/p/release-cherry-pick-coordination).
>
> I am rerunning fs:upgrade:
> https://pulpito.ceph.com/yuriw-2024-07-10_13:47:23-fs:upgrade-reef-release-distro-default-smithi/
>
> Venky, pls review it after it's done.

Thanks for retesting


https://pulpito.ceph.com/yuriw-2024-07-11_17:23:59-fs:upgrade-reef-release-distro-default-smithi/

Test run looks good!

>
> Neha, do you want to upgrade gibba/LRC to this build? (note there is
> no pre-release 18.2.4 built yet)
>
> On Tue, Jul 9, 2024 at 6:41 AM Casey Bodley  wrote:
> >
> > this was discussed in the ceph leadership team meeting yesterday, and
> > we've agreed to re-number this release to 18.2.4
> >
> > On Wed, Jul 3, 2024 at 1:08 PM  wrote:
> > >
> > >
> > > On Jul 3, 2024 5:59 PM, Kaleb Keithley  wrote:
> > > >
> > > >
> > > >
> > >
> > > > Replacing the tar file is problematic too, if only because it's a 
> > > > potential source of confusion for people who aren't paying attention.
> > >
> > > That would really be the worst thing to do.
> > >
> > > > I'm not sure I believe that making this next release 18.2.4 really 
> > > > solves anything
> > >
> > > It solves *my* problem that the old version of the file is already in the 
> > > Debian archive and cannot be replaced there. By all means, please find a 
> > > better solution for long term. In the mean time, do *not* re-release an 
> > > already released tarball.
> > >
> > > Cheers,
> > >
> > > Thomas Goirand (zigo)
> > >
> >
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io



-- 
Cheers,
Venky
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Help with Mirroring

2024-07-11 Thread Eugen Block

Hi,

just one question coming to mind: if you intend to migrate the images
separately, is it really necessary to set up mirroring? You could just
'rbd export' on the source cluster and 'rbd import' on the destination
cluster.
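
Something along these lines, piping export straight into import so no
intermediate file is needed (pool and image names are just examples, and
the conf file names follow the scheme mentioned below):

  rbd -c /etc/ceph/cephsource.conf export rbd/vm-disk-1 - \
    | rbd -c /etc/ceph/cephdest.conf import - rbd/vm-disk-1

For volumes that keep changing during the migration, 'rbd export-diff' and
'rbd import-diff' against a common snapshot can be used to catch up
incrementally.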



Zitat von Anthony D'Atri :



I would like to use mirroring to facilitate migrating from an existing
Nautilus cluster to a new cluster running Reef.  Right now I'm looking at
RBD mirroring.  I have studied the RBD Mirroring section of the
documentation, but it is unclear to me which commands need to be issued on
each cluster and, for commands that have both clusters as arguments, when
to specify site-a where vs. site-b.


I won’t go into the nitty-gritty, but note that you’ll likely run  
the rbd-mirror daemon on the destination cluster, and it will need  
reachability to all of the source cluster’s mons and OSDs.  Maybe  
mgrs, not sure.



Another concern:  Both the old and new cluster internally have the default
name 'Ceph' - when I set up the second cluster I saw no obvious reason to
change from the default.  If these will cause a problem with mirroring, is
there a workaround?


The docs used to imply that the clusters need to have distinct  
vanity names, but that was never actually the case — and vanity  
names are no longer supported for clusters.


The ceph.conf files for both clusters need to be distinct and  
present on the system where rbd-mirror runs.  You can do this by  
putting them in different subdirectories or calling them like  
cephsource.conf and cephdest.conf.  The filenames are arbitrary,  
you’ll just have to specify them when setting up rbd-mirror peers.




In the long run I will also be migrating a bunch of RGW data.  If there are
advantages to using mirroring for this I'd be glad to know.


Whole different ballgame.  You can use multisite or rclone or the  
new Clyso “Chorus” tool for that.



(BTW, the plan is to gradually decommission the systems from the old
cluster and add them to the new cluster.  In this context, I am looking to
enable and disable mirroring on specific RBD images and RGW buckets as the
client workload is migrated from accessing the old cluster to accessing the
new.


I’ve migrated thousands of RBD volumes between clusters this way.   
It gets a bit tricky if a volume is currently attached.




Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io