[ceph-users] Re: squid 19.2.2 - osd_memory_target_autotune - best practices when host has lots of RAM

2025-08-01 Thread Steven Vacaroaia
Excellent info Anthony Many thanks Steven On Fri, 1 Aug 2025 at 09:29, Anthony D'Atri wrote: > > The servers are dedicated to Ceph > Yes, it is perhaps too much but my IT philosophy is "there is always room > for more RAM" as it usually helps running things faster

[ceph-users] Re: squid 19.2.2 - osd_memory_target_autotune - best practices when host has lots of RAM

2025-08-01 Thread Steven Vacaroaia
, it would probably make sense to tune it manually , no ? How would I check status of autotune ...other than checking individual OSD config ? Many thanks Steven On Thu, 31 Jul 2025 at 10:43, Anthony D'Atri wrote: > IMHO the autotuner is awesome. > > 1TB of RAM is an embarrassment of r
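
A minimal sketch of how the autotuner's state could be checked cluster-wide rather than per OSD, assuming a cephadm-managed cluster (the 0.7 ratio in the comment is cephadm's default, shown only for illustration):

    # is the autotuner enabled at all?
    ceph config get osd osd_memory_target_autotune
    # per-daemon targets the autotuner has written into the config database
    ceph config dump | grep osd_memory_target
    # fraction of each host's RAM cephadm hands to OSDs (default 0.7)
    ceph config get mgr mgr/cephadm/autotune_memory_target_ratio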

[ceph-users] Re: squid 19.2.2 deployed with cephadmin - no grafana data on some dashboards ( RGW, MDS)

2025-07-31 Thread Steven Vacaroaia
with something like this service_type: prometheus service_name: prometheus placement: hosts: - host01 - host02 networks: - 192.169.142.0/24 Is there a better fix ? Steven On Wed, 23 Jul 2025 at 11:14, Ryan Sleeth wrote: > Make sure Grafana's self-signed certs are permitted
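
The spec quoted above is flattened by the archive; a hedged reconstruction of what such a file could look like before feeding it to ceph orch apply -i (host names and network are the poster's examples):

    service_type: prometheus
    service_name: prometheus
    placement:
      hosts:
        - host01
        - host02
    networks:
      - 192.169.142.0/24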

[ceph-users] squid 19.2.2 - osd_memory_target_autotune - best practices when host has lots of RAM

2025-07-31 Thread Steven Vacaroaia
thanks Steven

[ceph-users] squid 19.2.2 - RGW performance tuning

2025-07-29 Thread Steven Vacaroaia
his is 1GB objecter inflight ops = 24576 Many thanks Steven
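
The thread refers to raising the objecter throttles for RGW; a hedged sketch of setting the two options through the config database (values are the poster's, not recommendations, and the client.rgw section is an assumption about how the RGW daemons are addressed):

    # ~1 GiB of in-flight objecter data plus a much higher op cap
    ceph config set client.rgw objecter_inflight_op_bytes 1073741824
    ceph config set client.rgw objecter_inflight_ops 24576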

[ceph-users] Squid 19.2.2 - mon_target_pg_per_osd change not applied

2025-07-24 Thread Steven Vacaroaia
... Any ideas or suggestions for properly applying the change would be appreciated Steven
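
mon_target_pg_per_osd feeds the pg_autoscaler, so a hedged sketch of applying and verifying a change (200 is only an illustrative value):

    ceph config set global mon_target_pg_per_osd 200
    ceph config get mon mon_target_pg_per_osd
    # the autoscaler only acts on pools whose pg_autoscale_mode is enabled
    ceph osd pool autoscale-status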

[ceph-users] Re: squid 19.2.2 - cannot bootstrap - error writing to /tmp/monmap (21) Is a directory

2025-07-24 Thread Steven Vacaroaia
removing both docker and cephadm, rebooting and then reinstalling fixed the issue Thank you for your willingness to help Steven On Thu, 24 Jul 2025 at 12:22, Adam King wrote: > I can't say I know why this is happening, but I can try to give some > context into what cephadm is do

[ceph-users] Re: squid 19.2.2 - cannot bootstrap - error writing to /tmp/monmap (21) Is a directory

2025-07-24 Thread Steven Vacaroaia
to 24.04.2 remove/reinstall docker delete /etc/ceph/*, /var/lib/ceph/*, /etc/systemd/system/ceph* pkill -9 -f ceph* reboot Thanks Steven On Thu, 24 Jul 2025 at 12:22, Adam King wrote: > I can't say I know why this is happening, but I can try to give some > c
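
The clean-up steps above, written out as a hedged shell sketch; the docker package name is an assumption for Ubuntu 24.04, and the sequence is destructive because it wipes all local Ceph state:

    apt-get purge -y docker.io && apt-get install -y docker.io   # assumed package name
    rm -rf /etc/ceph/* /var/lib/ceph/* /etc/systemd/system/ceph*
    pkill -9 -f ceph
    reboot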

[ceph-users] Re: squid 19.2.2 - cannot bootstrap - error writing to /tmp/monmap (21) Is a directory

2025-07-24 Thread Steven Vacaroaia
rapping I’m not sure about the z on the next line, but -v > can be considered like a bind mount of the first path so that container > sees it on the second path. > > Now, as to what happened to /tmp/ceph-tmp, I can’t say. > > > > On Jul 24, 2025, at 10:53 AM, Steven Vaca

[ceph-users] squid 19.2.2 deployed with cephadmin - no grafana data on some dashboards ( RGW, MDS)

2025-07-22 Thread Steven Vacaroaia
/ ideas for how to troubleshoot /fix this ? Many thanks Steven

[ceph-users] Re: Newby woes with ceph

2025-07-22 Thread Steven Vacaroaia
Hi Malte why " And do not use Ubuntu 24.04." please ? I just reinstalled my cluster and use 24.04 and 19.2.2. so , if need be, there is still time to redo / reconfigure Steven On Tue, 22 Jul 2025 at 04:05, Malte Stroem wrote: > Hello Stéphane, > > I think, you're mix

[ceph-users] Re: squid 19.2.2 - cannot remove 'unknown" OSD

2025-07-21 Thread Steven Vacaroaia
ing else that I can try (other than a reboot) ? Many thanks Steven On Mon, 21 Jul 2025 at 17:25, Anthony D'Atri wrote: > Look at /var/lib/ceph on ceph-host-7 for a leftover directory for osd.8 > > Also try > ceph osd crush remove osd.8 > ceph auth del osd.8 > ceph osd rm
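
For reference, the removal sequence from the reply as a hedged sketch, plus the orchestrator step sometimes needed on a cephadm cluster (osd.8 as in the thread):

    ceph osd crush remove osd.8
    ceph auth del osd.8
    ceph osd rm 8
    # if cephadm still reports the daemon:
    ceph orch daemon rm osd.8 --force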

[ceph-users] Re: squid 19.2.2 - discrepancies between GUI and CLI

2025-07-17 Thread Steven Vacaroaia
Thanks for the suggestion Unfortunately "ceph mgr fail" did not solve the issue Is there a better way to "fail" ? Steven On Thu, 17 Jul 2025 at 14:04, Anthony D'Atri wrote: > Try failing the mgr > > > On Jul 17, 2025, at 1:48 PM, Steven Vacaroaia

[ceph-users] Re: Rocky8 (el8) client for squid 19.2.2

2025-07-17 Thread Steven Vacaroaia
Awesome, thanks for the info! Steven On Thu, 17 Jul 2025 at 13:11, Malte Stroem wrote: > Hi Steven, > > there is no need for ceph-common. > > You can mount the CephFS with the mount command because the Ceph kernel > client is part of the kernel for a long time now.
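
A minimal sketch of a kernel-client mount on a host without ceph-common, assuming the classic mon-address syntax; the address, client name and key are placeholders:

    mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=myclient,secret=<base64 key from ceph auth get-key>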

[ceph-users] Rocky8 (el8) client for squid 19.2.2

2025-07-17 Thread Steven Vacaroaia
Hi, I noticed there is no client /rpms for Rocky8 (el8) on 19.2.2 repository Would ceph-common for reef allow me to mount cephfs file systems without issues ? Many thanks Steven

[ceph-users] Re: squid 19.2.2 - troubleshooting pgs in active+remapped+backfill - no pictures

2025-07-11 Thread Steven Vacaroaia
Thanks Anthony changing the scheduler will require to restart all OSDs , right using "ceph orch restart osd " Is this done in a staggering manner or I need to "stagger " them ? Steven On Fri, 11 Jul 2025 at 12:14, Anthony D'Atri wrote: > What you describe s
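
A hedged sketch of the scheduler change itself: osd_op_queue is only read at OSD start-up, so the daemons do need restarting, either through the service as in the post or one daemon at a time (service and daemon names depend on the deployment; ceph orch ls shows them):

    ceph config set osd osd_op_queue wpq      # or mclock_scheduler
    ceph orch restart osd                     # as in the post: restart the osd service
    # or pace it yourself, one daemon at a time:
    ceph orch daemon restart osd.8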

[ceph-users] squid 19.2.2 - troubleshooting pgs in active+remapped+backfill - no pictures

2025-07-11 Thread Steven Vacaroaia
ecovery_max_active_hdd ... etc) redeploying some of the OSDs that were "UP_PRIMARY but part of the backfill_wait PGs query the PGs and look for a "stuck reason" stop scrub and deep-scrub repair the PGs (some) change the pg_autoscale_mode to true

[ceph-users] Re: ceph squid - huge difference between capacity reported by "ceph -s" and "ceph df "

2025-06-30 Thread Steven Vacaroaia
s rotational: 0 size: 6000G:8000G service_id: nvme_osd crush_device_class: nvme_class placement: host_pattern: * spec: data_devices rotational: 0 size: 4000G:5500G Many thanks Steven On Sun, 29 Jun 2025 at 17:45, Anthony D'Atri wrote: > So you have NVMe SSD OSDs,
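
The second drivegroup in the message, reconstructed as a hedged sketch of the YAML that ceph orch apply -i would accept (values exactly as posted):

    service_type: osd
    service_id: nvme_osd
    crush_device_class: nvme_class
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        rotational: 0
        size: '4000G:5500G'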

[ceph-users] Re: ceph squid - huge difference between capacity reported by "ceph -s" and "ceph df "

2025-06-29 Thread Steven Vacaroaia
Hi Janne Thanks That makes sense since I have allocated 196GB for DB and 5 GB for WAL for all 42 spinning OSDs Again, thanks Steven On Sun, 29 Jun 2025 at 12:02, Janne Johansson wrote: > Den sön 29 juni 2025 kl 17:22 skrev Steven Vacaroaia : > >> Hi, >> >> I just built

[ceph-users] CEPH Reef - HDD with WAL and DB on NVME

2025-05-27 Thread Steven Vacaroaia
adm .? Is there a documented procedure for fixing this if it was not done from the beginning ? Many thanks Steven
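
For the deploy-time case, a hedged sketch of a cephadm OSD spec that places the DB (and, by default, the WAL) on the non-rotational devices; the id and filters are placeholders, and retrofitting an already-deployed OSD is a separate per-OSD operation not shown here:

    service_type: osd
    service_id: hdd_with_nvme_db
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        rotational: 1
      db_devices:
        rotational: 0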

[ceph-users] Re: reef upgrade 2.2 to 2.7 - slow operations in bluestore

2025-05-13 Thread Steven Vacaroaia
Hi, Thanks for your suggestions There is nothing significant in the log files except "transitioning to stray" ( see attached) Restarting the daemons does not help as, after a few minutes, they are complaining again ALL my HDD based OSDs on ALL 7 hosts are complaining

[ceph-users] reef upgrade 2.2 to 2.7 - slow operations in bluestore

2025-05-12 Thread Steven Vacaroaia
lp Anyone else having this issue ? Thanks Steven bdev_enable_discard: "true" # quote bdev_async_discard_threads: "1" # quote

[ceph-users] Re: Ceph reef ingress service - v4v6_flag undefined

2025-05-08 Thread Steven Vacaroaia
excellent resource Many thanks Steven On Thu, 8 May 2025 at 09:34, Anthony D'Atri wrote: > https://docs.ceph.com/en/latest/cephadm/services/monitoring/ May help > > On May 8, 2025, at 9:05 AM, Steven Vacaroaia wrote: > > Hi, > > I thought about that and disab

[ceph-users] Re: Ceph reef ingress service - v4v6_flag undefined

2025-05-08 Thread Steven Vacaroaia
e a way to tell ceph orch to use a specific version ? or is there a way to deploy daemons using a "pulled " version Many thanks Steven On Thu, 8 May 2025 at 05:03, Anthony D'Atri wrote: > Any chance you have a fancy network proxy ? > > On May 8, 2025, at 1:45 AM, Steven Vac

[ceph-users] Re: Ceph reef ingress service - v4v6_flag undefined

2025-05-07 Thread Steven Vacaroaia
ay.io/ceph Any hints ? Steven On Wed, 7 May 2025 at 15:24, Adam King wrote: > I would think you just need to remove the two instances of "{{ v4v6_flag > }}" from > https://github.com/ceph/ceph/blob/main/src/pybind/mgr/cephadm/templates/services/ingress/

[ceph-users] Re: Ceph reef ingress service - v4v6_flag undefined

2025-05-07 Thread Steven Vacaroaia
Hi Adam Thanks for offering to help yes, I did a ceph config set using below template as some people reported that it will help solve NFS HA issue ( e.g. haproxy.cfg deployed missing "check") Now neither NFS nor RGW works :-( How do I fix this ? thanks Steven https://github.com

[ceph-users] Ceph reef ingress service - v4v6_flag undefined

2025-05-07 Thread Steven Vacaroaia
Hi, I am unable to deploy ingress service because "v4v6_flag" is undefined I couldn't find any information about this flag The ingress.yaml file used is similar with this one Any help would be greatly appreciated Steven service_type: ingress service_id: rgw placement: ho
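
For comparison, a hedged sketch of an RGW ingress spec with the fields the stock haproxy/keepalived templates expect (names and addresses are placeholders); it is not a fix for an edited template:

    service_type: ingress
    service_id: rgw.default
    placement:
      hosts:
        - ceph-01
        - ceph-02
    spec:
      backend_service: rgw.default
      virtual_ip: 10.90.0.80/24
      frontend_port: 443
      monitor_port: 1967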

[ceph-users] ceph NFS reef - cephadm reconfigure haproxy

2025-05-05 Thread Steven Vacaroaia
Hi I am testing NFS trying to make sure that, deploying it as below will give me redundancy ceph nfs cluster create main "1 ceph-01,ceph-02" --port 2049 --ingress --virtual-ip 10.90.0.90 For what I have read so far ( and some of the posts on this list) the only way to get the NFS "surviving "
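
A hedged sketch of how the resulting pieces could be checked after that create command (cluster name as in the post):

    ceph nfs cluster info main    # virtual IP / port exposed by the ingress
    ceph orch ls                  # should list both nfs.main and ingress.nfs.main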

[ceph-users] recommendation for buying CEPH appliance

2024-06-18 Thread Steven Vacaroaia
Hi, Could you please recommend a vendor that sells CEPH appliances ? ( preferable a CEPH+PROXMOX) In USA or Canada would be great but , not necessary Among the ones I know are Eurostor ( Germany) 45drives ( Canada) Many thanks Steven

[ceph-users] Re: Persistent Bucket Notification performance

2022-11-24 Thread Steven Goodliff
stest media i have available ? On Thu, 24 Nov 2022 at 13:37, Yuval Lifshitz wrote: > Hi Steven, > When using synchronous (=non-persistent) notifications, the overall rate > is dependent on the latency between the RGW and the endpoint to which you > are sending the notifications. The prot

[ceph-users] Persistent Bucket Notification performance

2022-11-24 Thread Steven Goodliff
ainly down to being throttled by using 1 rgw rather than all the rgw's the async method allows. We would prefer to use persistent but can't get the throughput we need, any suggestions would be much appreciated. Thanks Steven

[ceph-users] Re: Ceph cluster shutdown procedure

2022-11-24 Thread Steven Goodliff
Hi, Thanks Eugen, I found some similar docs on the Redhat site as well and made an Ansible playbook to follow the steps. Cheers On Thu, 17 Nov 2022 at 13:28, Steven Goodliff wrote: > Hi, > > Is there a recommended way of shutting a cephadm cluster down completely? > > I tried u

[ceph-users] Ceph cluster shutdown procedure

2022-11-17 Thread Steven Goodliff
Hi, Is there a recommended way of shutting a cephadm cluster down completely? I tried using cephadm to stop all the services but hit the following message. "Stopping entire osd.osd service is prohibited" Thanks
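
A hedged outline of the commonly documented shutdown order: set the cluster flags, then stop services per host rather than the osd service as a whole (host order and the exact systemd target name depend on the deployment):

    ceph osd set noout
    ceph osd set norebalance
    ceph osd set norecover
    ceph osd set nobackfill
    ceph osd set nodown
    ceph osd set pause
    # then, on each host (gateways/clients first, MONs last):
    systemctl stop ceph.target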

[ceph-users] Re: RGW multi site replication performance

2022-09-28 Thread Steven Goodliff
Hi, From what I've discovered so far, with one bucket and one topic we max out on our system at around ~1k notifications a second, but multiple buckets with multiple topics (even if the topics all point to the same push endpoint) give more performance; still digging. Steve

[ceph-users] RGW multi site replication performance

2022-09-21 Thread Steven Goodliff
bject requests have finished. Are there any configuration options i can look at trying ? Thanks Steven Goodliff Global Relay

[ceph-users] Re: cephadm host maintenance

2022-07-14 Thread Steven Goodliff
. Cheers Steven Goodliff From: Robert Gallop Sent: 13 July 2022 16:55 To: Adam King Cc: Steven Goodliff; ceph-users@ceph.io Subject: Re: [ceph-users] Re: cephadm host maintenance This brings up a good follow on…. Rebooting in general for OS patching. I have not

[ceph-users] cephadm host maintenance

2022-07-13 Thread Steven Goodliff
witch active Mgrs with 'ceph mgr fail node2-cobj2-atdev1-nvan.ghxlvw' on one instance. should cephadm handle the switch ? thanks Steven Goodliff Global Relay
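
A hedged sketch of the maintenance workflow; maintenance enter is expected to refuse a host that still holds the active mgr, which is why the mgr fail comes first (the host name below is assumed from the daemon name in the post):

    ceph mgr fail node2-cobj2-atdev1-nvan.ghxlvw               # mgr daemon name, as in the post
    ceph orch host maintenance enter node2-cobj2-atdev1-nvan   # host name (assumed)
    # ... patch / reboot the host ...
    ceph orch host maintenance exit node2-cobj2-atdev1-nvan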

[ceph-users] Re: Very slow I/O during rebalance - options to tune?

2021-08-12 Thread Steven Pine
> osd_op_queue = wpq > > osd_op_queue_cut_off = high > > > Afaik, the default for osd_op_queue_cut_off was set to low by mistake > prior to Octopus. > > > Peter

[ceph-users] Re: [External Email] Re: Re: Failure Domain = NVMe?

2021-03-11 Thread Steven Pine
and room for error and bugs this can cause is not recommended. On Thu, Mar 11, 2021 at 3:27 PM Dave Hall wrote: > Steven, > > In my current hardware configurations each NVMe supports multiple OSDs. > In my earlier nodes it is 8 OSDs sharing one NVMe (which is also too > small). I

[ceph-users] Re: Failure Domain = NVMe?

2021-03-11 Thread Steven Pine

[ceph-users] Re: Failure Domain = NVMe?

2021-03-11 Thread Steven Pine

[ceph-users] Re: Small RGW objects and RADOS 64KB minimun size

2021-02-16 Thread Steven Pine
llocated. > > In terms of making this easier, we're looking to automate rolling format > changes across a cluster with cephadm in the future. > > Josh > > On 2/16/21 9:58 AM, Steven Pine wrote: > > Will there be a well documented strategy / method for changing block >

[ceph-users] Re: Small RGW objects and RADOS 64KB minimun size

2021-02-16 Thread Steven Pine
>> [1] https://docs.ceph.com/en/latest/radosgw/layout/ > >> [2] https://github.com/ceph/ceph/pull/32809 > >> [3] https://www.spinics.net/lists/ceph-users/msg45755.html > > -- > Loïc Dachary, Artisan Logiciel Libre

[ceph-users] Re: Speed of S3 Ceph gateways

2021-02-08 Thread Steven Pine
S3 gateways have much more computer power and bandwidth to internet then > it is used right now. > > Thank you > > Regards > Michal Strnad

[ceph-users] Re: NVMe and 2x Replica

2021-02-04 Thread Steven Pine
imilar number of NVMe drives > will bottleneck. Unless perhaps you have the misfortune of a chassis > manufacturer who for some reason runs NVMe PCI lanes *through* an HBA.

[ceph-users] Re: Documentation of older Ceph version not accessible anymore on docs.ceph.com

2020-11-24 Thread Steven Pine
older ceph version not accessible anymore on docs.ceph.com > >> > >> It's changed UI because we're hosting them on readthedocs.com now. See > >> the dropdown in the lower right corner.

[ceph-users] Re: The confusing output of ceph df command

2020-09-10 Thread Steven Pine
... > > The USED = 3 * STORED in 3-replica mode is completely right, but for EC > 4+2 pool > > (for default-fs-data0 ) > > the USED is not equal 1.5 * STORED, why...:(

[ceph-users] Re: Is it possible to mount a cephfs within a container?

2020-08-27 Thread steven prothero
Hello, octopus 15.2.4 just as a test, I put my OSDs each inside of a LXD container. Set up cephFS and mounted it inside a LXD container and it works. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph

[ceph-users] Re: pg stuck in unknown state

2020-08-26 Thread steven prothero
Hello, I started a new fresh ceph cluster and have the exact same problem and also the slow op warnings. I found this bug report that seems to be about this problem: https://158.69.68.89/issues/46743 "... mgr/devicehealth: device_health_metrics pool gets created even without any OSDs in the clus

[ceph-users] Resolving a pg inconsistent Issue

2020-08-14 Thread Steven Pine
pg repair attempt? Thank you for any suggestions or advice, -- Steven Pine

[ceph-users] Re: ceph rbd iscsi gwcli Non-existent images

2020-08-10 Thread Steven Vacaroaia
: layering, exclusive-lock, object-map, fast-diff, deep-flatten op_features: flags: create_timestamp: Thu Nov 29 13:56:28 2018 On Mon, 10 Aug 2020 at 09:21, Jason Dillaman wrote: > On Fri, Aug 7, 2020 at 2:37 PM Steven Vacaroaia wrote: > > > > Hi, >

[ceph-users] ceph rbd iscsi gwcli Non-existent images

2020-08-07 Thread Steven Vacaroaia
Hi, I would appreciate any help/hints to solve this issue iscis (gwcli) cannot see the images anymore This configuration worked fine for many months What changed was that ceph is "nearly full" I am in the process of cleaning it up ( by deleting objects from one of the pools) and I do see reads

[ceph-users] Re: Module 'cephadm' has failed: auth get failed: failed to find client.crash.ceph0-ote in keyring retval:

2020-07-22 Thread steven prothero
Hello, on my system it solved it but then a different node suddenly started the same error. I tried it on the new problem and it did not help. I notice on: https://tracker.ceph.com/issues/45726 it says resolved, but on next version v15.2.5

[ceph-users] Re: Help add node to cluster using cephadm

2020-07-21 Thread steven prothero
Hello, Yes, make sure docker & ntp is setup on the new node first. Also, make sure the public key is added on the new node and firewall is allowing it through

[ceph-users] Re: Help add node to cluster using cephadm

2020-07-21 Thread steven prothero
Hello, is podman installed on the new node? also make sure the NTP time sync is on for new node. The ceph orch checks those on the new node and then dies if not ready with an error like you see.

[ceph-users] Re: EC profile datastore usage - question

2020-07-20 Thread Steven Pine

[ceph-users] Re: Module 'cephadm' has failed: auth get failed: failed to find client.crash.ceph0-ote in keyring retval:

2020-07-06 Thread steven
Hello, I used: ceph auth ls to list all service keys and authorize configs to find that info and added it to the keyring file that was missing it. It worked for a few days, and now this morning the same error warning appeared, but the caps mgr = "profile crash" caps mon = "profile crash" are still in

[ceph-users] Re: Module 'cephadm' has failed: auth get failed: failed to find client.crash.ceph0-ote in keyring retval:

2020-07-05 Thread steven prothero
Hello, same here. My fix was to examine the keyring file in the misbehaving server and compare to a different server, I found the file had the key but was missing: caps mgr = "profile crash" caps mon = "profile crash" I added that back in and now it's OK. /var/lib/ceph/./crash.node1/keyring No
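
The same caps could be reapplied from the mon side instead of editing the keyring file; a hedged sketch with a placeholder hostname:

    ceph auth caps client.crash.node1 mon 'profile crash' mgr 'profile crash'
    ceph auth get client.crash.node1      # verify both caps are present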

[ceph-users] How to ceph-volume on remote hosts?

2020-06-23 Thread steven prothero
Hello, I am new to CEPH and on a few test servers attempting to setup and learn a test ceph system. I started off the install with the "Cephadm" option and it uses podman containers. Followed steps here: https://docs.ceph.com/docs/master/cephadm/install/ I ran the bootstrap, added remote hosts,
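
Under cephadm, ceph-volume is normally not run by hand on remote hosts; the orchestrator drives it. A minimal sketch, with placeholder host and device names:

    ceph orch host add host2 10.0.0.2
    ceph orch device ls                           # devices cephadm considers usable
    ceph orch daemon add osd host2:/dev/sdb       # create one OSD on a specific device
    ceph orch apply osd --all-available-devices   # or let cephadm consume everything eligible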

[ceph-users] Re: Space leak in Bluestore

2020-03-24 Thread Steven Pine
; I posted the same message in the issue tracker, > https://tracker.ceph.com/issues/44731 > > -- > Vitaliy Filippov

[ceph-users] Re: [EXTERNAL] How can I fix "object unfound" error?

2020-03-02 Thread Steven . Scheit
Can you share "ceph pg 6.36a query" output Steve On 3/2/20, 2:53 AM, "Simone Lazzaris" wrote: Hi there; I've got a ceph cluster with 4 nodes, each with 9 4TB drives. Last night a disk failed, and unfortunately this lead to a kernel panic on the hosting server (supermicro: ne