Excellent info Anthony
Many thanks
Steven
On Fri, 1 Aug 2025 at 09:29, Anthony D'Atri wrote:
>
> The servers are dedicated to Ceph
> Yes, it is perhaps too much, but my IT philosophy is "there is always room
> for more RAM", as it usually helps things run faster
, it would probably make sense to
tune it manually, no?
How would I check the status of autotune, other than checking individual OSD
configs?
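For reference, a rough way to check this from the CLI (a sketch, assuming a cephadm-managed cluster, and not the only way):
ceph config get osd osd_memory_target_autotune   # is autotuning enabled for OSDs?
ceph config dump | grep osd_memory_target        # per-daemon targets the autotuner has written
ceph orch ps --daemon-type osd                   # MEM LIM column shows the limit applied per daemon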
Many thanks
Steven
On Thu, 31 Jul 2025 at 10:43, Anthony D'Atri wrote:
> IMHO the autotuner is awesome.
>
> 1TB of RAM is an embarrassment of riches
with something like this
service_type: prometheus
service_name: prometheus
placement:
  hosts:
  - host01
  - host02
networks:
- 192.169.142.0/24
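A spec like the one above is normally applied and then checked with something like this (a sketch, assuming it is saved as prometheus.yaml):
ceph orch apply -i prometheus.yaml
ceph orch ls prometheus   # confirm the service picked up the placement and network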
Is there a better fix ?
Steven
On Wed, 23 Jul 2025 at 11:14, Ryan Sleeth wrote:
> Make sure Grafana's self-signed certs are permitted
thanks
Steven
his is 1GB
objecter inflight ops = 24576
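If those throttles are meant to apply cluster-wide, one hedged option is the config database rather than ceph.conf; a sketch, assuming the 1 GB figure refers to objecter_inflight_op_bytes:
ceph config set client objecter_inflight_ops 24576
ceph config set client objecter_inflight_op_bytes 1073741824   # 1 GB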
Many thanks
Steven
...
Any ideas or suggestions for properly applying the change would be
appreciated
Steven
Removing both docker and cephadm, rebooting, and then reinstalling fixed the
issue
Thank you for your willingness to help
Steven
On Thu, 24 Jul 2025 at 12:22, Adam King wrote:
> I can't say I know why this is happening, but I can try to give some
> context into what cephadm is do
to 24.04.2
remove/reinstall docker
delete /etc/ceph/*, /var/lib/ceph/*, /etc/systemd/system/ceph*
pkill -9 -f ceph*
reboot
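As a side note, if the goal is a full wipe, cephadm also has a supported teardown path; a sketch, with the cluster fsid as a placeholder:
cephadm rm-cluster --fsid <FSID> --force   # removes that cluster's daemons and its /var/lib/ceph data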
Thanks
Steven
On Thu, 24 Jul 2025 at 12:22, Adam King wrote:
> I can't say I know why this is happening, but I can try to give some
> c
rapping I’m not sure about the z on the next line, but -v
> can be considered like a bind mount of the first path so that container
> sees it on the second path.
>
> Now, as to what happened to /tmp/ceph-tmp, I can’t say.
>
>
> > On Jul 24, 2025, at 10:53 AM, Steven Vaca
/ ideas for how to troubleshoot/fix this?
Many thanks
Steven
Hi Malte
why " And do not use Ubuntu 24.04." please ?
I just reinstalled my cluster and use 24.04 and 19.2.2. so , if need be,
there is still time to redo / reconfigure
Steven
On Tue, 22 Jul 2025 at 04:05, Malte Stroem wrote:
> Hello Stéphane,
>
> I think, you're mix
Is there anything else that I can try (other than a reboot)?
Many thanks
Steven
On Mon, 21 Jul 2025 at 17:25, Anthony D'Atri wrote:
> Look at /var/lib/ceph on ceph-host-7 for a leftover directory for osd.8
>
> Also try
> ceph osd crush remove osd.8
> ceph auth del osd.8
> ceph osd rm osd.8
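For what it is worth, on recent releases those three steps can usually be collapsed into a single command; a sketch, using the same osd.8 as above:
ceph osd purge osd.8 --yes-i-really-mean-it   # crush remove + auth del + osd rm in one step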
Thanks for the suggestion
Unfortunately "ceph mgr fail" did not solve the issue
Is there a better way to "fail" ?
Steven
On Thu, 17 Jul 2025 at 14:04, Anthony D'Atri wrote:
> Try failing the mgr
>
> > On Jul 17, 2025, at 1:48 PM, Steven Vacaroaia
Awesome, thanks for the info!
Steven
On Thu, 17 Jul 2025 at 13:11, Malte Stroem wrote:
> Hi Steven,
>
> there is no need for ceph-common.
>
> You can mount the CephFS with the mount command because the Ceph kernel
> client has been part of the kernel for a long time now.
Hi,
I noticed there are no client RPMs for Rocky 8 (el8) in the 19.2.2 repository
Would ceph-common from Reef allow me to mount CephFS file systems without
issues?
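For reference, a minimal kernel-client mount that needs no Ceph packages at all; a sketch, with a placeholder monitor address, client name and key:
mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs -o name=myclient,secret=AQB...   # legacy mon-address syntax; recent kernels also accept the newer fsid-based syntax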
Many thanks
Steven
Thanks Anthony
Changing the scheduler will require restarting all OSDs, right,
using "ceph orch restart osd"?
Is this done in a staggered manner, or do I need to "stagger" them myself?
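A hedged sketch of what that could look like, assuming the change in question is the osd_op_queue scheduler and that restarting one daemon or one host at a time is how you stagger it:
ceph config set osd osd_op_queue wpq      # or mclock_scheduler; takes effect only after a restart
ceph orch restart osd.<service_name>      # restarts the whole OSD service (placeholder name)
ceph orch daemon restart osd.12           # or restart individual OSDs to stagger manually (example id)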
Steven
On Fri, 11 Jul 2025 at 12:14, Anthony D'Atri wrote:
> What you describe s
ecovery_max_active_hdd ... etc.)
redeploying some of the OSDs that were UP_PRIMARY but part of the
backfill_wait PGs
querying the PGs and looking for a "stuck reason"
stopping scrub and deep-scrub
repairing some of the PGs
changing pg_autoscale_mode to true
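A sketch of the corresponding commands for some of those steps; option names as above, the value, pool and PG id are examples only:
ceph config set osd osd_recovery_max_active_hdd 8
ceph osd set noscrub
ceph osd set nodeep-scrub
ceph pg repair 6.1a                      # example PG id
ceph osd pool set <pool> pg_autoscale_mode on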
    rotational: 0
    size: 6000G:8000G
service_id: nvme_osd
crush_device_class: nvme_class
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 0
    size: 4000G:5500G
Many thanks
Steven
On Sun, 29 Jun 2025 at 17:45, Anthony D'Atri wrote:
> So you have NVMe SSD OSDs,
Hi Janne
Thanks
That makes sense, since I have allocated 196 GB for DB and 5 GB for WAL for
all 42 spinning OSDs
Again, thanks
Steven
On Sun, 29 Jun 2025 at 12:02, Janne Johansson wrote:
> Den sön 29 juni 2025 kl 17:22 skrev Steven Vacaroaia :
>
>> Hi,
>>
>> I just built
adm .?
Is there a documented procedure for fixing this if it was not done from the
beginning ?
Many thanks
Steven
Hi,
Thanks for your suggestions
There is nothing significant in the log files except "transitioning to
stray" (see attached)
Restarting the daemons does not help as, after a few minutes, they are
complaining again
ALL my HDD-based OSDs on ALL 7 hosts are complaining
lp
Anyone else having this issue ?
Thanks
Steven
bdev_enable_discard: "true" # quote
bdev_async_discard_threads: "1" # quote
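If those two options are meant to apply to all OSDs rather than a single spec, a hedged alternative is the config database (both are OSD/BlueStore settings; the async discard threads option only exists on recent releases):
ceph config set osd bdev_enable_discard true
ceph config set osd bdev_async_discard_threads 1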
excellent resource
Many thanks
Steven
On Thu, 8 May 2025 at 09:34, Anthony D'Atri wrote:
> https://docs.ceph.com/en/latest/cephadm/services/monitoring/ May help
>
> On May 8, 2025, at 9:05 AM, Steven Vacaroaia wrote:
>
> Hi,
>
> I thought about that and disab
Is there a way to tell ceph orch to use a specific version?
Or is there a way to deploy daemons using a "pulled" version?
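A couple of hedged possibilities, assuming the goal is pinning the container image; the tag below is only an example:
ceph config set global container_image quay.io/ceph/ceph:v19.2.2    # pin the default image for new daemons
ceph orch daemon redeploy <daemon_name> quay.io/ceph/ceph:v19.2.2   # redeploy one daemon from a specific image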
Many thanks
Steven
On Thu, 8 May 2025 at 05:03, Anthony D'Atri wrote:
> Any chance you have a fancy network proxy ?
>
> On May 8, 2025, at 1:45 AM, Steven Vac
ay.io/ceph
Any hints ?
Steven
On Wed, 7 May 2025 at 15:24, Adam King wrote:
> I would think you just need to remove the two instances of "{{ v4v6_flag
> }}" from
> https://github.com/ceph/ceph/blob/main/src/pybind/mgr/cephadm/templates/services/ingress/
Hi Adam
Thanks for offering to help
Yes, I did a ceph config set using the template below,
as some people reported that it would help solve the NFS HA issue (e.g. the
deployed haproxy.cfg missing "check")
Now neither NFS nor RGW works :-(
How do I fix this?
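If the template was pushed through cephadm's template override mechanism, one hedged way back is to drop the override and redeploy; a sketch, assuming that mechanism and the service names used elsewhere in this thread:
ceph config-key rm mgr/cephadm/services/ingress/haproxy.cfg   # remove the custom haproxy template, if it was set this way
ceph orch redeploy ingress.rgw
ceph orch redeploy nfs.main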
thanks
Steven
https://github.com
Hi,
I am unable to deploy the ingress service because "v4v6_flag" is undefined
I couldn't find any information about this flag
The ingress.yaml file used is similar to the one below
Any help would be greatly appreciated
Steven
service_type: ingress
service_id: rgw
placement:
ho
Hi
I am testing NFS, trying to make sure that deploying it as below will give
me redundancy
ceph nfs cluster create main "1 ceph-01,ceph-02" --port 2049 --ingress
--virtual-ip 10.90.0.90
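A quick hedged check that the ingress and virtual IP actually landed, assuming the cluster name above:
ceph nfs cluster info main   # should list the backend ganesha daemons and the virtual IP
ceph orch ls ingress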
From what I have read so far (and some of the posts on this list),
the only way to get the NFS "surviving"
Hi,
Could you please recommend a vendor that sells Ceph appliances?
(preferably Ceph + Proxmox)
In the USA or Canada would be great, but not necessary
Among the ones I know are:
Eurostor (Germany)
45drives (Canada)
Many thanks
Steven
stest media I
have available?
On Thu, 24 Nov 2022 at 13:37, Yuval Lifshitz wrote:
> Hi Steven,
> When using synchronous (=non-persistent) notifications, the overall rate
> is dependent on the latency between the RGW and the endpoint to which you
> are sending the notifications. The prot
ainly down to being throttled by using 1 RGW rather than
all the RGWs the async method allows.
We would prefer to use persistent, but we can't get the throughput we need; any
suggestions would be much appreciated.
Thanks
Steven
Hi,
Thanks Eugen, I found some similar docs on the Red Hat site as well and made
an Ansible playbook to follow the steps.
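For anyone following along, those steps usually look roughly like this; a sketch only, not the vendor document verbatim:
ceph osd set noout
ceph osd set norecover
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set nodown
ceph osd set pause
# then stop the daemons host by host (OSD hosts first, MON/MGR hosts last), e.g. per host:
systemctl stop ceph.target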
Cheers
On Thu, 17 Nov 2022 at 13:28, Steven Goodliff wrote:
> Hi,
>
> Is there a recommended way of shutting a cephadm cluster down completely?
>
> I tried u
Hi,
Is there a recommended way of shutting a cephadm cluster down completely?
I tried using cephadm to stop all the services but hit the following
message.
"Stopping entire osd.osd service is prohibited"
Thanks
Hi,
From what I've discovered so far, with one bucket and one topic we max out on our
system at around ~1k notifications per second, but multiple buckets with multiple
topics (even if the topics all point to the same push endpoint) give more
performance; still digging.
Steve
bject requests have
finished. Are there any configuration options I can look at trying?
Thanks
Steven Goodliff
Global Relay
.
Cheers
Steven Goodliff
From: Robert Gallop
Sent: 13 July 2022 16:55
To: Adam King
Cc: Steven Goodliff; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephadm host maintenance
This brings up a good follow-on: rebooting in general for OS patching.
I have not
switch active Mgrs with 'ceph mgr
fail node2-cobj2-atdev1-nvan.ghxlvw'
on one instance. Should cephadm handle the switch?
thanks
Steven Goodliff
Global Relay
> > osd_op_queue = wpq
> > osd_op_queue_cut_off = high
>
>
> Afaik, the default for osd_op_queue_cut_off was set to low by mistake
> prior to Octopus.
>
>
> Peter
>
>
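A hedged way to check and pin that on an affected release (both options only take effect after the OSDs are restarted):
ceph config get osd osd_op_queue_cut_off
ceph config set osd osd_op_queue_cut_off high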
and room for
error and bugs this can cause is not recommended.
On Thu, Mar 11, 2021 at 3:27 PM Dave Hall wrote:
> Steven,
>
> In my current hardware configurations each NVMe supports multiple OSDs.
> In my earlier nodes it is 8 OSDs sharing one NVMe (which is also too
> small). I
llocated.
>
> In terms of making this easier, we're looking to automate rolling format
> changes across a cluster with cephadm in the future.
>
> Josh
>
> On 2/16/21 9:58 AM, Steven Pine wrote:
> > Will there be a well documented strategy / method for changing block
>
> >> [1] https://docs.ceph.com/en/latest/radosgw/layout/
> >> [2] https://github.com/ceph/ceph/pull/32809
> >> [3] https://www.spinics.net/lists/ceph-users/msg45755.html
> >
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
>
>
S3 gateways have much more compute power and bandwidth to the internet than
> is used right now.
>
> Thank you
>
> Regards
> Michal Strnad
imilar number of NVMe drives
> will bottleneck. Unless perhaps you have the misfortune of a chassis
> manufacturer who for some reason runs NVMe PCI lanes *through* an HBA.
>
>
>
older ceph version not accessible anymore on docs.ceph.com
> >>
> >> The UI has changed because we're hosting them on readthedocs.com now. See
> >> the dropdown in the lower right corner.
> >>
> >
> > ...
> >
> > The USED = 3 * STORED in 3-replica mode is completely right, but for an EC
> > 4+2 pool (for default-fs-data0)
> > the USED is not equal to 1.5 * STORED, why... :(
> >
> >
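For what it is worth, the expected overhead for EC k=4, m=2 is (k+m)/k = 6/4 = 1.5, so USED of roughly 1.5 * STORED. A ratio noticeably above 1.5 is often down to allocation granularity: each of the 6 chunks is rounded up to BlueStore's min_alloc_size, which inflates USED for small objects. This is a general observation, not a diagnosis of that particular pool.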
Hello,
Octopus 15.2.4.
Just as a test, I put each of my OSDs inside an LXD container. I set up
CephFS and mounted it inside an LXD container, and it works.
Hello,
I started a fresh new ceph cluster and have the exact same problem, and
also the slow op warnings.
I found this bug report that seems to be about this problem:
https://158.69.68.89/issues/46743
"... mgr/devicehealth: device_health_metrics pool gets created even
without any OSDs in the clus
pg repair attempt?
Thank you for any suggestions or advice,
--
Steven Pine
webair.com
*P* 516.938.4100 x
*E * steven.p...@webair.com
: layering, exclusive-lock, object-map, fast-diff,
deep-flatten
op_features:
flags:
create_timestamp: Thu Nov 29 13:56:28 2018
On Mon, 10 Aug 2020 at 09:21, Jason Dillaman wrote:
> On Fri, Aug 7, 2020 at 2:37 PM Steven Vacaroaia wrote:
> >
> > Hi,
> &
Hi,
I would appreciate any help/hints to solve this issue
iSCSI (gwcli) cannot see the images anymore
This configuration worked fine for many months
What changed is that Ceph is "nearly full"
I am in the process of cleaning it up (by deleting objects from one of the
pools)
and I do see reads
Hello,
on my system it solved it, but then a different node suddenly started showing
the same error. I tried it on the new problem and it did not help.
I notice on
https://tracker.ceph.com/issues/45726
it says resolved, but only in the next version, v15.2.5.
Hello,
Yes, make sure docker & NTP are set up on the new node first.
Also, make sure the public key is added on the new node and the firewall
is allowing it through
Hello,
Is podman installed on the new node? Also make sure NTP time sync
is on for the new node. ceph orch checks those on the new node and
then dies with an error like you see if they are not ready.
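A hedged pre-flight for a new node, assuming cephadm is already installed there and the last two commands run from an existing admin host (hostname and address are placeholders):
cephadm check-host                                  # run on the new node: verifies podman/docker, chrony/NTP, lvm2, hostname
ssh-copy-id -f -i /etc/ceph/ceph.pub root@newnode   # push the cluster's SSH public key
ceph orch host add newnode 10.0.0.5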
Hello,
I used:
ceph auth ls
to list all service keys and auth configs to find that info, and added it
to the keyring file that was missing it.
It worked for a few days, and now this morning the same error warning is back, but the
caps mgr = "profile crash"
caps mon = "profile crash"
are still in
Hello,
same here. My fix was to examine the keyring file on the misbehaving
server and compare it to a different server. I found the file had the key
but was missing:
caps mgr = "profile crash"
caps mon = "profile crash"
I added that back in and now it's OK.
/var/lib/ceph/./crash.node1/keyring
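One hedged way to re-sync that keyring from the cluster's auth database instead of hand-editing it, assuming the entity follows cephadm's client.crash.<host> naming (names and paths below are placeholders):
ceph auth get client.crash.node1                                            # confirm the caps stored by the mons
ceph auth caps client.crash.node1 mon 'profile crash' mgr 'profile crash'   # reset them if they drifted
ceph auth get client.crash.node1 -o /var/lib/ceph/<fsid>/crash.node1/keyring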
No
Hello,
I am new to Ceph and, on a few test servers, am attempting to set up and
learn a test Ceph system.
I started off the install with the "cephadm" option, which uses podman
containers.
I followed the steps here:
https://docs.ceph.com/docs/master/cephadm/install/
I ran the bootstrap, added remote hosts,
> I posted the same message in the issue tracker,
> https://tracker.ceph.com/issues/44731
>
> --
> Vitaliy Filippov
Can you share the "ceph pg 6.36a query" output?
Steve
On 3/2/20, 2:53 AM, "Simone Lazzaris" wrote:
Hi there;
I've got a ceph cluster with 4 nodes, each with 9 4TB drives.
Last night a disk failed, and unfortunately this led to a kernel panic on
the hosting server
(supermicro: ne