[ceph-users] Libvirt and Ceph: libvirtd tries to open random RBD images

2023-12-01 Thread Jayanth Reddy
Hello Users, We're using libvirt with KVM and the orchestrator is CloudStack. I already raised the issue with CloudStack at https://github.com/apache/cloudstack/issues/8211 but it appears to be at the libvirtd level. I did the same on the libvirt ML at https://lists.libvirt.org/archives/list/us...@lists.libvirt.org/thr
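[A quick way to narrow this down is to check whether the Ceph identity libvirtd uses can even see the images it tries to open. A minimal sketch, not from the thread; the client name "client.libvirt" and pool name "vms" are placeholders:]

    # Show the caps granted to the libvirt client key (assumed name: client.libvirt)
    ceph auth get client.libvirt

    # List the RBD images that identity can see in the assumed pool 'vms'
    rbd ls --id libvirt --pool vms

    # Cross-check which RBD-backed disks a guest is actually configured with
    virsh domblklist <domain>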

[ceph-users] Ceph 17.2.7 to 18.2.0 issues

2023-12-01 Thread pclark6063
Hi All, Recently I upgraded my cluster from Quincy to Reef. Everything appeared to go smoothly and without any issues arising. I was forced to power off the cluster, performing the usual procedures beforehand, and everything appears to have come back fine. Every service reports green across the
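[For reference, the "usual procedures" before a full cluster power-off are typically along these lines; a minimal sketch, not a definitive checklist, and the flags must be cleared again once everything is back up:]

    # Before shutdown: stop data movement and recovery
    ceph osd set noout
    ceph osd set norebalance
    ceph osd set norecover
    ceph osd set nobackfill

    # After power-on, once all OSDs are back up: clear the flags
    ceph osd unset noout
    ceph osd unset norebalance
    ceph osd unset norecover
    ceph osd unset nobackfill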

[ceph-users] Compilation failure when building Ceph on Ubuntu

2023-12-01 Thread Yong Yuan
Hi, I'm trying to build a DEBUG version of Ceph Reef on a virtual Ubuntu LTS 22.04 running on Lima by following the README in Ceph's GitHub repo. The build failed and the last CMake error was "g++-11: error: unrecognized command-line option '-Wimplicit-const-int-float-conversion'". Does anyone kn
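[-Wimplicit-const-int-float-conversion is a Clang-only warning flag that GCC 11 does not recognize, so one possible workaround is to configure the debug build with Clang instead of g++-11. A hedged sketch using the repo's do_cmake.sh wrapper, assuming clang/clang++ are installed:]

    # Configure a Debug build with Clang instead of GCC 11,
    # avoiding the Clang-only -Wimplicit-const-int-float-conversion flag
    sudo apt-get install -y clang
    ./do_cmake.sh -DCMAKE_BUILD_TYPE=Debug \
                  -DCMAKE_C_COMPILER=clang \
                  -DCMAKE_CXX_COMPILER=clang++
    cd build && ninja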

[ceph-users] Re: Stray host/daemon

2023-12-01 Thread Jeremy Hansen
Found my previous post regarding this issue. Fixed by restarting the mgr daemons. -jeremy > On Friday, Dec 01, 2023 at 3:04 AM, Me (jer...@skidrow.la) wrote: > I think I ran into this before but I forget the fix: > > HEALTH_WARN 1 stray host(s) with 1 daemon(s) not managed by cephadm > [WR
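[For anyone hitting the same warning, restarting or failing over the active mgr so cephadm refreshes its inventory might look like this; a sketch, daemon names will differ per cluster:]

    # Fail over to a standby mgr; cephadm re-inventories daemons afterwards
    ceph mgr fail

    # Or find and restart the mgr daemons individually (names will differ)
    ceph orch ps --daemon-type mgr
    ceph orch daemon restart mgr.<host>.<suffix>

    # Verify the stray-host warning clears
    ceph health detail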

[ceph-users] Re: ceph osd dump_historic_ops

2023-12-01 Thread E Taka
This small (Bash) wrapper around the "ceph daemon" command, especially the auto-completion with the TAB key, is quite helpful, IMHO: https://github.com/test-erik/ceph-daemon-wrapper On Fri, 1 Dec 2023 at 15:03, Phong Tran Thanh < tranphong...@gmail.com> wrote: > It works!!! > > Thanks Ka
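[Even without the wrapper, the admin socket can list which commands a daemon supports; a minimal sketch run on the node hosting the OSD:]

    # List all commands the admin socket of osd.8 accepts
    ceph daemon osd.8 help

    # Then call the one you want, e.g. the historic ops dump
    ceph daemon osd.8 dump_historic_ops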

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-12-01 Thread Igor Fedotov
Hi Yuri, Looks like it's not as critical and complicated as originally thought. A user has to change bluefs_shared_alloc_size to be exposed to the issue. So hopefully I'll submit a patch on Monday to close this gap and we'll be able to proceed. Thanks, Igor On 01/12/2023 18:16
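[For operators wondering whether they are exposed, the relevant setting can be inspected before the patch lands; a sketch, assuming the default has not been overridden:]

    # Show the effective value; per the note above, only clusters that changed
    # bluefs_shared_alloc_size from its default are expected to hit the issue
    ceph config get osd bluefs_shared_alloc_size

    # List any explicit overrides in the config database
    ceph config dump | grep bluefs_shared_alloc_size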

[ceph-users] Re: About ceph osd slow ops

2023-12-01 Thread Josh Baergen
Given that this is S3, are the slow ops on index or data OSDs? (You mentioned HDD, but I don't want to assume that means the OSD you mentioned is data.) Josh On Fri, Dec 1, 2023 at 7:05 AM VÔ VI wrote: > > Hi Stefan, > > I am running replicate x3 with a failure domain as host and setting > mi
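[One way to tell whether a slow OSD serves the RGW index pool or a data pool is to map its PGs back to pool IDs; a sketch, assuming default pool names and using osd.10 from this thread as the example:]

    # Pool IDs and names (the number before the dot in each PG ID is the pool ID)
    ceph osd pool ls detail

    # PGs currently mapped to the slow OSD; match their pool ID prefix
    ceph pg ls-by-osd osd.10

    # Or go the other way: which OSDs hold the index pool's PGs
    ceph pg ls-by-pool default.rgw.buckets.index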

[ceph-users] Re: How to identify the index pool real usage?

2023-12-01 Thread Anthony D'Atri
>> >> Today we had a big issue with slow ops on the NVMe drives which hold >> the index pool. >> >> Why does the NVMe show full if Ceph shows it as barely utilized? Which one >> should I believe? >> >> When I check ceph osd df it shows 10% usage of the OSDs (1x 2TB NVMe >> drive has 4 OSDs on it):

[ceph-users] Re: How to identify the index pool real usage?

2023-12-01 Thread David C.
Hi, It looks like a trim/discard problem. I would try my luck by enabling discard on one disk, to validate. I have no feedback on the reliability of the bdev_*_discard parameters. Maybe dig a little deeper into the subject, or see if anyone has any feedback...
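[If someone wants to test this, the BlueStore discard switch can be enabled on a single OSD first; a cautious sketch (osd.12 is a placeholder ID, and as noted above there is little field feedback on the reliability of these options, so treat it as an experiment):]

    # Enable discard on one OSD only, to validate behaviour before rolling out
    ceph config set osd.12 bdev_enable_discard true
    # Optionally issue discards asynchronously to reduce latency impact
    ceph config set osd.12 bdev_async_discard true

    # Restart that OSD so the bdev options take effect, then observe
    ceph orch daemon restart osd.12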

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-12-01 Thread Yuri Weinstein
Venky, pls review the test results for smoke and fs after the PRs were merged. Radek, Igor, Adam - any updates on https://tracker.ceph.com/issues/63618? Thx On Thu, Nov 30, 2023 at 8:08 AM Yuri Weinstein wrote: > > The fs PRs: > https://github.com/ceph/ceph/pull/54407 > https://github.com/ceph/

[ceph-users] How to identify the index pool real usage?

2023-12-01 Thread Szabo, Istvan (Agoda)
Hi, Today we had a big issue with slow ops on the NVMe drives which hold the index pool. Why does the NVMe show full if Ceph shows it as barely utilized? Which one should I believe? When I check ceph osd df it shows 10% usage of the OSDs (1x 2TB NVMe drive has 4 OSDs on it): ID CLASS WEIGHT

[ceph-users] Duplicated device IDs

2023-12-01 Thread Nicola Mori
Dear Ceph users, I am replacing some small disks on one of my hosts with bigger ones. I delete the OSD from the web UI, preserving the ID for replacement, then after the rebalancing is finished I change the disk and the cluster automatically re-creates the OSD with the same ID. Then I adjust t
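[For comparison, the cephadm-native way to replace a disk while keeping the OSD ID is the orchestrator's --replace flag; a sketch, with OSD ID 5 as an example:]

    # Drain and remove the OSD but keep its ID reserved for the replacement disk
    ceph orch osd rm 5 --replace --zap

    # Watch the draining/removal progress
    ceph orch osd rm status

    # After swapping the disk, cephadm (with an applied drive spec) recreates osd.5
    ceph orch device ls --refresh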

[ceph-users] Re: About ceph osd slow ops

2023-12-01 Thread VÔ VI
Hi Stefan, I am running replica x3 with a failure domain of host and the pool min_size set to 1. Because my cluster serves real-time S3 traffic and can't stop or block I/O, data may be lost but I/O must always be available. I hope my cluster can keep running with two nodes unavailable. After those two nodes went down at
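[As a point of reference, min_size=1 on a size-3 pool is what allows writes to be acknowledged with only one surviving replica, which is the trade-off behind the data-loss risk mentioned here. A sketch of checking and raising it; the pool name is only an example:]

    # Show size/min_size for every pool
    ceph osd pool ls detail

    # Raising min_size back to 2 trades availability during a double failure
    # for protection against acknowledged-but-lost writes
    ceph osd pool set default.rgw.buckets.data min_size 2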

[ceph-users] Re: ceph osd dump_historic_ops

2023-12-01 Thread Phong Tran Thanh
It works!!! Thanks Kai Stian Olstad On Fri, 1 Dec 2023 at 17:06, Kai Stian Olstad < ceph+l...@olstad.com> wrote: > On Fri, Dec 01, 2023 at 04:33:20PM +0700, Phong Tran Thanh wrote: > >I have a problem with my OSD, I want to show dump_historic_ops of an OSD > >I follow the guide: > > >

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-12-01 Thread Frank Schilder
Hi Xiubo, I uploaded a test script with session output showing the issue. When I look at your scripts, I can't see the stat-check on the second host anywhere. Hence, I don't really know what you are trying to compare. If you want me to run your test scripts on our system for comparison, please

[ceph-users] Stray host/daemon

2023-12-01 Thread Jeremy Hansen
I think I ran into this before but I forget the fix: HEALTH_WARN 1 stray host(s) with 1 daemon(s) not managed by cephadm [WRN] CEPHADM_STRAY_HOST: 1 stray host(s) with 1 daemon(s) not managed by cephadm stray host cn06.ceph.fu.intra has 1 stray daemons: ['mon.cn03'] Pacific 16.2.11 How do I cl

[ceph-users] Re: ceph osd dump_historic_ops

2023-12-01 Thread Kai Stian Olstad
On Fri, Dec 01, 2023 at 04:33:20PM +0700, Phong Tran Thanh wrote: I have a problem with my OSD, I want to show dump_historic_ops of an OSD I follow the guide: https://www.ibm.com/docs/en/storage-fusion/2.6?topic=alerts-cephosdslowops But when I run the command ceph daemon osd.8 dump_historic_ops show t

[ceph-users] Re: ceph osd dump_historic_ops

2023-12-01 Thread Robert Sander
On 12/1/23 10:33, Phong Tran Thanh wrote: ceph daemon osd.8 dump_historic_ops shows the error (the command was run on the node with osd.8): Can't get admin socket path: unable to get conf option admin_socket for osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid types are: auth, mon, os
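[In containerized (cephadm) deployments the admin socket lives inside the OSD's container, so two common ways around this error are to use ceph tell, which does not need the socket, or to enter the daemon's container; a sketch:]

    # Option 1: no admin socket needed, works from any node with an admin keyring
    ceph tell osd.8 dump_historic_ops

    # Option 2: enter the running osd.8 container, where the admin socket is reachable
    cephadm enter --name osd.8
    ceph daemon osd.8 dump_historic_ops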

[ceph-users] Re: About ceph osd slow ops

2023-12-01 Thread Stefan Kooman
On 01-12-2023 08:45, VÔ VI wrote: Hi community, My cluster is running with 10 nodes and 2 nodes went down; sometimes the log shows slow ops. What is the root cause? My OSDs are HDD, with block.db and WAL on a 500GB SSD per OSD. Health check update: 13 slow ops, oldest one blocked for 167 sec, osd.10

[ceph-users] ceph osd dump_historic_ops

2023-12-01 Thread Phong Tran Thanh
Hi community, I have a problem with my OSD: I want to show dump_historic_ops of an OSD. I follow the guide: https://www.ibm.com/docs/en/storage-fusion/2.6?topic=alerts-cephosdslowops But when I run the command ceph daemon osd.8 dump_historic_ops it shows the error (the command was run on the node with osd.8): Can't ge