[ceph-users] Re: Dump/Add users yaml/json

2024-12-04 Thread David C.
Hi, In this case, the tool that adds the account should perform a caps check (for security reasons) and probably use get-or-create/caps (not import). On Wed, Dec 4, 2024 at 10:42, Albert Shih wrote: > On 03/12/2024 at 18:27:57+0100, David C. wrote > Hi, > > > > > (o
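For reference, a minimal sketch of the get-or-create/caps approach (the client name, pool and caps below are illustrative placeholders, not taken from the thread):

   # create the key only if it does not already exist, with explicit caps
   ceph auth get-or-create client.app mon 'allow r' osd 'allow rw pool=app-pool'
   # adjust the caps of an existing entity later (this replaces all of its caps)
   ceph auth caps client.app mon 'allow r' osd 'allow rw pool=app-pool'

Unlike "ceph auth import", get-or-create never overwrites an existing key, which is what makes the caps check meaningful.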

[ceph-users] Re: Dump/Add users yaml/json

2024-12-03 Thread David C.
Hi Albert, (open question, without judgment) What is the purpose of importing users recurrently? It seems to me that import is the complement of export, to restore. Creating them in ceph and exporting (possibly) in json format is not enough? On Tue, Dec 3, 2024 at 13:29, Albert Shih wrote: > On
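A sketch of the export side mentioned above (the entity name is illustrative):

   # one entity, or all of them, in JSON for tooling
   ceph auth get client.app -f json
   ceph auth ls -f json > ceph-auth-backup.json
   # a keyring export that can later be restored with "ceph auth import -i"
   ceph auth export client.app -o client.app.keyring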

[ceph-users] Re: How to modify the destination pool name in the rbd-mirror configuration?

2024-11-26 Thread David Yang
ntion to the correct configs etc. > Each cluster has its own UUID and separate MONs, it should just work. > If not, let us know. ;-) > > Quoting David Yang: > > > Hi Eugen > > > > Do you mean that it is possible to create multiple clusters on one > > infrastr

[ceph-users] Re: How to modify the destination pool name in the rbd-mirror configuration?

2024-11-26 Thread David Yang
up clusters and configure mirroring on both of them. You > should already have enough space (enough OSDs) to mirror both pools, > so that would work. You can colocate MONs with OSDs so you don't need > additional hardware for that. > > Regards, > Eugen > > Quoting Da

[ceph-users] How to modify the destination pool name in the rbd-mirror configuration?

2024-11-26 Thread David Yang
Hello, everyone. Under normal circumstances, we synchronize from PoolA of ClusterA to PoolA of ClusterB (same pool name), which is also easy to configure. Now the requirements are as follows: ClusterA/Pool synchronizes to BackCluster/PoolA ClusterB/Pool synchronizes to BackCluster/PoolB After re

[ceph-users] How to synchronize pools with the same name in multiple clusters to multiple pools in one cluster

2024-11-26 Thread David Yang
We have ceph clusters in multiple regions to provide rbd services. We are currently preparing a remote backup plan, which is to synchronize pools with the same name in each region to different pools in one cluster. For example: Cluster A Pool synchronized to backup cluster poolA Cluster B Pool sy
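For context, a standard one-way rbd-mirror setup pairs a pool with the pool of the same name on the peer cluster; there is no built-in pool-rename mapping, which is what makes this requirement awkward. A rough sketch of the usual pairing (site names and pool are illustrative):

   # on the source cluster
   rbd mirror pool enable poolA image
   rbd mirror pool peer bootstrap create --site-name site-a poolA > token
   # on the backup cluster, against the identically named pool
   rbd mirror pool peer bootstrap import --site-name backup --direction rx-only poolA token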

[ceph-users] Re: no recovery running

2024-10-29 Thread David Turner
I was running into that as well. Setting `osd_mclock_override_recovery_settings` [1] to true allowed me to manage osd_max_backfills again and get recovery to start happening again. It's on my todo list to understand mclock profiles, but resizing PGs was a nightmare with it. Changing to override the
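The settings referred to above can be applied like this (the backfill value is just an example):

   # allow osd_max_backfills / recovery limits to override the mclock scheduler
   ceph config set osd osd_mclock_override_recovery_settings true
   ceph config set osd osd_max_backfills 2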

[ceph-users] Re: Ubuntu 24.02 LTS Ceph status warning

2024-10-16 Thread David Orman
de of seemingly unrelated issues due to that behavior with AppArmor on Ubuntu 24.04. David On Thu, Oct 10, 2024, at 02:51, Dominique Ramaekers wrote: > I manage a 4-host cluster on Ubuntu 22.04 LTS with ceph installed > through cephadm and containers on Docker. > > Last month, I've migra

[ceph-users] Re: cephadm crush_device_class not applied

2024-10-03 Thread David Orman
I'm not sure, but that's going to break with a lot of people's Pacific specifications when they upgrade. We heavily utilize this functionality, and use different device class names for a lot of good reasons. This seems like a regression to me. David On Thu, Oct 3, 2024, at 16:
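For reference, the field in question lives in the cephadm OSD service spec; a rough sketch (service id, class name and device filter are made up, and the exact placement of crush_device_class has shifted between releases, which is part of the issue discussed here):

   service_type: osd
   service_id: fast-devices
   placement:
     host_pattern: '*'
   spec:
     data_devices:
       rotational: 0
     crush_device_class: nvme-meta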

[ceph-users] CLT meeting notes: Sep 09, 2024

2024-09-10 Thread David Orman
CLT discussion on Sep 09, 2024 19.2.0 release: * Cherry picked patch: https://github.com/ceph/ceph/pull/59492 * Approvals requested for re-runs CentOS Stream/distribution discussions ongoing * Significant implications in infrastructure for building/testing requiring ongoing discussions/work to d

[ceph-users] Re: Prefered distro for Ceph

2024-09-05 Thread David Orman
's been one of the best decisions I've ever made. David On Thu, Sep 5, 2024, at 10:09, Albert Shih wrote: > On 05/09/2024 at 11:06:27-0400, Anthony D'Atri wrote > > Hi, > >> The bare metal has to run *something*, whether Ceph is run from packages or >> containers

[ceph-users] Re: Pull failed on cluster upgrade

2024-08-06 Thread David Orman
What operating system/distribution are you running? What hardware? David On Tue, Aug 6, 2024, at 02:20, Nicola Mori wrote: > I think I found the problem. Setting the cephadm log level to debug and > then watching the logs during the upgrade: > >ceph config set mgr

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-18 Thread David C.
", "addr": "(IP):6789", "nonce": 0 }, { "type": "v2", "addr": "(IP):3300", "nonce": 0 } ] } After osd restart : [aevoo-test - ceph-0]# ceph report |jq '.osdmap.osds

[ceph-users] Re: Small issue with perms

2024-07-18 Thread David C.
Thanks Christian, I see the fix is in the postinst, so probably the reboot shouldn't put "nobody" back, right? On Thu, Jul 18, 2024 at 11:44, Christian Rohmann < christian.rohm...@inovex.de> wrote: > On 18.07.24 9:56 AM, Albert Shih wrote: > >Error scraping /var/lib/ceph/crash: [Errno 13

[ceph-users] Re: Small issue with perms

2024-07-18 Thread David C.
h wrote > > On 18/07/2024 at 10:56:33+0200, David C. wrote > > > Hi, > > > > > > Your ceph processes are in containers. > > > > Yes I know but in my install process I just install > > > > ceph-common > > ceph-base > > > >

[ceph-users] Re: Small issue with perms

2024-07-18 Thread David C.
Your ceph processes are in containers. You don't need the ceph-* packages on the host running the containers. Regards, *David CASIER* *Direct line: +33(0) 9 72 61

[ceph-users] Re: Small issue with perms

2024-07-18 Thread David C.
Hi Albert, perhaps a conflict with the udev rules of locally installed packages. Try uninstalling ceph-*. On Thu, Jul 18, 2024 at 09:57, Albert Shih wrote: > Hi everyone. > > After my upgrade from 17.2.7 to 18.2.2 I notice that each time I restart I > get an issue with perms on > > /var/lib

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-17 Thread David C.
Hi, It would seem that the order of declaration of the mon addresses (v2 then v1, and not the other way around) is important. Albert restarted all services after this modification and everything is back to normal. On Wed, Jul 17, 2024 at 09:40, David C. wrote: > Hi Frédéric, > > The
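Concretely, that means listing the v2 endpoint first in each mon's entry, e.g. in a client's ceph.conf (addresses are placeholders):

   [global]
       mon_host = [v2:192.168.1.11:3300,v1:192.168.1.11:6789] [v2:192.168.1.12:3300,v1:192.168.1.12:6789] [v2:192.168.1.13:3300,v1:192.168.1.13:6789]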

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-17 Thread David C.
er of declaration of mon_host (v1 before v2) but apparently, that's not it. On Wed, Jul 17, 2024 at 09:21, Frédéric Nass wrote: > Hi David, > > Redeploying 2 out of 3 MONs a few weeks back (to have them using RocksDB > to be ready for Quincy) prevented some clients from connectin

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-16 Thread David C.
h wrote: > On 16/07/2024 at 15:04:05+0200, David C. wrote > Hi, > > > > > I think it's related to your network change. > > I thought about it, but in that case why does the old (and pre-upgrade) server > work? > > > Can you send me the return of "

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-16 Thread David C.
Hi Albert, I think it's related to your network change. Can you send me the output of "ceph report"? On Tue, Jul 16, 2024 at 14:34, Albert Shih wrote: > Hi everyone > > My ceph cluster currently runs 18.2.2 and ceph -s says everything is OK > > root@cthulhu1:/var/lib/ceph/crash# ceph -s >

[ceph-users] pg deep-scrub control scheme

2024-06-26 Thread David Yang
Hello everyone. I have a cluster with 8321 pgs and recently I started to get pg not deep-scrub warnings. The reason is that I reduced max_scrub to avoid the impact of scrub on IO. Here is my current scrub configuration: ~]# ceph tell osd.1 config show|grep scrub "mds_max_scrub_ops_in_progress":
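For reference, the scrub throttles usually involved look like this (values are only an example, not a recommendation):

   # at most one scrub per OSD at a time
   ceph config set osd osd_max_scrubs 1
   # spread deep scrubs over a longer window (seconds; 2 weeks here)
   ceph config set osd osd_deep_scrub_interval 1209600
   # skip new scrubs when the host is already busy
   ceph config set osd osd_scrub_load_threshold 0.5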

[ceph-users] Re: wrong public_ip after blackout / poweroutage

2024-06-21 Thread David C.
Hi, This type of incident is often resolved by setting the public_network option at the "global" scope in the configuration: ceph config set global public_network a:b:c:d::/64 On Fri, Jun 21, 2024 at 03:36, Eugen Block wrote: > Hi, > > this is only a theory, not a proven answer or something.

[ceph-users] Re: Incomplete PGs. Ceph Consultant Wanted

2024-06-17 Thread David C.
In Pablo's unfortunate case, it was because of a SAN incident, so it's possible that replica 3 didn't save him. In this scenario, the architecture is more the origin of the incident than the number of replicas. It seems to me that replica 3 has existed, by default, since firefly => making replica 2,

[ceph-users] Re: Incomplete PGs. Ceph Consultant Wanted

2024-06-17 Thread David C.
>> On Mon, Jun 17, 2024 at 11:46 AM Matthias Grandl < >> matthias.gra...@croit.io> wrote: >> >>> We are missing info here. Ceph status claims all OSDs are up. Did an OSD >>> die and was it already removed from the CRUSH map? If so the only chance I >>

[ceph-users] Re: Incomplete PGs. Ceph Consultant Wanted

2024-06-17 Thread David C.
ools min_size. >> > >> > If it is an EC setup, it might be quite a bit more painful, depending >> on what happened to the dead OSDs and whether they are at all recoverable. >> > >> > >> > Matthi

[ceph-users] Re: Incomplete PGs. Ceph Consultant Wanted

2024-06-17 Thread David C.
Hi Pablo, Could you tell us a little more about how that happened? Do you have a min_size >= 2 (or the EC equivalent)? Regards, *David CASIER* On Mon, Jun 17, 2024 at 16

[ceph-users] Re: Ceph crash :-(

2024-06-13 Thread David C.
In addition to Robert's recommendations, remember to respect the upgrade order (mgr->mon->(crash->)osd->mds->...). Before everything was containerized, it was not recommended to have different services on the same machine. On Thu, Jun 13, 2024 at 19:37, Robert Sander wrote: > On 13.06.24 18:

[ceph-users] Re: [SPAM] Re: Ceph crash :-(

2024-06-13 Thread David C.
> personal desktop, but on servers where I keep data I’m doing it. > > but what canonical did in this case is… this is an LTS version :/ > > > > > > BR, > > Sebastian > > > > > >> On 13 Jun 2024, at 19:47, David C. wrote: > >> > >

[ceph-users] Re: Recoveries without any misplaced objects?

2024-04-24 Thread David Orman
It is RGW, but the index is on a different pool. Not seeing any key/s being reported in recovery. We've definitely had OSDs flap multiple times. David On Wed, Apr 24, 2024, at 16:48, Anthony D'Atri wrote: > Do you see *keys* aka omap traffic? Especially if you have RGW set up?

[ceph-users] Re: Recoveries without any misplaced objects?

2024-04-24 Thread David Orman
Did you ever figure out what was happening here? David On Mon, May 29, 2023, at 07:16, Hector Martin wrote: > On 29/05/2023 20.55, Anthony D'Atri wrote: >> Check the uptime for the OSDs in question > > I restarted all my OSDs within the past 10 days or so. Maybe OSD >

[ceph-users] Re: Stuck in replay?

2024-04-23 Thread David Yang
Hi Erich When mds cache usage is very high, recovery is very slow. So I use this command to drop the mds cache: ceph tell mds.* cache drop 600 Lars Köppel wrote on Tue, Apr 23, 2024 at 16:36: > > Hi Erich, > > great that you recovered from this. > It sounds like you had the same problem I had a few months ago. > mds cr

[ceph-users] Re: DB/WALL and RGW index on the same NVME

2024-04-08 Thread David Orman
I would suggest that you might consider EC vs. replication for index data, and the latency implications. There's more than just the nvme vs. rotational discussion to entertain, especially if using the more widely spread EC modes like 8+3. It would be worth testing for your particular workload.

[ceph-users] Re: Impact of Slow OPS?

2024-04-06 Thread David C.
rasure code) *David* On Fri, Apr 5, 2024 at 19:42, adam.ther wrote: > Hello, > > Do slow ops impact data integrity or can I generally ignore them? I'm > loading 3 hosts with a 10GB link and it is saturating the disks or the OSDs. > > 2024-04-05T15:33:10.625922+ mon.CEPHADM

[ceph-users] Re: Erasure Code with Autoscaler and Backfill_toofull

2024-03-27 Thread David C.
Hi Daniel, Changing pg_num when some OSDs are almost full is not a good strategy (it can even be dangerous). What is causing this backfilling? Loss of an OSD? The balancer? Something else? What is the usage of the least busy OSD (sort -nrk17)? Is the balancer activated? (upmap?) Once the situation stabilizes, it becomes i
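The check mentioned above ("sort -nrk17") is usually run against ceph osd df, where field 17 is %USE in recent releases (worth double-checking on your version):

   # most-full OSDs first; the least-used ones end up at the bottom
   ceph osd df | sort -nrk17 | head
   ceph balancer status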

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-26 Thread David Yang
This is great, we are currently using the smb protocol heavily to export kernel-mounted cephfs. But I encountered a problem. When there are many smb clients enumerating or listing the same directory, the smb server will experience high load, and the smb process will end up in D state. This problem has

[ceph-users] Re: Clients failing to advance oldest client?

2024-03-25 Thread David Yang
You can use the "ceph health detail" command to see which clients are not responding. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: Clients failing to advance oldest client?

2024-03-25 Thread David Yang
It is recommended to disconnect the client first and then observe whether the cluster's slow requests recover. Erich Weiler wrote on Tue, Mar 26, 2024 at 05:02: > > Hi Y'all, > > I'm seeing this warning via 'ceph -s' (this is on Reef): > > # ceph -s >cluster: > id: 58bde08a-d7ed-11ee-9098-506b4b4d

[ceph-users] All MGR loop crash

2024-03-07 Thread David C.
> mon weight myself, do you know how that happened? > > Quoting "David C.": > > Ok, got it: >> >> [root@pprod-admin:/var/lib/ceph/]# ceph mon dump -f json-pretty >> |egrep "name|weigh" >> dumped monmap epoch 14 >>

[ceph-users] Re: All MGR loop crash

2024-03-07 Thread David C.
;name": "pprod-mon3", "weight": 10, "name": "pprod-osd2", "weight": 0, "name": "pprod-osd1", "weight": 0, "name": "pprod-osd

[ceph-users] Re: All MGR loop crash

2024-03-07 Thread David C.
I took the wrong line => https://github.com/ceph/ceph/blob/v17.2.6/src/mon/MonClient.cc#L822 On Thu, Mar 7, 2024 at 18:21, David C. wrote: > > Hello everybody, > > I'm encountering strange behavior on an infrastructure (it's > pre-production but it's very

[ceph-users] All MGR loop crash

2024-03-07 Thread David C.
Hello everybody, I'm encountering strange behavior on an infrastructure (it's pre-production but it's very ugly). After a "drain" on monitor (and a manager). MGRs all crash on startup: Mar 07 17:06:47 pprod-mon1 ceph-mgr[564045]: mgr ms_dispatch2 standby mgrmap(e 1310) v1 Mar 07 17:06:47 pprod-mo

[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-02 Thread David C.
user IO. Keep an eye on your > discards being sent to devices and the discard latency, as well (via > node_exporter, for example). > > Matt > > > On 2024-03-02 06:18, David C. wrote: > > I came across an enterprise NVMe used for BlueFS DB whose performance > > dr

[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-02 Thread David C.
I came across an enterprise NVMe used for BlueFS DB whose performance dropped sharply after a few months of delivery (I won't mention the brand here but it was not among these 3: Intel, Samsung, Micron). It is clear that enabling bdev_enable_discard impacted performance, but this option also saved
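The setting being discussed is a BlueStore/OSD option; a sketch of how it is typically toggled (whether an OSD restart is needed and whether async discard is available depends on the release):

   ceph config set osd bdev_enable_discard true
   # some releases also offer asynchronous discards to soften the latency impact
   ceph config set osd bdev_async_discard true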

[ceph-users] Re: Seperate metadata pool in 3x MDS node

2024-02-24 Thread David C.
Hello, Does each rack work on different trees, or is everything parallelized? Would the meta pools be distributed over racks 1,2,4,5? If they are distributed, even if the addressed MDS is on the same switch as the client, that MDS will still consult/write the (nvme) OSDs in the other ra

[ceph-users] Re: [Urgent] Ceph system Down, Ceph FS volume in recovering

2024-02-24 Thread David C.
re not comfortable I am making the simplest one on the ceph side, but not the most comfortable on the business side. Is there no one (in Vietnam?) who could help you more seriously (and in a lasting way)? On Sat, Feb 24, 2024 at 15:55, wrote: > Hi David, > > I'll follow your sug

[ceph-users] Re: [Urgent] Ceph system Down, Ceph FS volume in recovering

2024-02-24 Thread David C.
Is it possible for you to stop/unmount the cephfs clients? If so, do that and restart the MDS. It should come back up. Then have the clients restart one by one and check that the MDS does not crash (by monitoring the logs). Regards, *David

[ceph-users] Re: [Urgent] Ceph system Down, Ceph FS volume in recovering

2024-02-23 Thread David C.
look at ALL cephfs kernel clients (no effect on RGW) On Fri, Feb 23, 2024 at 16:38, wrote: > And we don't have the parameters folder > > cd /sys/module/ceph/ > [root@cephgw01 ceph]# ls > coresize holders initsize initstate notes refcnt rhelversion > sections srcversion taint uevent > > My

[ceph-users] Re: [Urgent] Ceph system Down, Ceph FS volume in recovering

2024-02-23 Thread David C.
Hi, The problem seems to come from the clients (reconnect). Test by disabling metrics on all clients: echo Y > /sys/module/ceph/parameters/disable_send_metrics Regards, *David CAS

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-07 Thread David Orman
That tracker's last update indicates it's slated for inclusion. On Thu, Feb 1, 2024, at 10:47, Zakhar Kirpichenko wrote: > Hi, > > Please consider not leaving this behind: > https://github.com/ceph/ceph/pull/55109 > > It's a serious bug, which potentially affects a whole node stability if > the

[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-05 Thread David Orman
large upstream project). I don't pretend to know the complexities around an alternative implementation, but it seems worth at least a cursory investigation, as behavior right now (prior to the blocking change) may be somewhat undefined even if not throwing errors, according to the above

[ceph-users] Re: How check local network

2024-01-29 Thread David C.
Hello Albert, this should return the sockets used on the cluster network: ceph report | jq '.osdmap.osds[] | .cluster_addrs.addrvec[] | .addr' Regards, *Da

[ceph-users] Re: cephadm discovery service certificate absent after upgrade.

2024-01-25 Thread David C.
It would be cool, actually, to have the metrics working in 18.2.2 for IPv6-only setups. Otherwise, everything works fine on my side. Regards, *David CASIER* On Thu, Jan 25, 2024 at

[ceph-users] Re: Stupid question about ceph fs volume

2024-01-25 Thread David C.
risk if misused. Regards, *David CASIER* On Thu, Jan 25, 2024 at 14:45, Eugen Block wrote: > Oh right, I forgot about that, good point! But if that is (still) true >

[ceph-users] Re: Stupid question about ceph fs volume

2024-01-25 Thread David C.
In case the root is EC, it is likely not possible to apply the disaster recovery procedure (no layout/parent xattrs on the data pool). Regards, *David CASIER* On Thu
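The usual pattern is to keep the default (root) data pool replicated and attach the EC pool as an additional data pool, roughly like this (pool, profile and fs names are illustrative):

   ceph osd pool create cephfs_data_ec 64 erasure my-ec-profile
   ceph osd pool set cephfs_data_ec allow_ec_overwrites true
   ceph fs add_data_pool myfs cephfs_data_ec
   # then point a directory tree at the EC pool via a file layout
   setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/myfs/bulk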

[ceph-users] Re: Stupid question about ceph fs volume

2024-01-25 Thread David C.
Albert, I have never used EC for the (root) data pool. On Thu, Jan 25, 2024 at 12:08, Albert Shih wrote: > On 25/01/2024 at 08:42:19+, Eugen Block wrote > > Hi, > > > > it's really as easy as it sounds (fresh test cluster on 18.2.1 without > any > > pools yet): > > > > ceph:~ # ceph fs volume creat

[ceph-users] Re: Questions about the CRUSH details

2024-01-24 Thread David C.
.) Regards, *David CASIER* On Wed, Jan 24, 2024 at 17:49, Henry lol wrote: > Hello, I'm new to ceph and sorry in advance for the naive questions. > > 1. > As far as I know, CRUSH utilize

[ceph-users] Re: How many pool for cephfs

2024-01-24 Thread David C.
per OSD, balancer, pg autoscaling, etc.) Regards, *David CASIER* On Wed, Jan 24, 2024 at 10:10, Albert Shih wrote: > On 24/01/2024 at 09:45:56+0100, Robert Sander wrote

[ceph-users] Re: cephadm discovery service certificate absent after upgrade.

2024-01-23 Thread David C.
According to the source code, the certificates are generated automatically at startup. Hence my question whether the service started correctly. I also had problems with IPv6-only, but I don't immediately have more info. Regards, *David C

[ceph-users] Re: cephadm discovery service certificate absent after upgrade.

2024-01-23 Thread David C.
Is the cephadm http server service starting correctly (in the mgr logs)? IPv6? Regards, *David CASIER* On Tue, Jan 23, 2024 at 16:29, Nicolas FOURNIL wrote: > He

[ceph-users] Re: cephadm discovery service certificate absent after upgrade.

2024-01-23 Thread David C.
/ceph/blob/main/src/pybind/mgr/cephadm/templates/services/prometheus/prometheus.yml.j2 Regards, *David CASIER* On Tue, Jan 23, 2024 at 10:56, Nicolas FOURNIL wrote:

[ceph-users] Re: RFI: Prometheus, Etc, Services - Optimum Number To Run

2024-01-21 Thread David Orman
e your LGTM (or other choice) externally. Cheers, David On Fri, Jan 19, 2024, at 23:42, duluxoz wrote: > Hi All, > > In regards to the monitoring services on a Ceph Cluster (ie Prometheus, > Grafana, Alertmanager, Loki, Node-Exported, Promtail, etc) how many > instances should/

[ceph-users] Re: About ceph disk slowops effect to cluster

2024-01-09 Thread David Yang
The 2*10Gbps shared network seems to be full (1.9GB/s). Is it possible to reduce part of the workload and wait for the cluster to return to a healthy state? Tip: Erasure coding needs to collect all data blocks when recovering data, so it takes up a lot of network card bandwidth and processor resour

[ceph-users] CLT Meeting Minutes 2024-01-03

2024-01-03 Thread David Orman
Happy 2024! Today's CLT meeting covered the following: 1. 2024 brings a focus on performance of Crimson (some information here: https://docs.ceph.com/en/reef/dev/crimson/crimson/ ) 1. Status is available here: https://github.com/ceph/ceph.io/pull/635 2. There will be a new Crimson perform

[ceph-users] Re: mds generates slow request: peer_request, how to deal with it?

2024-01-01 Thread David Yang
ke wrote on Sun, Dec 31, 2023 at 16:57: > > Hi David, > > What does your filesystem look like? We have a few folders with a lot of > subfolders, which are all randomly accessed. And I guess the balancer is > moving a lot of folders between the mds nodes. > We noticed that multiple active

[ceph-users] mds generates slow request: peer_request, how to deal with it?

2023-12-31 Thread David Yang
I hope this message finds you well. I have a cephfs cluster with 3 active mds, and use a 3-node samba setup to export it through the kernel client. Currently, there are 2 mds nodes experiencing slow requests. We have tried restarting the mds. After a few hours, the replay log status became active. But the slow requ

[ceph-users] Re: FS down - mds degraded

2023-12-20 Thread David C.
Hi Sake, I would start by decrementing max_mds by 1: ceph fs set atlassian-prod max_mds 2 Does mds.1 no longer restart? Any logs? On Thu, Dec 21, 2023 at 08:11, Sake Ceph wrote: > Starting a new thread, forgot the subject in the previous one. > So our FS is down. Got the following error, what can I do?

[ceph-users] Re: Osd full

2023-12-11 Thread David C.
Hi Mohamed, Changing weights is no longer a good practice; the balancer is supposed to do that job. The number of PGs per OSD is really tight on your infrastructure. Can you show the output of the ceph osd tree command? Regards, *David CASIER
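The balancer state referred to above can be checked (and enabled in upmap mode) like this:

   ceph balancer status
   ceph balancer mode upmap
   ceph balancer on
   # and the output being asked for
   ceph osd tree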

[ceph-users] Re: EC Profiles & DR

2023-12-05 Thread David C.
/painful to change profiles later (it needs data migration). Regards, *David CASIER* On Tue, Dec 5, 2023 at 12:35, Patrick Begou < patrick.be...@univ-grenoble-alpes.fr> wrote

[ceph-users] Re: EC Profiles & DR

2023-12-05 Thread David C.
). Regards, *David CASIER* On Tue, Dec 5, 2023 at 11:17, Patrick Begou < patrick.be...@univ-grenoble-alpes.fr> wrote: > Hi Robert, > > On 05/12/2023 at 10:05

[ceph-users] Re: EC Profiles & DR

2023-12-05 Thread David C.
ain your cluster (update, etc.), it's dangerous. Unless you can afford to lose/rebuild your cluster, you should never have a min_size < 2. Regards, *David CASIER* On

[ceph-users] Re: EC Profiles & DR

2023-12-05 Thread David Rivera
The first problem here is that you are using crush-failure-domain=osd when you should use crush-failure-domain=host. With three hosts, you should use k=2, m=1; this is not recommended in a production environment. On Mon, Dec 4, 2023, 23:26 duluxoz wrote: > Hi All, > > Looking for some help/explanation aro
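A sketch of the profile being suggested (profile and pool names are placeholders; note that an existing pool cannot simply switch profiles, as mentioned earlier in the thread):

   ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host
   ceph osd pool create mypool-ec 64 erasure ec-2-1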

[ceph-users] Re: How to identify the index pool real usage?

2023-12-04 Thread David C.
great. Regards, *David CASIER* On Mon, Dec 4, 2023 at 10:14, Szabo, Istvan (Agoda) wrote: > These values shouldn't be true to be able to do trimming? > > "bde

[ceph-users] Re: How to identify the index pool real usage?

2023-12-04 Thread David C.
Hi, A flash system needs free space to work efficiently. Hence my hypothesis that fully allocated disks need to be notified of free blocks (trim). Regards, *David CASIER

[ceph-users] Re: How to identify the index pool real usage?

2023-12-01 Thread David C.
... Regards, *David CASIER* On Fri, Dec 1, 2023 at 16:15, Szabo, Istvan (Agoda) wrote: > Hi, > > Today we had a big issue with slow ops on the nvme drives which hold > the index pool. > >

[ceph-users] Re: Issue with CephFS (mds stuck in clientreplay status) since upgrade to 18.2.0.

2023-11-27 Thread David C.
for 2 MDS to be on the same host "monitor-02"? Regards, *David CASIER* On Mon, Nov 27, 2023 at 10:09, Lo Re Giuseppe wrote: > Hi, > We have upgraded

[ceph-users] Re: How to use hardware

2023-11-18 Thread David C.
Hello Albert, 5 vs 3 MON => you won't notice any difference 5 vs 3 MGR => by default, only 1 will be active On Sat, Nov 18, 2023 at 09:28, Albert Shih wrote: > On 17/11/2023 at 11:23:49+0100, David C. wrote > > Hi, > > > > > 5 instead of 3 mon will a

[ceph-users] Re: RadosGW public HA traffic - best practices?

2023-11-17 Thread David Orman
eouts will likely happen, so the impact won't be zero, but it also won't be catastrophic. David On Fri, Nov 17, 2023, at 10:09, David Orman wrote: > Use BGP/ECMP with something like exabgp on the haproxy servers. > > David > > On Fri, Nov 17, 2023, at 04:09, Boris Behren

[ceph-users] Re: RadosGW public HA traffic - best practices?

2023-11-17 Thread David Orman
Use BGP/ECMP with something like exabgp on the haproxy servers. David On Fri, Nov 17, 2023, at 04:09, Boris Behrens wrote: > Hi, > I am looking for some experience on how people make their RGW public. > > Currently we use the follow: > 3 IP addresses that get distributed via kee

[ceph-users] Re: cephadm user on cephadm rpm package

2023-11-17 Thread David C.
figure out how to enable cephadm's access to the > machines. > > Anyway, thanks for your reply. > > Luis Domingues > Proton AG > > > On Friday, 17 November 2023 at 13:55, David C. > wrote: > > > > Hi, > > > > You can use the cephadm account (i

[ceph-users] Re: cephadm user on cephadm rpm package

2023-11-17 Thread David C.
Hi, You can use the cephadm account (instead of root) to control machines with the orchestrator. On Fri, Nov 17, 2023 at 13:30, Luis Domingues wrote: > Hi, > > I noticed when installing the cephadm rpm package, to bootstrap a cluster > for example, that a user cephadm was created. But I do n

[ceph-users] Re: Problem while upgrade 17.2.6 to 17.2.7

2023-11-17 Thread David C.
On Fri, Nov 17, 2023 at 11:22, Jean-Marc FONTANA wrote: > Hello, everyone, > > There's no cephadm.log in /var/log/ceph. > > To get something else, we tried what David C. proposed (thanks to him !!) > and found: > > nov. 17 10:53:54 svtcephmonv3 ceph-mgr[727]:

[ceph-users] Re: How to use hardware

2023-11-17 Thread David C.
Hi Albert, 5 instead of 3 mons will allow you to limit the impact if you break a mon (for example, with the file system full). 5 instead of 3 MDS makes sense if the workload can be distributed over several trees in your file system. Sometimes it can also make sense to have several FSs in ord

[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread David C.
t for each rbd? > > Respectfully, > > *Wes Dillingham* > w...@wesdillingham.com > LinkedIn <http://www.linkedin.com/in/wesleydillingham> > > > On Wed, Nov 15, 2023 at 1:14 PM David C. wrote: > >> rbd create testpool/test3 --size=100M >> rbd snap limit set

[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread David C.
rbd create testpool/test3 --size=100M rbd snap limit set testpool/test3 --limit 3 On Wed, Nov 15, 2023 at 17:58, Wesley Dillingham wrote: > looking into how to limit snapshots at the ceph level for RBD snapshots. > Ideally ceph would enforce an arbitrary number of snapshots allowable per > rb

[ceph-users] Re: Problem while upgrade 17.2.6 to 17.2.7

2023-11-14 Thread David C.
nable_apis Regards, *David CASIER* On Tue, Nov 14, 2023 at 11:45, Jean-Marc FONTANA wrote: > Hello everyone, > > We operate two clusters that we installed with ceph-deploy in Nautilus > version on Debi

[ceph-users] Re: IO stalls when primary OSD device blocks in 17.2.6

2023-11-10 Thread David C.
Hi Daniel, it's perfectly normal for a PG to freeze when the primary OSD is not stable. It can sometimes happen that the disk fails but doesn't immediately send back I/O errors (which would crash the OSD). When the OSD is stopped, there's a 5-minute delay before it goes down in the crushmap. On Fri

[ceph-users] Re: Crush map & rule

2023-11-09 Thread David C.
ossible on this architecture. Regards, *David CASIER* On Thu, Nov 9, 2023 at 08:48, Albert Shih wrote: > On 08/11/2023 at 19:29:19+0100, David C. wrote > Hi David

[ceph-users] Re: Crush map & rule

2023-11-08 Thread David C.
Hi Albert, What would be the number of replicas (in total and on each row) and their distribution across the tree? On Wed, Nov 8, 2023 at 18:45, Albert Shih wrote: > Hi everyone, > > I'm a total newbie with ceph, so sorry if I'm asking some stupid questions. > > I'm trying to understand how the

[ceph-users] Re: HDD cache

2023-11-08 Thread David C.
Without a (raid/jbod) controller? On Wed, Nov 8, 2023 at 18:36, Peter wrote: > Hi All, > > I note that HDD cluster commit delay improves after I turn off the HDD cache. > However, I also note that not all HDDs are able to turn off the cache. > Specifically, I found that two HDDs with the same model number, on
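For reference, the cache toggle being discussed is usually the drive's volatile write cache, switched with hdparm on SATA drives (device path is an example; SAS drives typically use sdparm instead):

   # disable the write cache
   hdparm -W 0 /dev/sdb
   # check the current setting
   hdparm -W /dev/sdb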

[ceph-users] Re: 100.00 Usage for ssd-pool (maybe after: ceph osd crush move .. root=default)

2023-11-08 Thread David C.
so the next step is to place the pools on the right rule: ceph osd pool set db-pool crush_rule fc-r02-ssd On Wed, Nov 8, 2023 at 12:04, Denny Fuchs wrote: > hi, > > I forgot to write the command I used: > > = > ceph osd crush move fc-r02-ceph-osd-01 root=default > ceph osd crush

[ceph-users] Re: 100.00 Usage for ssd-pool (maybe after: ceph osd crush move .. root=default)

2023-11-08 Thread David C.
I've probably answered too quickly if the migration is complete and there are no incidents. Are the PGs active+clean? Regards, *David CASIER* On Wed, Nov 8, 2023 at

[ceph-users] Re: 100.00 Usage for ssd-pool (maybe after: ceph osd crush move .. root=default)

2023-11-08 Thread David C.
). Regards, *David CASIER* *Direct line: +33(0) 9 72 61 98 29* On Wed, Nov 8, 2023 at 11:27, Denny Fuchs wrote: > Hello, > > we upgraded to Quincy and tried to remove an obso

[ceph-users] Re: Ceph dashboard reports CephNodeNetworkPacketErrors

2023-11-07 Thread David C.
hecks Regards, *David CASIER* On Tue, Nov 7, 2023 at 11:20, Dominique Ramaekers < dominique.ramaek...@cometal.be> wrote: > Hi, > > I've been using Ceph on a 4-host cluster for a year now. I recen

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread David C.
Pro" range, but that's not the point), which have never been able to restart since the incident. Someone please correct me, but as far as I'm concerned, the cluster is lost. ____ Cordialement, *David CASIER* *Ligne

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread David C.
Hi Mohamed, I understand there's one operational monitor left, isn't there? If so, you need to reprovision the other monitors with an empty store so that they synchronize with the only remaining monitor. Regards, *Da

[ceph-users] Re: radosgw - octopus - 500 Bad file descriptor on upload

2023-10-25 Thread David C.
Hi Hubert, It's error "125" (ECANCELED) (and there may be many reasons for it). I see a high latency (144 sec); is the object big? No network problems? Regards,

[ceph-users] Re: Quincy: failure to enable mgr rgw module if not --force

2023-10-24 Thread David C.
Correction, it's not so new but doesn't seem to be maintained: https://github.com/ceph/ceph/commits/v17.2.6/src/pybind/mgr/rgw Regards, *David CASIER* On

[ceph-users] Re: Quincy: failure to enable mgr rgw module if not --force

2023-10-24 Thread David C.
case, if you're not using multisite, you shouldn't need this module to access the RGW functionality in the dashboard. ____ Regards, *David CASIER* On Tue, Oct 24, 2023 at

[ceph-users] Re: fixing future rctime

2023-10-20 Thread David C.
(Re) Hi Arnaud, Work by Mouratidis Theofilos (closed/not merged): https://github.com/ceph/ceph/pull/37938 Maybe ask him if he found a trick. Regards, *David CASIER* On Fri
