[ceph-users] Re: Rocky8 (el8) client for squid 19.2.2

2025-07-22 Thread Burkhard Linke
Hi, *snipsnap* But I still don't seem to be able to mount a CephFS volume: mount: /mnt/web: wrong fs type, bad option, bad superblock on ceph01,ceph02:/volumes/web/www/457c578c-95a1-4f28-aafa-d2c7e9603042, missing codepage or helper program, or other error. The program mount.ceph is not ins
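
The mount error above usually just means the mount.ceph helper is missing; on EL8 it ships in the ceph-common package. A minimal sketch, assuming a repository providing el8 client packages is configured and that the client name "web" and its secret file are placeholders:

# dnf install ceph-common
# mount -t ceph ceph01,ceph02:/volumes/web/www/457c578c-95a1-4f28-aafa-d2c7e9603042 /mnt/web -o name=web,secretfile=/etc/ceph/web.secret

The kernel client does the actual work, so an older el8 ceph-common is typically sufficient for mounting a filesystem served by a squid cluster.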

[ceph-users] Re: 2025-Q3: Stable release recommendation for production clusters

2025-07-09 Thread Burkhard Linke
Hi, On 09.07.25 21:21, Özkan Göksu wrote: Hello Wesley. Thank you for the warning. I'm aware of this and even with the recommended upgrade path it is not easy or safe for complicated clusters like mine. I have billions of small s3 objects, versions, indexes etc. With each new Ceph release, the

[ceph-users] Re: CephFS with Ldap

2025-06-30 Thread Burkhard Linke
maybe even directly. Best regards, Burkhard Linke Please advise. Thanks, Gagan

[ceph-users] Re: First time configuration advice

2025-06-24 Thread Burkhard Linke
Hi and welcome to ceph, On 23.06.25 22:37, Ryan Sleeth wrote: I am setting up my first cluster of 9-nodes each with 8x 20T HDDs and 2x 2T NVMes. I plan to partition the NVMes into 5x 300G so that one partition can be used by cephfs_metadata (SSD only), while the other 4x partitions will be paire

[ceph-users] Re: CephFS mirroring alternative?

2025-06-13 Thread Burkhard Linke
Hi, On 12.06.25 21:58, Daniel Vogelbacher wrote: Hi Eric, On 6/12/25 17:33, Eric Le Lay wrote: I use rsync to copy data (~10TB) to backup storage. To speed things up I use the ceph.dir.rctime extended attribute to instantly ignore sub-trees that haven't changed without iterating through the
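
The rctime trick can be scripted with getfattr; a rough sketch, where /cephfs/data, backup-host and the bookkeeping file /var/backup/last_run are assumptions:

last_run=$(cat /var/backup/last_run)                      # epoch seconds of the previous successful run
for d in /cephfs/data/*/; do
    # ceph.dir.rctime = newest ctime anywhere below the directory, as "seconds.nanoseconds"
    rctime=$(getfattr -n ceph.dir.rctime --only-values "$d" | cut -d. -f1)
    if [ "$rctime" -gt "$last_run" ]; then
        rsync -a "$d" backup-host:/backup/"$(basename "$d")"/
    fi
done
date +%s > /var/backup/last_run

Only subtrees whose recursive ctime is newer than the last run are copied, which is what makes the rsync approach fast on large CephFS trees.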

[ceph-users] Re: [Ceph-announce] v18.2.5 Reef released

2025-04-14 Thread Burkhard Linke
Hi, On 4/11/25 18:33, Stephan Hohn wrote: - cryptsetup version check isn't working at least in the container image of v18.2.5 ( https://github.com/ceph/ceph/blob/reef/src/ceph-volume/ceph_volume/util/encryption.py) which leads to encrypted osds not starting due to "'Error while checking cryptset

[ceph-users] Re: [Ceph-announce] v18.2.5 Reef released

2025-04-14 Thread Burkhard Linke
on Debian 12, only public network for all services. I'll update the remaining services and afterwards run some tests on the single machine for further investigations. Best regards, Burkhard Linke

[ceph-users] Re: endless remapping after increasing number of PG in a pool

2025-04-01 Thread Burkhard Linke
thing that may help? Remapping PGs has a higher priority than scrubbing. So as long as the current pool extension is not finished, only idle OSDs will scrub their PGs. This is expected, and the cluster will take care for the missing scrub runs after it is healthy again. Best regards, Burkhard

[ceph-users] Re: endless remapping after increasing number of PG in a pool

2025-04-01 Thread Burkhard Linke
oth should be extended at the same time or it not causing the expected result? If it is the case, should I just reenter the command to extend pg_num and pgp_num? (and wait for the resulting remapping!) In current Ceph releases only pg_num can be changed; pgp_num is adjusted automatically.
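
In practice that is a single command per pool; a sketch with a hypothetical pool name:

# ceph osd pool set mypool pg_num 4096

Since Nautilus the monitors step pg_num and pgp_num towards the new target gradually; progress is visible in "ceph osd pool ls detail" and "ceph -s".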

[ceph-users] Re: Unintuitive (buggy?) CephFS behaviour when dealing with pool_namespace layout attribute

2025-03-05 Thread Burkhard Linke
by moving The last time we wanted to perform a large scale pool change we had to copy each file / directory to actually move the data. Best regards, Burkhard Linke

[ceph-users] Re: How to reduce CephFS num_strays effectively?

2025-02-18 Thread Burkhard Linke
reintegrate these strays e.g. by running a recursive find on the filesystem or by scrubbing it (https://docs.ceph.com/en/latest/cephfs/scrub/). I would recommend upgrading before running any of these checks. Best regards, Burkhard Linke
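
The two options mentioned, sketched with a hypothetical filesystem name and mount point:

# ceph tell mds.cephfs:0 scrub start / recursive

or, more crudely, walk the whole tree from a client so every dentry gets touched:

# find /mnt/cephfs -ls > /dev/null

The stray count can be watched via the MDS perf counters (ceph tell mds.<id> perf dump | grep num_strays).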

[ceph-users] Re: CephFS: EC pool with "leftover" objects

2025-01-29 Thread Burkhard Linke
h information, or use 'ceph-dencoder' to decode it (not sure about the exact parameters for this). Best regards, Burkhard Linke

[ceph-users] Re: Many misplaced PG's, full OSD's and a good amount of manual intervention to keep my Ceph cluster alive.

2025-01-04 Thread Burkhard Linke
Hi, your cephfs.cephfs01.data pool currently has 144 PGs. So this pool seems to be resizing, e.g. from 128 PGs to 256 PGs. Do you use the autoscaler or did you trigger a manual PG increment of the pool? You can check this with the output of "ceph osd pool ls detail". It shows the current an
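
Two commands that expose this, using the pool name from the thread:

# ceph osd pool ls detail | grep cephfs01.data
# ceph osd pool autoscale-status

While a split is in progress the detail output (on recent releases) shows both the current pg_num/pgp_num and the target the cluster is working towards; autoscale-status tells you whether the autoscaler set that target.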

[ceph-users] Re: MONs not trimming

2024-12-17 Thread Burkhard Linke
Hi, On 17.12.24 14:40, Janek Bevendorff wrote: Hi all, We moved our Ceph cluster to a new data centre about three months ago, which completely changed its physical topology. I changed the CRUSH map accordingly so that the CRUSH location matches the physical location again and the cluster has

[ceph-users] Re: Ceph native clients

2024-10-28 Thread Burkhard Linke
e same problem with our desktops a while ago, and decided to switch to a NFS re-export of the CephFS filesystem. This has proven to be much more reliable in case of hibernation. But as you already mentioned NFS also has other problems... Regards, Burkhard Linke

[ceph-users] Re: Backup strategies for rgw s3

2024-09-25 Thread Burkhard Linke
t it might be a good starting point. Best regards, Burkhard Linke

[ceph-users] cephfs +inotify = caps problem?

2024-09-25 Thread Burkhard Linke
st as root only reports very few open files (<50). So inotify seems to be responsible for the massive caps build up. Terminating VSC results in a sharp drop of the caps  (just a few open files / directories left afterwards). Is this a known problem? Best regards, Burkha

[ceph-users] Debian package for 18.2.4 broken

2024-08-01 Thread Burkhard Linke
Hi, the Debian bookworm packages for 18.2.4 are completely broken and unusable: 1. missing dependencies, e.g. python3-packaging: root@cc-r3-ceph-2:~# ceph-volume lvm list Traceback (most recent call last):   File "/usr/sbin/ceph-volume", line 33, in     sys.exit(load_entry_point('ceph-volume=
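
A possible stop-gap until fixed packages are published, assuming python3-packaging is the only missing dependency as in the traceback above:

# apt install python3-packaging
# ceph-volume lvm list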

[ceph-users] Converting/Migrating EC pool to a replicated pool

2024-07-15 Thread Burkhard Linke
ement set, disable access to a bucket, move all objects belonging to that bucket from one pool to the other (identifiable via the bucket's marker), change the placement setting of the bucket and reenable access. Unfortunately I didn't find a good method to temporarily disable bu

[ceph-users] Re: Cephfs over internet

2024-05-21 Thread Burkhard Linke
Hi, On 5/21/24 13:39, Marcus wrote: Thanks for your answers! I read somewhere that a vpn would really have an impact on performance, so it was not recommended, and I found v2 protocol. But vpn feels like the solution and you have to accept the lower speed. Also keep in mind that clients hav

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Burkhard Linke
Hi, we have similar problems from time to time. Running Reef on servers and latest ubuntu 20.04 hwe kernel on the clients. There are probably two scenarios with slightly different observations: 1. MDS reports slow ops Some client is holding caps for a certain file / directory and blocks o

[ceph-users] Re: Robust cephfs design/best practice

2024-03-15 Thread Burkhard Linke
t you need to know your current and future workloads to configure it accordingly. This is also true for any other shared filesystem. Best regards, Burkhard Linke

[ceph-users] Re: Full cluster outage when ECONNREFUSED is triggered

2023-11-24 Thread Burkhard Linke
osd_fast_fail_on_connection_refused<https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_fast_fail_on_connection_refused>=true changes this behaviour. Best regards, Burkhard Linke On 24.11.23 11:09, Janne Johansson wrote: Den fre 24 nov. 2023 kl 10:25 skrev Frank Schilder: Hi Den
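
The option is an ordinary OSD config value and can be inspected or changed at runtime; which value is appropriate depends on the failure scenario discussed in this thread:

# ceph config get osd osd_fast_fail_on_connection_refused
# ceph config set osd osd_fast_fail_on_connection_refused true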

[ceph-users] Re: Pacific 16.2.14 debian Incomplete

2023-08-30 Thread Burkhard Linke
Hi, On 8/30/23 18:26, Yuri Weinstein wrote: 16.2.14 has not been released yet. Please don't do any upgrades before we send an announcement email. Then stop pushing packets before the announcement. This is not the first time this problem occurred. And given your answer I'm afraid it won't be

[ceph-users] Re: Deleting millions of objects

2023-05-17 Thread Burkhard Linke
removing large buckets, but I'm not sure how it actually works if you only want to remove some objects. Regards, Burkhard Linke

[ceph-users] CEPH Mirrors are lacking packages

2023-04-17 Thread Burkhard Linke
Hi, at least eu.ceph.com and de.ceph.com are lacking packages for the pacific release. All packages not starting with "c" (e.g. librbd, librados, radosgw) are missing. Best regards, Burkhard Linke

[ceph-users] Re: Live migrate RBD image with a client using it

2023-04-13 Thread Burkhard Linke
s block migration within qemu itself. Proxmox is using this for storage migrations. Works quite well within proxmox (btw thx for the great software), but I haven't done it manually yet. YMMV. Best regards, Burkhard Linke

[ceph-users] Re: Newer linux kernel cephfs clients is more trouble?

2022-12-09 Thread Burkhard Linke
Hi, I would like to add a datapoint. I rebooted one of our client machines into kernel 5.4.0-135-generic (latest ubuntu 20.04 non hwe kernel) and performed the same test (copying a large file within cephfs). Both the source and target files stay in cache completely: # fincore bar   RES   PA

[ceph-users] Re: Newer linux kernel cephfs clients is more trouble?

2022-12-09 Thread Burkhard Linke
Hi, On 07.12.22 11:58, Stefan Kooman wrote: On 5/13/22 09:38, Xiubo Li wrote: On 5/12/22 12:06 AM, Stefan Kooman wrote: Hi List, We have quite a few linux kernel clients for CephFS. One of our customers has been running mainline kernels (CentOS 7 elrepo) for the past two years. They starte

[ceph-users] Re: [PHISHING VERDACHT] ceph is stuck after increasing pg_nums

2022-11-04 Thread Burkhard Linke
Hi, On 11/4/22 09:45, Adrian Nicolae wrote: Hi, We have a Pacific cluster (16.2.4) with 30 servers and 30 osds. We started to increase the pg_num for the data bucket for more than a month, I usually added 64 pgs in every step I didn't have any issue. The cluster was healthy before increasing

[ceph-users] Re: How does client get the new active ceph-mgr endpoint when failover happens?

2022-10-06 Thread Burkhard Linke
gards, Burkhard Linke

[ceph-users] Re: Increasing number of unscrubbed PGs

2022-09-14 Thread Burkhard Linke
orage, and according to the timestamps have been scrubbed in the last days. Is it possible to get a full list of all affected PGs? 'ceph health detail' only displays 50 entries. Best regards, Burkhard Linke

[ceph-users] Re: Increasing number of unscrubbed PGs

2022-09-13 Thread Burkhard Linke
Hi Josh, thx for the link. I'm not sure whether this is the root cause, since we did not use the noscrub and nodeepscrub flags in the past. I've set them for a short period to test whether removing the flag triggers more backfilling. During that time no OSD were restarted etc. But the tick

[ceph-users] Re: Increasing number of unscrubbed PGs

2022-09-12 Thread Burkhard Linke
Hi, On 9/12/22 11:44, Eugen Block wrote: Hi, I'm still not sure why increasing the interval doesn't help (maybe there's some flag set to the PG or something), but you could just increase osd_max_scrubs if your OSDs are not too busy. On one customer cluster with high load during the day we c
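
For reference, bumping the scrub concurrency as suggested is a single runtime change (the value 2 is only an example):

# ceph config set osd osd_max_scrubs 2
# ceph config get osd osd_max_scrubs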

[ceph-users] Increasing number of unscrubbed PGs

2022-09-12 Thread Burkhard Linke
rising from 600 to over 1000 during the weekend and continues to rise... Best regards, Burkhard Linke

[ceph-users] Re: Changing the cluster network range

2022-08-29 Thread Burkhard Linke
cluster network. It is extra effort in setup, maintenance and operation. Unless your network is the bottleneck you might want to use this pending configuration change to switch to a single network setup. Regards, Burkhard Linke

[ceph-users] Re: ceph.conf

2022-08-24 Thread Burkhard Linke
. AFAIK the above settings are used by the fuse client. The kernel client does not use ceph.conf at all and requires all information to be passed as command line parameters to mount or as part of the fstab / autofs entry. Regards, Burkhard Linke
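
A typical kernel-client fstab entry, with placeholder monitor names and client id, illustrating how everything is passed as mount options instead of being read from ceph.conf:

ceph01,ceph02,ceph03:/   /mnt/cephfs   ceph   name=foo,secretfile=/etc/ceph/foo.secret,_netdev,noatime   0 0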

[ceph-users] Re: cephfs mounting multiple filesystems

2022-07-08 Thread Burkhard Linke
Hi, On 08.07.22 11:34, Robert Reihs wrote: Hi, I am very new to the ceph world, and working on setting up a cluster. We have two cephfs filesystems (slow and fast), everything is running and showing up in the dashboard. I can mount one of the filesystems (it mounts it as default). How can I speci

[ceph-users] Tuning for cephfs backup client?

2022-06-23 Thread Burkhard Linke
regards, Burkhard Linke

[ceph-users] Re: Generation of systemd units after nuking /etc/systemd/system

2022-06-10 Thread Burkhard Linke
Hi, On 10.06.22 10:23, Flemming Frandsen wrote: Hmm, does that also create the mon, mgr and mds units? The actual unit files for these services are located in /lib/systemd/system, not /etc/systemd. You need to recreate the _instances_ for the units, e.g. by running systemctl enable ceph-
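
For a package-based (non-cephadm) deployment the instances can be re-enabled per daemon; a sketch with placeholder ids (mon/mgr instances are usually named after the short hostname):

# systemctl enable --now ceph-mon@$(hostname -s)
# systemctl enable --now ceph-mgr@$(hostname -s)
# systemctl enable --now ceph-mds@mds1
# systemctl enable --now ceph-osd@12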

[ceph-users] Re: DM-Cache for spinning OSDs

2022-05-17 Thread Burkhard Linke
Hi, On 5/17/22 08:51, Stolte, Felix wrote: Hey guys, i have three servers with 12x 12 TB Sata HDDs and 1x 3,4 TB NVME. I am thinking of putting DB/WAL on the NVMe as well as an 5GB DM-Cache for each spinning disk. Is anyone running something like this in a production environment? We have s

[ceph-users] ceph-crash user requirements

2022-05-10 Thread Burkhard Linke
Hi, I just stumbled over some log messages regarding ceph-crash: May 10 09:32:19 bigfoot60775 ceph-crash[2756]: WARNING:ceph-crash:post /var/lib/ceph/crash/2022-05-10T07:10:55.837665Z_7f3b726e-0368-4149-8834-6cafd92fb13f as client.admin failed: b'2022-05-10T09:32:19.099+0200 7f911ad92700 -1
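
The usual fix is to give ceph-crash its own key with the crash profile instead of letting it fall back to client.admin; a sketch (keyring location may vary by distribution):

# ceph auth get-or-create client.crash mon 'profile crash' mgr 'profile crash' -o /etc/ceph/ceph.client.crash.keyring

ceph-crash tries client.crash.<hostname>, client.crash and client.admin in that order, so a shared client.crash key should be sufficient.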

[ceph-users] Re: Best way to keep a backup of a bucket

2022-03-31 Thread Burkhard Linke
Hi, On 3/31/22 08:44, Szabo, Istvan (Agoda) wrote: Hi, I have some critical data in couple of buckets I'd like to keep it somehow safe, but I don't see any kind of snapshot solution in ceph for objectgateway. How you guys (if you do) backup RGW buckets or objects what is the best way to keep

[ceph-users] Re: Path to a cephfs subvolume

2022-03-22 Thread Burkhard Linke
Hi, On 22.03.22 16:23, Robert Vasek wrote: Hello, I have a question about cephfs subvolume paths. The path to a subvol seems to be in the format of //, e.g.: /volumes/csi/csi-vol-59c3cb5a-a9ee-11ec-b412-0242ac110004/b2b5a0b3-e02b-4f93-a3f5-fdcef80ebbea I'm wondering about the segment. Where

[ceph-users] Re: 1 bogus remapped PG (stuck pg_temp) -- how to cleanup?

2022-02-02 Thread Burkhard Linke
Hi, I've found a solution for getting rid of the stale pg_temp. I've scaled the pool up to 128 PGs (thus "covering" the pg_temp). Afterwards the remapped PG was gone. I'm currently scaling down back to 32, no extra PG (either regular or temp) so far. The pool is almost empty, so playing ar

[ceph-users] Re: 1 bogus remapped PG (stuck pg_temp) -- how to cleanup?

2022-02-02 Thread Burkhard Linke
Hi, On 2/2/22 14:39, Konstantin Shalygin wrote: Hi, The cluster is Nautilus 14.2.22 For a long time we have bogus 1 remapped PG, without actual 'remapped' PG's # ceph pg dump pgs_brief | awk '{print $2}' | grep active | sort | uniq -c dumped pgs_brief 15402 active+clean 6 active+cle

[ceph-users] Re: CephFS keyrings for K8s

2022-01-20 Thread Burkhard Linke
an restrict the access scope to sub directories etc. See https://docs.ceph.com/en/pacific/cephfs/client-auth/  (or the pages for your current release). We use the CSI cephfs plugin in our main k8s cluster, and it is working fine with those keys. Regards,
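
For completeness, such a restricted key can be generated directly with fs authorize; the filesystem name, client id and path are placeholders:

# ceph fs authorize cephfs client.k8s-apps /volumes/csi rw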

[ceph-users] Mounting cephfs on OSD hosts still a problem

2021-12-24 Thread Burkhard Linke
e case? Or is mounting cephfs on the OSD hosts (kernel implementation) considered safe now? Best regards and happy holidays, Burkhard Linke

[ceph-users] Re: How to trim/discard ceph osds ?

2021-11-26 Thread Burkhard Linke
Hi, On 11/26/21 12:47 PM, Christoph Adomeit wrote: Hi, I am just wondering if it is recommended to regularly fstrim or discard ceph bluestore osds on flash memory (ssds and nvmes) and how it is done and configured ? fstrim and discard are filesystem level operations, since a filesystem ha

[ceph-users] Re: Minimal requirements for ceph csi users?

2021-10-28 Thread Burkhard Linke
Hi, On 28.10.21 18:10, Konstantin Shalygin wrote: Hi, Try to use profile cap, like 'allow profile rbd' That's fine for csi rbd, thx. Works like a charm so far. But cephfs is a little different beast. As far as I understand the source code, it uses the mgr interface to create subvolumes an

[ceph-users] Minimal requirements for ceph csi users?

2021-10-28 Thread Burkhard Linke
Hi, I'm currently setting up ceph CSI for our kubernetes cluster. What are the minimum requirements / capabilities needed for the rbd and cephfs users? The current setup is working well with admin privileges, but I would like to reduce it to the necessary minimum. Regards, Burkhard

[ceph-users] Re: Cluster downtime due to unsynchronized clocks

2021-09-23 Thread Burkhard Linke
Hi, On 9/23/21 9:49 AM, Mark Schouten wrote: Hi, Last night we’ve had downtime on a simple three-node cluster. Here’s what happened: 2021-09-23 00:18:48.331528 mon.node01 (mon.0) 834384 : cluster [WRN] message from mon.2 was stamped 8.401927s in the future, clocks not synchronized 2021-09-23 00

[ceph-users] Re: ceph orch commands stuck

2021-08-30 Thread Burkhard Linke
Hi, On 30.08.21 15:36, Oliver Weinmann wrote: Hi, we had one failed osd in our cluster that we have replaced. Since then the cluster is behaving very strange and some ceph commands like ceph crash or ceph orch are stuck. Just two unrelated thoughts: - never use two mons. If one of them

[ceph-users] Multiple DNS names for RGW?

2021-08-15 Thread Burkhard Linke
Hi, we are running RGW behind haproxy for TLS termination and load balancing. Due to some major changes in our setup, we would like to start a smooth transition to a new hostname of the S3 endpoint. The haproxy part should be straightforward (adding a second frontend). But on the RGW side

[ceph-users] Re: Procedure for changing IP and domain name of all nodes of a cluster

2021-07-22 Thread Burkhard Linke
Hi, On 7/21/21 8:30 PM, Konstantin Shalygin wrote: Hi, On 21 Jul 2021, at 10:53, Burkhard Linke <mailto:burkhard.li...@computational.bio.uni-giessen.de>> wrote: One client with special needs is openstack cinder. The database entries contain the mon list for volumes Another que

[ceph-users] Re: Procedure for changing IP and domain name of all nodes of a cluster

2021-07-21 Thread Burkhard Linke
Hi, On 7/21/21 9:40 AM, mabi wrote: Hello, I need to relocate an Octopus (15.2.13) ceph cluster of 8 nodes to another internal network. This means that the IP address of each nodes as well as the domain name will change. The hostname itself will stay the same. What would be the best steps in

[ceph-users] Re: HDD <-> OSDs

2021-06-22 Thread Burkhard Linke
Hi, just an addition: current CEPH releases also include disk monitoring (e.g. SMART and other health related features). These do not work with raid devices. You will need external monitoring for your OSD disks. Regards, Burkhard

[ceph-users] Re: HDD <-> OSDs

2021-06-22 Thread Burkhard Linke
Hi, On 22.06.21 11:55, Thomas Roth wrote: Hi all, newbie question: The documentation seems to suggest that with ceph-volume, one OSD is created for each HDD (cf. 4-HDD-example in https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/) This seems odd: what if a server has

[ceph-users] Re: Failover with 2 nodes

2021-06-15 Thread Burkhard Linke
Hi, On 15.06.21 16:15, Christoph Brüning wrote: Hi, That's right! We're currently evaluating a similar setup with two identical HW nodes (on two different sites), with OSD, MON and MDS each, and both nodes have CephFS mounted. The goal is to build a minimal self-contained shared filesystem

[ceph-users] Re: Cephfs mount not recovering after icmp-not-reachable

2021-06-14 Thread Burkhard Linke
Hi, CephFS clients are blacklisted if they do not react to heartbeat packets. The MDS will deny the reconnect: [ 1815.029831] ceph: mds0 closed our session [ 1815.029833] ceph: mds0 reconnect start [ 1815.052219] ceph: mds0 reconnect denied [ 1815.052229] ceph: dropping dirty Fw state for

[ceph-users] cephfs objets without 'parent' xattr?

2021-06-07 Thread Burkhard Linke
Hi, during an OS upgrade from Ubuntu 18.04 to 20.04 we seem to have triggered a bcache bug on three OSD hosts. These hosts are used with a 6+2 EC pool used with CephFS, so a number of PGs are affected by the bug. We were able to restart two of the three hosts (and will run some extra scrubs

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Burkhard Linke
Hi, On 05.05.21 11:07, Andres Rojas Guerrero wrote: Sorry, I have not understood the problem well, the problem I see is that once the OSD fails, the cluster recovers but the MDS remains faulty: *snipsnap* pgs: 1.562% pgs not active 16128 active+clean 238

[ceph-users] Re: Increase of osd space usage on cephfs heavy load

2021-04-06 Thread Burkhard Linke
Hi, On 4/6/21 2:20 PM, Olivier AUDRY wrote: hello now the backup has been running for 3 hours and cephfs metadata goes from 20G to 479GB... POOL ID STORED OBJECTS USED %USED MAX AVAIL cephfs-metadata 12 479 GiB 642.26k 1.4 TiB 18.79 2.0 TiB cephfs-data0

[ceph-users] Re: ceph Nautilus lost two disk over night everything hangs

2021-03-30 Thread Burkhard Linke
Hi, On 30.03.21 13:05, Rainer Krienke wrote: Hello, yes your assumptions are correct pxa-rbd ist the metadata pool for pxa-ec which uses a erasure coding 4+2 profile. In the last hours ceph repaired most of the damage. One inactive PG remained and in ceph health detail then told me: -

[ceph-users] Re: Quick quota question

2021-03-17 Thread Burkhard Linke
Hi, On 3/17/21 11:28 AM, Andrew Walker-Brown wrote: Hi Magnus, Thanks for the reply. Just to be certain (I’m having a slow day today), it’s the amount of data stored by the clients. As an example. a pool using 3 replicas and a quota 3TB : clients would be able to create up to 3TB of data a
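
The quota in that example is set on stored (client-visible) bytes, not on raw capacity; a sketch with a hypothetical pool name:

# ceph osd pool set-quota mypool max_bytes 3298534883328
# ceph osd pool get-quota mypool

3298534883328 bytes is 3 TiB; get-quota shows what is currently configured.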

[ceph-users] Re: Networking Idea/Question

2021-03-16 Thread Burkhard Linke
Hi, On 16.03.21 03:40, Dave Hall wrote: Andrew, I agree that the choice of hash function is important for LACP. My thinking has always been to stay down in layers 2 and 3.  With enough hosts it seems likely that traffic would be split close to evenly.  Heads or tails - 50% of the time you're

[ceph-users] Re: Newbie Requesting Help - Please, This Is Driving Me Mad/Crazy!

2021-02-24 Thread Burkhard Linke
Hi, your whole OSD deployment is wrong. CEPH does not use any filesystem anymore for at least two major releases, and the existing filestore backend is deprecated. Dunno where you got those steps from... Just use ceph-volume, and preferably the lvm based deployment. If you really want to u

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-03 Thread Burkhard Linke
Hi, On 2/3/21 9:41 AM, Loïc Dachary wrote: Just my 2 cents: You could use the first byte of the SHA sum to identify the image, e.g. using a fixed number of 256 images. Or some flexible approach similar to the way filestore used to store rados objects. A friend suggested the same to save spac

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-03 Thread Burkhard Linke
Hi, On 2/2/21 9:32 PM, Loïc Dachary wrote: Hi Greg, On 02/02/2021 20:34, Gregory Farnum wrote: *snipsnap* Right. Dan's comment gave me pause: it does not seem to be a good idea to assume a RBD image of an infinite size. A friend who read this thread suggested a sensible approach (which als

[ceph-users] Re: [Suspicious newsletter] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Burkhard Linke
Hi, On 28.01.21 13:21, Szabo, Istvan (Agoda) wrote: I mean the image hasn’t been deleted, but the content from the image. RBD (as the name implies) is a block device layer. Block devices do not have a concept of content, file, directories or even allocated or unallocated space. They are

[ceph-users] Re: cephfs: massive drop in MDS requests per second with increasing number of caps

2021-01-18 Thread Burkhard Linke
Hi, On 1/18/21 5:46 PM, Dietmar Rieder wrote: Hi all, we noticed a massive drop in requests per second a cephfs client is able to perform when we do a recursive chown over a directory with millions of files. As soon as we see about 170k caps on the MDS, the client performance drops from abou

[ceph-users] Re: Compression of data in existing cephfs EC pool

2021-01-04 Thread Burkhard Linke
Hi, On 1/4/21 5:27 PM, Paul Mezzanini wrote: Hey everyone, I've got an EC pool as part of our cephfs for colder data. When we started using it, compression was still marked experimental. Since then it has become stable so I turned compression on to "aggressive". Using 'ceph df detail' I ca

[ceph-users] Re: DB sizing for lots of large files

2020-11-26 Thread Burkhard Linke
Hi, On 11/26/20 12:45 PM, Richard Thornton wrote: Hi, Sorry to bother you all. It’s a home server setup. Three nodes (ODROID-H2+ with 32GB RAM and dual 2.5Gbit NICs), two 14TB 7200rpm SATA drives and an Optane 118GB NVMe in each node (OS boots from eMMC). *snipsnap* Is there a rough Ceph

[ceph-users] Re: desaster recovery Ceph Storage , urgent help needed

2020-10-23 Thread Burkhard Linke
Hi, On 10/23/20 2:22 PM, Gerhard W. Recher wrote: This is a proxmox cluster ... sorry for formating problems of my post :( short plot, we messed with ip addr. change of public network, so monitors went down. *snipsnap* so howto recover from this disaster ? # ceph -s   cluster:     id:  

[ceph-users] Re: desaster recovery Ceph Storage , urgent help needed

2020-10-23 Thread Burkhard Linke
Hi, your mail is formatted in a way that makes it impossible to get all information, so a number of questions first: - are the mons just up, or are they up and in a quorum? you cannot change mon IP addresses without also adjusting them in the mon map. use the daemon socket on the systems to

[ceph-users] Re: 14.2.12 breaks mon_host pointing to Round Robin DNS entry

2020-10-23 Thread Burkhard Linke
Hi, non round robin entries with multiple mon host FQDNs are also broken. Regards, Burkhard

[ceph-users] Re: 6 PG's stuck not-active, remapped

2020-10-22 Thread Burkhard Linke
Hi, On 10/21/20 10:01 PM, Mac Wynkoop wrote: *snipsnap* *up: 0: 113, 1: 138, 2: 30, 3: 132, 4: 105, 5: 57, 6: 106, 7: 140, 8: 161; acting: 0: 72, 1: 150, 2: 2147483647, 3: 2147483647, 4: 24, 5: 48, 6: 32, 7: 157, 8: 103* 2147483647 is -1 as unsigned integer. This value means that the CRUSH algorithm did not produce enough

[ceph-users] Re: Need help integrating radosgw with keystone for openstack swift

2020-10-22 Thread Burkhard Linke
Hi, in our setup (ceph 15.2.4, openstack train) the swift endpoint URLs are different, e.g. # openstack endpoint list --service swift +--+---+--+--+-+---+--+ | I

[ceph-users] Re: Cluster under stress - flapping OSDs?

2020-10-12 Thread Burkhard Linke
Hi, On 10/12/20 12:05 PM, Kristof Coucke wrote: Diving into the different logging and searching for answers, I came across the following: PG_DEGRADED Degraded data redundancy: 2101057/10339536570 objects degraded (0.020%), 3 pgs degraded, 3 pgs undersized pg 1.4b is stuck undersized for 63

[ceph-users] Re: Ubuntu 20 with octopus

2020-10-12 Thread Burkhard Linke
Hi, On 10/12/20 2:31 AM, Seena Fallah wrote: Hi all, Does anyone has any production cluster with ubuntu 20 (focal) or any suggestion or any bugs that prevents to deploy Ceph octopus on Ubuntu 20? We are running our new ceph cluster on Ubuntu 20.04 and ceph octopus release. Packages are take

[ceph-users] Re: Single Server Ceph OSD Recovery

2020-07-04 Thread Burkhard Linke
Hi, in addition you need a way to recover the mon maps (I assume the mon was on the same host). If the mon data is lost, you can try to retrieve some of the maps from the existing OSDs. See the disaster recovery section in the ceph documentation. If you cannot restore the mons,

[ceph-users] Re: Showing OSD Disk config?

2020-07-01 Thread Burkhard Linke
Hi, On 7/2/20 7:25 AM, Lindsay Mathieson wrote: Is there a way to display an OSD's setup - data, data.db and WAL disks/partitions? If you use ceph-volume, the corresponding 'list' command on the host will print details of OSDs: # ceph-volume lvm list == osd.1 ===   [block] /dev

[ceph-users] Re: Advice on SSD choices for WAL/DB?

2020-07-01 Thread Burkhard Linke
Hi, On 7/1/20 1:57 PM, Andrei Mikhailovsky wrote: Hello, We are planning to perform a small upgrade to our cluster and slowly start adding 12TB SATA HDD drives. We need to accommodate for additional SSD WAL/DB requirements as well. Currently we are considering the following: HDD Drives - Sea

[ceph-users] Re: CEPH failure domain - power considerations

2020-05-28 Thread Burkhard Linke
use them for "small" machines, so I dunno whether you can get them for higher loads. Regards, Burkhard -- Dr. rer. nat. Burkhard Linke Bioinformatics and Systems Biology Justus-Liebig-University Giessen 35392 Giessen, Germany Phone: (+49) (0)6

[ceph-users] Re: moving small production cluster to different datacenter

2020-01-31 Thread Burkhard Linke
. nat. Burkhard Linke Bioinformatics and Systems Biology Justus-Liebig-University Giessen 35392 Giessen, Germany Phone: (+49) (0)641 9935810

[ceph-users] Re: Recovering from a Failed Disk (replication 1)

2019-10-17 Thread Burkhard Linke
Hi, On 10/17/19 5:56 AM, Ashley Merrick wrote: I think your better off doing the DD method, you can export and import a PG at a time (ceph-objectstore-tool) But if the disk is failing a DD is probably your best method. In case of hardware problems or broken sectors, I would recommend 'dd_

[ceph-users] Re: Nautilus: BlueFS spillover

2019-09-27 Thread Burkhard Linke
Hi, On 9/27/19 10:54 AM, Eugen Block wrote: Update: I expanded all rocksDB devices, but the warnings still appear: BLUEFS_SPILLOVER BlueFS spillover detected on 10 OSD(s) osd.0 spilled over 2.5 GiB metadata from 'db' device (2.4 GiB used of 30 GiB) to slow device osd.19 spilled over

[ceph-users] Re: Health error: 1 MDSs report slow metadata IOs, 1 MDSs report slow requests

2019-09-24 Thread Burkhard Linke
Hi, you need to fix the non active PGs first. They are also probably the reason for the blocked requests. Regards, Burkhard On 9/24/19 1:30 PM, Thomas wrote: Hi, ceph health reports 1 MDSs report slow metadata IOs 1 MDSs report slow requests This is the complete output of ceph -s: root@

[ceph-users] Re: regurlary 'no space left on device' when deleting on cephfs

2019-09-10 Thread Burkhard Linke
but this might lead to other problems (too large rados omap objects). Regards, Burkhard -- Dr. rer. nat. Burkhard Linke Bioinformatics and Systems Biology Justus-Liebig-University Giessen 35392 Giessen, Germany Phone: (+49) (0)641 9935810

[ceph-users] Re: Unable to replace OSDs deployed with ceph-volume lvm batch

2019-09-09 Thread Burkhard Linke
Hi, On 9/9/19 5:55 PM, Robert Sander wrote: Hi, On 09.09.19 17:39, Burkhard Linke wrote: # ceph-volume lvm create --bluestore --data /dev/sda --block.db /dev/ceph-block-dbs-ea684aa8-544e-4c4a-8664-6cb50b3116b8/osd-block-db-a8f1489a-d97b-479e-b9a7-30fc9fa99cb5 When using an LV on a VG omit
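
The quoted advice is cut off above, but as far as I know the ceph-volume convention it refers to is to name an existing LV as vg/lv instead of a /dev path; a sketch reusing the names from the quoted command:

# ceph-volume lvm create --bluestore --data /dev/sda --block.db ceph-block-dbs-ea684aa8-544e-4c4a-8664-6cb50b3116b8/osd-block-db-a8f1489a-d97b-479e-b9a7-30fc9fa99cb5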

[ceph-users] Unable to replace OSDs deployed with ceph-volume lvm batch

2019-09-09 Thread Burkhard Linke
Hi, we had a failing hard disk, and I replace it and want to create a new OSD on it now. But ceph-volume fails under these circumstances. In the original setup, the OSDs were created with ceph-volume lvm batch using a bunch of drives and a NVMe device for bluestore db. The batch mode uses

[ceph-users] Re: How to map 2 different Openstack users belonging to the same project to 2 distinct radosgw users ?

2019-08-29 Thread Burkhard Linke
Hi, which protocol do you intend to use? Swift and S3 behave completely differently with respect to users and keystone based authentication. Regards, Burkhard

[ceph-users] Re: help

2019-08-29 Thread Burkhard Linke
Hi, ceph uses a pseudo random distribution within crush to select the target hosts. As a result, the algorithm might not be able to select three different hosts out of three hosts in the configured number of tries. The affected PGs will be shown as undersized and only list two OSDs instead o
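
One knob that sometimes helps when CRUSH gives up too early on small clusters is choose_total_tries, which can be raised by round-tripping the CRUSH map (rule id and replica count in the test run are examples; always test before injecting):

# ceph osd getcrushmap -o crush.bin
# crushtool -d crush.bin -o crush.txt
  (edit "tunable choose_total_tries 50" to a higher value, e.g. 100)
# crushtool -c crush.txt -o crush.new
# crushtool -i crush.new --test --show-bad-mappings --rule 0 --num-rep 3
# ceph osd setcrushmap -i crush.new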