Re: [ceph-users] OSD + Flashcache + udev + Partition uuid

2015-03-20 Thread Burkhard Linke
Hi, On 03/19/2015 10:41 PM, Nick Fisk wrote: I'm looking at trialling OSDs with a small flashcache device over them to hopefully reduce the impact of metadata updates when doing small block io. Inspiration from here:- http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/12083 One thin

Re: [ceph-users] 'pgs stuck unclean ' problem

2015-03-20 Thread Burkhard Linke
Hi, On 03/20/2015 01:58 AM, houguanghua wrote: Dear all, Ceph 0.72.2 is deployed in three hosts. But the cluster's status is HEALTH_WARN. The status is as follows: # ceph -s cluster e25909ed-25d9-42fd-8c97-0ed31eec6194 health HEALTH_WARN 768 pgs degraded; 768 pgs stuck u

[ceph-users] Disabling btrfs snapshots for existing OSDs

2015-04-23 Thread Burkhard Linke
Hi, I have a small number of OSDs running Ubuntu Trusty 14.04 and Ceph Firefly 0.80.9. Due to stability issues I would like to disable the btrfs snapshot feature (filestore btrfs snap = false). Is it possible to apply this change to an existing OSD (stop OSD, change config, restart OSD), or
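A minimal sketch of the change being asked about, assuming the stock filestore option name and an Upstart-managed OSD on Trusty (the OSD id is illustrative); whether an existing OSD accepts this without being re-created is exactly the open question of the thread:

# ceph.conf, [osd] section -- turn off btrfs snapshot-based sync for filestore
[osd]
filestore btrfs snap = false

# restart the OSD so the filestore re-reads the setting (Upstart on Ubuntu Trusty)
stop ceph-osd id=58
start ceph-osd id=58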

[ceph-users] "Compacting" btrfs file storage

2015-04-23 Thread Burkhard Linke
Hi, I've noticed that the btrfs file storage is performing some cleanup/compacting operations during OSD startup. Before OSD start: /dev/sdc1 2.8T 2.4T 390G 87% /var/lib/ceph/osd/ceph-58 After OSD start: /dev/sdc1 2.8T 2.2T 629G 78% /var/lib/ceph/osd/ceph-58 O

Re: [ceph-users] ceph-fuse unable to run through "screen" ?

2015-04-23 Thread Burkhard Linke
Dr. rer. nat. Burkhard Linke Bioinformatics and Systems Biology Justus-Liebig-University Giessen 35392 Giessen, Germany Phone: (+49) (0)641 9935810

Re: [ceph-users] OSD move after reboot

2015-04-23 Thread Burkhard Linke
Hi, On 04/23/2015 11:18 AM, Jake Grimmett wrote: Dear All, I have multiple disk types (15k & 7k) on each ceph node, which I assign to different pools, but have a problem: whenever I reboot a node, the OSDs move in the CRUSH map. i.e. on host ceph4, before a reboot I have this osd tree -

[ceph-users] cephfs: recovering from transport endpoint not connected?

2015-04-27 Thread Burkhard Linke
Hi, I've deployed ceph on a number of nodes in our compute cluster (Ubuntu 14.04, Ceph Firefly 0.80.9). /ceph is mounted via ceph-fuse. From time to time some nodes lose their access to cephfs with the following error message: # ls /ceph ls: cannot access /ceph: Transport endpoint is not co

Re: [ceph-users] cephfs: recovering from transport endpoint not connected?

2015-04-28 Thread Burkhard Linke
Hi, On 04/27/2015 02:31 PM, Yan, Zheng wrote: On Mon, Apr 27, 2015 at 3:42 PM, Burkhard Linke wrote: Hi, I've deployed ceph on a number of nodes in our compute cluster (Ubuntu 14.04, Ceph Firefly 0.80.9). /ceph is mounted via ceph-fuse. From time to time some nodes lose their acce

Re: [ceph-users] Btrfs defragmentation

2015-05-07 Thread Burkhard Linke
Hi, On 05/07/2015 12:04 PM, Lionel Bouton wrote: On 05/06/15 19:51, Lionel Bouton wrote: *snipsnap* We've seen progress on this front. Unfortunately for us we had 2 power outages and they seem to have damaged the disk controller of the system we are testing Btrfs on: we just had a system crash

[ceph-users] Fwd: Re: Unexpected disk write activity with btrfs OSDs

2015-06-19 Thread Burkhard Linke
Forgot the reply to the list... Forwarded Message Subject: Re: [ceph-users] Unexpected disk write activity with btrfs OSDs Date: Fri, 19 Jun 2015 09:06:33 +0200 From: Burkhard Linke To: Lionel Bouton Hi, On 06/18/2015 11:28 PM, Lionel Bouton wrote: Hi

[ceph-users] Removing empty placement groups / empty objects

2015-06-29 Thread Burkhard Linke
Hi, I've noticed that a number of placement groups in our setup contain objects, but no actual data (ceph pg dump | grep remapped during a hard disk replace operation): 7.616 26360 0 52720 4194304 3003 3003 active+remapped+wait_backfill 2015-06-29 13:43:28.716

Re: [ceph-users] Removing empty placement groups / empty objects

2015-07-01 Thread Burkhard Linke
Hi, On 07/01/2015 06:09 PM, Gregory Farnum wrote: On Mon, Jun 29, 2015 at 1:44 PM, Burkhard Linke wrote: Hi, I've noticed that a number of placement groups in our setup contain objects, but no actual data (ceph pg dump | grep remapped during a hard disk replace operation): 7.616 2636

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Burkhard Linke
ad to discard it after a fatal kernel crash related to the module. In my experience it works reliably in write-through mode, but write-back is buggy. Since the latter is the interesting one in almost any use case, I would not recommend using it. Best regards, Burkhard

[ceph-users] State of nfs-ganesha CEPH fsal

2015-07-27 Thread Burkhard Linke
Hi, the nfs-ganesha documentation states: "... This FSAL links to a modified version of the CEPH library that has been extended to expose its distributed cluster and replication facilities to the pNFS operations in the FSAL. ... The CEPH library modifications have not been merged into the ups

Re: [ceph-users] State of nfs-ganesha CEPH fsal

2015-07-28 Thread Burkhard Linke
Hi, On 07/27/2015 05:42 PM, Gregory Farnum wrote: On Mon, Jul 27, 2015 at 4:33 PM, Burkhard Linke wrote: Hi, the nfs-ganesha documentation states: "... This FSAL links to a modified version of the CEPH library that has been extended to expose its distributed cluster and replic

Re: [ceph-users] State of nfs-ganesha CEPH fsal

2015-07-28 Thread Burkhard Linke
Hi, On 07/28/2015 11:08 AM, Haomai Wang wrote: On Tue, Jul 28, 2015 at 4:47 PM, Gregory Farnum wrote: On Tue, Jul 28, 2015 at 8:01 AM, Burkhard Linke wrote: *snipsnap* Can you give some details on that issues? I'm currently looking for a way to provide NFS based access to CephFS t

[ceph-users] Group permission problems with CephFS

2015-08-04 Thread Burkhard Linke
Hi, I've encountered some problems accessing files on CephFS: $ ls -al syntenyPlot.png -rw-r----- 1 edgar edgar 9329 Jun 11 2014 syntenyPlot.png $ groups ... edgar ... $ cat syntenyPlot.png cat: syntenyPlot.png: Permission denied CephFS is mounted via ceph-fuse: ceph-fuse on /ceph type fuse.c
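The preview cuts off before any resolution; two client-side options that are often relevant to permission checks with ceph-fuse are shown below purely as things to inspect (the socket path is illustrative, and nothing in the truncated thread confirms they are the culprit here):

# query the running ceph-fuse client via its admin socket
ceph daemon /var/run/ceph/ceph-client.admin.asok config get fuse_default_permissions
ceph daemon /var/run/ceph/ceph-client.admin.asok config get client_acl_type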

Re: [ceph-users] Unable to start libvirt VM when using cache tiering.

2015-08-05 Thread Burkhard Linke
Hi, On 08/05/2015 02:13 PM, Pieter Koorts wrote: Hi All, This seems to be a weird issue. Firstly all deployment is done with "ceph-deploy" and 3 host machines acting as MON and OSD using the Hammer release on Ubuntu 14.04.3 and running KVM (libvirt). When using vanilla CEPH, single rbd pool

Re: [ceph-users] Unable to start libvirt VM when using cache tiering.

2015-08-05 Thread Burkhard Linke
Hi, On 08/05/2015 02:54 PM, Pieter Koorts wrote: Hi Burkhard, I seemed to have missed that part but even though allowing access (rwx) to the cache pool it still has a similar (not same) problem. The VM process starts but it looks more like a dead or stuck process trying forever to start and

Re: [ceph-users] Unable to start libvirt VM when using cache tiering.

2015-08-05 Thread Burkhard Linke
Hi, On 08/05/2015 03:09 PM, Pieter Koorts wrote: Hi, This is my OSD dump below ### osc-mgmt-1:~$ sudo ceph osd dump | grep pool pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 43 lfor 43 flags hashpspool t

Re: [ceph-users] Unable to start libvirt VM when using cache tiering.

2015-08-05 Thread Burkhard Linke
Hi, On 08/05/2015 05:54 PM, Pieter Koorts wrote: Hi I suspect something more sinister may be going on. I have set the values (though smaller) on my cluster but the same issue happens. I also find when the VM is trying to start there might be an IRQ flood as processes like ksoftirqd seem to u

Re: [ceph-users] Different filesystems on OSD hosts at the samecluster

2015-08-07 Thread Burkhard Linke
Hi, On 08/07/2015 04:04 PM, Udo Lembke wrote: Hi, some time ago I switched all OSDs from XFS to ext4 (step by step). I had no issues during mixed osd-format (the process takes some weeks). And yes, for me ext4 performs also better (esp. the latencies). Just out of curiosity: Do you use a ext

Re: [ceph-users] Different filesystems on OSD hosts at the samecluster

2015-08-07 Thread Burkhard Linke
Hi, On 08/07/2015 04:30 PM, Udo Lembke wrote: Hi, I use the ext4-parameters like Christian Balzer wrote in one posting: osd mount options ext4 = "user_xattr,rw,noatime,nodiratime" osd_mkfs_options_ext4 = -J size=1024 -E lazy_itable_init=0,lazy_journal_init=0 Thx for the details. The osd-journ

[ceph-users] CephFS: removing default data pool

2015-09-28 Thread Burkhard Linke
Hi, I've created CephFS with a certain data pool some time ago (using the firefly release). I've added additional pools in the meantime and moved all data to them. But a large number of empty (or very small) objects are left in the pool according to 'ceph df': cephfs_test_data 7

[ceph-users] "stray" objects in empty cephfs data pool

2015-10-08 Thread Burkhard Linke
Hi, I've moved all files from a CephFS data pool (EC pool with frontend cache tier) in order to remove the pool completely. Some objects are left in the pools ('ceph df' output of the affected pools): cephfs_ec_data 19 7565k 0 66288G 13 Listing the object
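The listing mentioned at the cut-off point can be reproduced with the standard rados/ceph tools (pool name taken from the quoted 'ceph df' output):

# list the leftover objects in the supposedly empty data pool
rados -p cephfs_ec_data ls

# per-pool object counts and usage
rados df
ceph df detail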

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-08 Thread Burkhard Linke
Hi John, On 10/08/2015 12:05 PM, John Spray wrote: On Thu, Oct 8, 2015 at 10:21 AM, Burkhard Linke wrote: Hi, *snipsnap* I've moved all files from a CephFS data pool (EC pool with frontend cache tier) in order to remove the pool completely. Some objects are left in the pools ('

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-08 Thread Burkhard Linke
Hi John, On 10/08/2015 01:03 PM, John Spray wrote: On Thu, Oct 8, 2015 at 11:41 AM, Burkhard Linke wrote: *snipsnap* Thanks for the fast reply. During the transfer of all files from the EC pool to a standard replicated pool I've copied the file to a new file name, removed the origna

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-12 Thread Burkhard Linke
Hi, On 10/08/2015 09:14 PM, John Spray wrote: On Thu, Oct 8, 2015 at 7:23 PM, Gregory Farnum wrote: On Thu, Oct 8, 2015 at 6:29 AM, Burkhard Linke wrote: Hammer 0.94.3 does not support a 'dump cache' mds command. 'dump_ops_in_flight' does not list any pending operations

[ceph-users] CephFS and page cache

2015-10-16 Thread Burkhard Linke
Hi, I've noticed that CephFS (both ceph-fuse and kernel client in version 4.2.3) remove files from page cache as soon as they are not in use by a process anymore. Is this intended behaviour? We use CephFS as a replacement for NFS in our HPC cluster. It should serve large files which are read

Re: [ceph-users] CephFS and page cache

2015-10-19 Thread Burkhard Linke
Hi, On 10/19/2015 05:27 AM, Yan, Zheng wrote: On Sat, Oct 17, 2015 at 1:42 AM, Burkhard Linke wrote: Hi, I've noticed that CephFS (both ceph-fuse and kernel client in version 4.2.3) remove files from page cache as soon as they are not in use by a process anymore. Is this intended beha

Re: [ceph-users] CephFS and page cache

2015-10-19 Thread Burkhard Linke
Hi, On 10/19/2015 10:34 AM, Shinobu Kinjo wrote: What kind of applications are you talking about regarding to applications for HPC. Are you talking about like netcdf? Caching is quite necessary for some applications for computation. But it's not always the case. It's not quite related to this

Re: [ceph-users] CephFS and page cache

2015-10-19 Thread Burkhard Linke
Hi, On 10/19/2015 12:34 PM, John Spray wrote: On Mon, Oct 19, 2015 at 8:59 AM, Burkhard Linke wrote: Hi, On 10/19/2015 05:27 AM, Yan, Zheng wrote: On Sat, Oct 17, 2015 at 1:42 AM, Burkhard Linke wrote: Hi, I've noticed that CephFS (both ceph-fuse and kernel client in version

Re: [ceph-users] CephFS and page cache

2015-10-21 Thread Burkhard Linke
Hi, On 10/22/2015 02:54 AM, Gregory Farnum wrote: On Sun, Oct 18, 2015 at 8:27 PM, Yan, Zheng wrote: On Sat, Oct 17, 2015 at 1:42 AM, Burkhard Linke wrote: Hi, I've noticed that CephFS (both ceph-fuse and kernel client in version 4.2.3) remove files from page cache as soon as they ar

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-23 Thread Burkhard Linke
Hi, On 10/14/2015 06:32 AM, Gregory Farnum wrote: On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke wrote: *snipsnap* Thanks, that did the trick. I was able to locate the host blocking the file handles and remove the objects from the EC pool. Well, all except one: # ceph df

Re: [ceph-users] CephFS and page cache

2015-10-28 Thread Burkhard Linke
Hi, On 10/26/2015 01:43 PM, Yan, Zheng wrote: On Thu, Oct 22, 2015 at 2:55 PM, Burkhard Linke wrote: Hi, On 10/22/2015 02:54 AM, Gregory Farnum wrote: On Sun, Oct 18, 2015 at 8:27 PM, Yan, Zheng wrote: On Sat, Oct 17, 2015 at 1:42 AM, Burkhard Linke wrote: Hi, I've noticed that C

Re: [ceph-users] State of nfs-ganesha CEPH fsal

2015-10-28 Thread Burkhard Linke
Hi, On 10/28/2015 03:08 PM, Dennis Kramer (DT) wrote: Sorry for raising this topic from the dead, but I'm having the same issues with NFS-GANESHA and the wrong user/group information. Do you maybe have a working ganesha.conf? I'm assuming I might mi

Re: [ceph-users] Benchmark individual OSD's

2015-10-29 Thread Burkhard Linke
Hi, On 10/29/2015 09:54 AM, Luis Periquito wrote: Only way I can think of that is creating a new crush rule that selects that specific OSD with min_size = max_size = 1, then creating a pool with size = 1 and using that crush rule. Then you can use that pool as you'd use any other pool. I haven

Re: [ceph-users] CephFS and page cache

2015-10-29 Thread Burkhard Linke
Hi, On 10/29/2015 09:30 AM, Sage Weil wrote: On Thu, 29 Oct 2015, Yan, Zheng wrote: On Thu, Oct 29, 2015 at 2:21 PM, Gregory Farnum wrote: On Wed, Oct 28, 2015 at 8:38 PM, Yan, Zheng wrote: On Thu, Oct 29, 2015 at 1:10 AM, Burkhard Linke I tried to dig into the ceph-fuse code, but I was

Re: [ceph-users] Group permission problems with CephFS

2015-11-06 Thread Burkhard Linke
Hi, On 11/06/2015 04:52 PM, Aaron Ten Clay wrote: I'm seeing similar behavior as well. -rw-rw-r-- 1 testuser testgroup 6 Nov 6 07:41 testfile aaron@testhost$ groups ... testgroup ... aaron@testhost$ cat > testfile -bash: testfile: Permission denied Running version 9.0.2. Were you able to make

Re: [ceph-users] Erasure coded pools and 'feature set mismatch'issue

2015-11-09 Thread Burkhard Linke
Hi, On 11/09/2015 11:49 AM, Ilya Dryomov wrote: *snipsnap* You can install an ubuntu kernel from a newer ubuntu release, or pretty much any mainline kernel from kernel-ppa. Ubuntu Trusty has backported kernels from newer releases, e.g. linux-generic-lts-vivid. By using these packages you will

[ceph-users] cephfs: Client hp-s3-r4-compute failing to respond to capabilityrelease

2015-11-09 Thread Burkhard Linke
Hi, I'm currently investigating a lockup problem involving CephFS and SQLite databases. Applications lock up if the same database is accessed from multiple hosts. I was able to narrow the problem down to two hosts: host A: sqlite3 .schema host B: sqlite3 .schema If both .schema commands h

Re: [ceph-users] cephfs: Client hp-s3-r4-compute failing to respondtocapabilityrelease

2015-11-09 Thread Burkhard Linke
Hi, On 11/09/2015 02:07 PM, Burkhard Linke wrote: Hi, *snipsnap* Cluster is running Hammer 0.94.5 on top of Ubuntu 14.04. Clients use ceph-fuse with patches for improved page cache handling, but the problem also occurs with the official hammer packages from download.ceph.com I've t

Re: [ceph-users] cephfs: Client hp-s3-r4-compute failing torespondtocapabilityrelease

2015-11-09 Thread Burkhard Linke
Hi, On 11/09/2015 04:03 PM, Gregory Farnum wrote: On Mon, Nov 9, 2015 at 6:57 AM, Burkhard Linke wrote: Hi, On 11/09/2015 02:07 PM, Burkhard Linke wrote: Hi, *snipsnap* Cluster is running Hammer 0.94.5 on top of Ubuntu 14.04. Clients use ceph-fuse with patches for improved page cache

Re: [ceph-users] cephfs: Client hp-s3-r4-compute failingtorespondtocapabilityrelease

2015-11-11 Thread Burkhard Linke
Hi, On 11/10/2015 09:20 PM, Gregory Farnum wrote: Can you dump the metadata ops in flight on each ceph-fuse when it hangs? ceph daemon mds_requests Current state: host A and host B blocked, both running ceph-fuse 0.94.5 (trusty package) hostA mds_requests (client id 1265369): { "reque
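For reference, the dump quoted above comes from the client's admin socket; a hedged sketch of the invocations (socket path and MDS name are illustrative):

# on the hanging ceph-fuse client
ceph daemon /var/run/ceph/ceph-client.admin.asok mds_requests
ceph daemon /var/run/ceph/ceph-client.admin.asok mds_sessions

# on the active MDS, look for stuck operations
ceph daemon mds.<name> dump_ops_in_flight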

Re: [ceph-users] cephfs: Client hp-s3-r4-compute failingtorespondtocapabilityrelease

2015-11-16 Thread Burkhard Linke
Hi, On 11/13/2015 03:42 PM, Yan, Zheng wrote: On Tue, Nov 10, 2015 at 12:06 AM, Burkhard Linke wrote: > Hi, *snipsnap* it seems the hang is related to async invalidate. please try the following patch --- diff --git

Re: [ceph-users] Removing OSD - double rebalance?

2015-11-30 Thread Burkhard Linke
Hi Carsten, On 11/30/2015 10:08 AM, Carsten Schmitt wrote: Hi all, I'm running ceph version 0.94.5 and I need to downsize my servers because of insufficient RAM. So I want to remove OSDs from the cluster and according to the manual it's a pretty straightforward process: I'm beginning with "
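The usual advice for avoiding the double rebalance (hedged here, since the reply itself is truncated) is to drain the OSD via its CRUSH weight first, so the data only moves once:

# empty the OSD in a single rebalance, then remove it without further data movement
ceph osd crush reweight osd.<id> 0
# wait for backfill to finish before marking it out and removing it
ceph osd out <id>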

[ceph-users] Enforce MDS map update in CephFS kernel driver

2016-04-27 Thread Burkhard Linke
Hi, we recently stumbled over a problem with the kernel based CephFS driver (Ubuntu Trusty with 4.4.0-18 kernel from xenial lts backport package). Our MDS failed for some unknown reason, and the standby MDS became active. After rejoining the MDS cluster, the former standby MDS stuck at the c

[ceph-users] Disabling POSIX locking semantics for CephFS

2016-05-03 Thread Burkhard Linke
Hi, we have a number of legacy applications that do not cope well with the POSIX locking semantics in CephFS due to missing locking support (e.g. flock syscalls). We are able to fix some of these applications, but others are binary only. Is it possible to disable POSIX locking completely in

Re: [ceph-users] Disabling POSIX locking semantics for CephFS

2016-05-03 Thread Burkhard Linke
Hi, On 03.05.2016 18:39, Gregory Farnum wrote: On Tue, May 3, 2016 at 9:30 AM, Burkhard Linke wrote: Hi, we have a number of legacy applications that do not cope well with the POSIX locking semantics in CephFS due to missing locking support (e.g. flock syscalls). We are able to fix some of

Re: [ceph-users] Disabling POSIX locking semantics for CephFS

2016-05-04 Thread Burkhard Linke
Hi, On 05/04/2016 09:15 AM, Yan, Zheng wrote: On Wed, May 4, 2016 at 3:39 AM, Burkhard Linke wrote: Hi, On 03.05.2016 18:39, Gregory Farnum wrote: On Tue, May 3, 2016 at 9:30 AM, Burkhard Linke wrote: Hi, we have a number of legacy applications that do not cope well with the POSIX

Re: [ceph-users] Unable to mount the CephFS file system fromclientnode with "mount error 5 = Input/output error"

2016-06-14 Thread Burkhard Linke
Hi, On 06/14/2016 01:21 PM, Rakesh Parkiti wrote: Hello, Unable to mount the CephFS file system from client node with *"mount error 5 = Input/output error"* MDS was installed on a separate node. Ceph Cluster health is OK and mds services are running. firewall was disabled across all the nodes

Re: [ceph-users] stuck unclean since forever

2016-06-22 Thread Burkhard Linke
Hi, On 06/22/2016 12:10 PM, min fang wrote: Hi, I created a new ceph cluster, and create a pool, but see "stuck unclean since forever" errors happen(as the following), can help point out the possible reasons for this? thanks. ceph -s cluster 602176c1-4937-45fc-a246-cc16f1066f65 healt
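A frequent cause of "stuck unclean since forever" on a freshly created cluster (an assumption here, the reply is truncated) is that the CRUSH rule cannot place enough replicas across the available hosts; a few commands to verify:

# compare the pool's replica count with what the rule and topology can deliver
ceph osd pool get <pool> size
ceph osd crush rule dump
ceph osd tree

# for a small test cluster, lowering the replica count can let the PGs become clean
ceph osd pool set <pool> size 2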

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Burkhard Linke
Hi, On 08.08.2016 09:58, Georgios Dimitrakakis wrote: Dear all, I would like your help with an emergency issue but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes (2 of them are the OSD nodes as well) all with ceph version

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Burkhard Linke
Hi, On 08.08.2016 10:50, Georgios Dimitrakakis wrote: Hi, On 08.08.2016 09:58, Georgios Dimitrakakis wrote: Dear all, I would like your help with an emergency issue but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes (2

[ceph-users] Merging CephFS data pools

2016-08-18 Thread Burkhard Linke
Hi, the current setup for CephFS at our site uses two data pools due to different requirements in the past. I want to merge these two pools now, eliminating the second pool completely. I've written a small script to locate all files on the second pool using their file layout attributes and r
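A rough sketch of the layout-based search described above, using the standard CephFS virtual xattrs (pool names and mount point are illustrative; changing a layout never moves existing data, so the files still have to be rewritten with cp/rsync):

# find files whose layout still points at the old data pool
find /ceph -type f -exec sh -c \
  'getfattr -n ceph.file.layout.pool --only-values "$1" 2>/dev/null | grep -qx oldpool && echo "$1"' _ {} \;

# make new files in a directory go to the new pool
setfattr -n ceph.dir.layout.pool -v newpool /ceph/some/dir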

[ceph-users] Recommended hardware for MDS server

2016-08-22 Thread Burkhard Linke
Hi, we are running CephFS with about 70TB data, > 5 million files and about 100 clients. The MDS is currently colocated on a storage box with 14 OSDs (12 HDD, 2 SSD). The box has two E5-2680 v3 CPUs and 128 GB RAM. CephFS runs fine, but it feels like the metadata operations may need more speed.

Re: [ceph-users] CephFS vs RBD

2017-06-23 Thread Burkhard Linke
Hi, On 06/23/2017 02:44 PM, Bogdan SOLGA wrote: Hello, everyone! We are working on a project which uses RBD images (formatted with XFS) as home folders for the project's users. The access speed and the overall reliability have been pretty good, so far. From the architectural perspective, o

Re: [ceph-users] cephfs df with EC pool

2017-06-28 Thread Burkhard Linke
o operates on filesystem level, and data pools are assigned on directory level (with the default pool being assigned to the root directory). A single sane value for free available space is thus not meaningful, so the cephfs implementation just reports the o

Re: [ceph-users] Cannot mount Ceph FS

2017-06-29 Thread Burkhard Linke
290 GB avail 192 active+clean Any hints? What I am doing wrong? You need a running MDS daemon for CephFS. Regards, Burkhard Linke

Re: [ceph-users] "Zombie" ceph-osd@xx.service remain fromoldinstallation

2017-08-03 Thread Burkhard Linke
Hi, On 03.08.2017 16:31, c.mo...@web.de wrote: Hello! I have purged my ceph and reinstalled it. ceph-deploy purge node1 node2 node3 ceph-deploy purgedata node1 node2 node3 ceph-deploy forgetkeys All disks configured as OSDs are physically in two servers. Due to some restrictions I needed to m
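Stale per-OSD systemd units left over from a purged installation can usually be cleaned up as sketched below (unit number illustrative; this is general systemd housekeeping, not a quote from the thread):

systemctl stop ceph-osd@12.service
systemctl disable ceph-osd@12.service
# clear any 'failed' state and re-read unit files
systemctl reset-failed ceph-osd@12.service
systemctl daemon-reload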

Re: [ceph-users] output discards (queue drops) on switchport

2017-09-08 Thread Burkhard Linke
Hi, On 09/08/2017 02:12 PM, Marc Roos wrote: Afaik ceph is not supporting/working with bonding. https://www.mail-archive.com/ceph-users@lists.ceph.com/msg35474.html (thread: Maybe some tuning for bonded network adapters) CEPH works well with LACP bonds. The problem described in that thr

Re: [ceph-users] output discards (queue drops) on switchport

2017-09-08 Thread Burkhard Linke
Hi, On 09/08/2017 04:13 PM, Andreas Herrmann wrote: Hi, On 08.09.2017 15:59, Burkhard Linke wrote: On 09/08/2017 02:12 PM, Marc Roos wrote: Afaik ceph is not supporting/working with bonding. https://www.mail-archive.com/ceph-users@lists.ceph.com/msg35474.html (thread: Maybe some

Re: [ceph-users] Fwd: FileStore vs BlueStore

2017-09-20 Thread Burkhard Linke
Hi, On 09/20/2017 11:10 AM, Sam Huracan wrote: So why does the journal not write only metadata? As I've read, it is there to ensure consistency of data, but I do not know how that works in detail. And why does BlueStore still ensure consistency without a journal? The main reason for having a journal with fil

Re: [ceph-users] Fwd: FileStore vs BlueStore

2017-09-20 Thread Burkhard Linke
Hi, On 09/20/2017 12:24 PM, Sean Purdy wrote: On Wed, 20 Sep 2017, Burkhard Linke said: The main reason for having a journal with filestore is having a block device that supports synchronous writes. Writing to a filesystem in a synchronous way (e.g. including all metadata writes) results in a

Re: [ceph-users] CephFs kernel client metadata caching

2017-10-13 Thread Burkhard Linke
Hi, On 10/13/2017 12:36 PM, Denes Dolhay wrote: Dear All, First of all, this is my first post, so please be lenient :) For the last few days I have been testing ceph, and cephfs, deploying a PoC cluster. I have been testing the cephfs kernel client caching, when I came across something

Re: [ceph-users] CephFs kernel client metadata caching

2017-10-13 Thread Burkhard Linke
Hi, On 10/13/2017 02:26 PM, Denes Dolhay wrote: Hi, Thank you for your fast response! Is there a way -that You know of- to list these locks? The only way I know of is dumping the MDS cache content. But I don't know exactly how to do it or how to analyse the content. I write to the file w

Re: [ceph-users] unusual growth in cluster after replacing journalSSDs

2017-11-16 Thread Burkhard Linke
Hi, On 11/16/2017 01:36 PM, Jogi Hofmüller wrote: Dear all, for about a month we have been experiencing something strange in our small cluster. Let me first describe what happened on the way. On Oct 4th smartmon told us that the journal SSDs in one of our two ceph nodes would fail. Since getting replac

[ceph-users] ceph luminous + multi mds: slow request. behind on trimming, failedto authpin local pins

2017-12-06 Thread Burkhard Linke
/21975 ? Regards, Burkhard Linke

[ceph-users] How to fix mon scrub errors?

2017-12-12 Thread Burkhard Linke
Hi, since the upgrade to luminous 12.2.2 the mons are complaining about scrub errors: 2017-12-13 08:49:27.169184 mon.ceph-storage-03 [ERR] scrub mismatch 2017-12-13 08:49:27.169203 mon.ceph-storage-03 [ERR] mon.0 ScrubResult(keys {logm=87,mds_health=13} crc {logm=4080463437,mds_health=221

Re: [ceph-users] Deterministic naming of LVM volumes (ceph-volume)

2017-12-13 Thread Burkhard Linke
Hi, On 12/13/2017 02:12 PM, Webert de Souza Lima wrote: Cool On Wed, Dec 13, 2017 at 11:04 AM, Stefan Kooman wrote: So, a "ceph osd ls" should give us a list, and we will pick the smallest available number as the new osd id to use. We will make a check in

Re: [ceph-users] Problems understanding 'ceph features' output

2017-12-15 Thread Burkhard Linke
Hi, On 12/15/2017 10:56 AM, Massimo Sgaravatto wrote: Hi I tried the jewel --> luminous update on a small testbed composed by: - 3 mon + mgr nodes - 3 osd nodes (4 OSDs on each of these nodes) - 3 clients (each client maps a single volume) *snipsnap* [*] "client": { "group": {

Re: [ceph-users] Unable to ceph-deploy luminos

2017-12-18 Thread Burkhard Linke
Hi, On 12/18/2017 05:28 PM, Andre Goree wrote: I'm working on setting up a cluster for testing purposes and I can't seem to install luminous. All nodes are running Ubuntu 16.04. [cephadmin][DEBUG ] Err:7 https://download.ceph.com/debian-luminos xenial/main amd64 Packages [cephadmin][DEBUG ]

[ceph-users] Slow backfilling with bluestore, ssd and metadata pools

2017-12-21 Thread Burkhard Linke
Hi, we are in the process of migrating our hosts to bluestore. Each host has 12 HDDs (6TB / 4TB) and two Intel P3700 NVME SSDs with 375 GB capacity. The new bluestore OSDs are created by ceph-volume: ceph-volume lvm create --bluestore --block.db /dev/nvmeXn1pY --data /dev/sdX1 6 OSDs sh

Re: [ceph-users] Proper way of removing osds

2017-12-21 Thread Burkhard Linke
Hi, On 12/21/2017 11:03 AM, Karun Josy wrote: Hi, This is how I remove an OSD from the cluster: * Take it out: ceph osd out osdid Wait for the balancing to finish * Mark it down: ceph osd down osdid * Then purge it: ceph osd purge osdid --yes-i-really-mean-it While purging
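For completeness, a hedged sketch of the removal sequence on a Luminous cluster; "purge" combines the crush remove, auth del and osd rm steps:

ceph osd out <id>                    # stop placing data on it, wait for rebalancing
systemctl stop ceph-osd@<id>         # stop the daemon on its host
ceph osd purge <id> --yes-i-really-mean-it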

Re: [ceph-users] Slow backfilling with bluestore, ssd and metadatapools

2017-12-21 Thread Burkhard Linke
Hi, On 12/21/2017 11:43 AM, Richard Hesketh wrote: On 21/12/17 10:28, Burkhard Linke wrote: OSD config section from ceph.conf: [osd] osd_scrub_sleep = 0.05 osd_journal_size = 10240 osd_scrub_chunk_min = 1 osd_scrub_chunk_max = 1 max_pg_per_osd_hard_ratio = 4.0 osd_max_pg_per_osd_hard_ratio

Re: [ceph-users] cephfs degraded on ceph luminous 12.2.2

2018-01-09 Thread Burkhard Linke
Hi, On 01/08/2018 05:40 PM, Alessandro De Salvo wrote: Thanks Lincoln, indeed, as I said the cluster is recovering, so there are pending ops: pgs: 21.034% pgs not active 1692310/24980804 objects degraded (6.774%) 5612149/24980804 objects misplaced (22.466%)

Re: [ceph-users] After Luminous upgrade: ceph-fuse clients failingtorespond to cache pressure

2018-01-16 Thread Burkhard Linke
Hi, On 01/16/2018 09:50 PM, Andras Pataki wrote: Dear Cephers, *snipsnap* We are running with a larger MDS cache than usual; we have mds_cache_size set to 4 million. All other MDS configs are the defaults. AFAIK the MDS cache management in luminous has changed, focusing on memory siz

Re: [ceph-users] Adding disks -> getting unfound objects [Luminous]

2018-01-23 Thread Burkhard Linke
Hi, On 01/23/2018 08:54 AM, Nico Schottelius wrote: Good morning, the osd.61 actually just crashed and the disk is still intact. However, after 8 hours of rebuilding, the unfound objects are still missing: *snipsnap* Is there any chance to recover those pgs or did we actually lose data wi

Re: [ceph-users] Importance of Stable Mon and OSD IPs

2018-01-23 Thread Burkhard Linke
Hi, On 01/23/2018 09:53 AM, Mayank Kumar wrote: Hi Ceph Experts I am a new user of Ceph and currently using Kubernetes to deploy Ceph RBD Volumes. We are doing some initial work rolling it out to internal customers, and in doing that we are using the ip of the host as the ip of the osd and m

Re: [ceph-users] Importance of Stable Mon and OSD IPs

2018-01-31 Thread Burkhard Linke
Hi, On 02/01/2018 07:21 AM, Mayank Kumar wrote: Thanks Gregory and Burkhard In kubernetes we use rbd create and rbd map/unmap commands. In this perspective are you referring to rbd as the client, or after the image is created and mapped, is there a different client running inside the kernel

Re: [ceph-users] Migration from "classless pre luminous" to"deviceclasses" CRUSH.

2018-02-01 Thread Burkhard Linke
Hi, On 02/01/2018 10:43 AM, Konstantin Shalygin wrote: Hi cephers. I have typical double root crush - for nvme pools and hdd pools created on Kraken cluster (what I mean: http://cephnotes.ksperis.com/blog/2015/02/02/crushmap-example-of-a-hierarchical-cluster-map). Now cluster upgraded to

Re: [ceph-users] Monitor won't upgrade

2018-02-15 Thread Burkhard Linke
Hi, On 02/15/2018 09:19 AM, Mark Schouten wrote: On woensdag 14 februari 2018 16:20:57 CET David Turner wrote: From the mon.0 server run `ceph --version`. If you've restarted the mon daemon and it is still showing 0.94.5, it is most likely because that is the version of the packages on that

[ceph-users] Fwd: Re: Merging CephFS data pools

2016-08-23 Thread Burkhard Linke
Missing CC to list Forwarded Message Subject:Re: [ceph-users] Merging CephFS data pools Date: Tue, 23 Aug 2016 08:59:45 +0200 From: Burkhard Linke To: Gregory Farnum Hi, On 08/22/2016 10:02 PM, Gregory Farnum wrote: On Thu, Aug 18, 2016 at 12:21

Re: [ceph-users] Recommended hardware for MDS server

2016-08-23 Thread Burkhard Linke
discussions. On Mon, 22 Aug 2016 14:47:38 +0200 Burkhard Linke wrote: Hi, we are running CephFS with about 70TB data, > 5 million files and about 100 clients. The MDS is currently colocated on a storage box with 14 OSDs (12 HDD, 2 SSD). The box has two E5-2680 v3 CPUs and 128 GB RAM. CephFS runs fine,

[ceph-users] CephFS + cache tiering in Jewel

2016-08-23 Thread Burkhard Linke
Hi, the Firefly and Hammer releases did not support transparent usage of cache tiering in CephFS. The cache tier itself had to be specified as data pool, thus preventing on-the-fly addition and removal of cache tiers. Does the same restriction also apply to Jewel? I would like to add a cache

Re: [ceph-users] phantom osd.0 in osd tree

2016-08-23 Thread Burkhard Linke
Hi, On 08/23/2016 08:19 PM, Reed Dier wrote: Trying to hunt down a mystery osd populated in the osd tree. Cluster was deployed using ceph-deploy on an admin node, originally 10.2.1 at time of deployment, but since upgraded to 10.2.2. For reference, mons and mds do not live on the osd nodes,

Re: [ceph-users] CephFS + cache tiering in Jewel

2016-08-24 Thread Burkhard Linke
Hi, On 08/24/2016 10:22 PM, Gregory Farnum wrote: On Tue, Aug 23, 2016 at 7:50 AM, Burkhard Linke wrote: Hi, the Firefly and Hammer releases did not support transparent usage of cache tiering in CephFS. The cache tier itself had to be specified as data pool, thus preventing on-the-fly

Re: [ceph-users] cephfs/ceph-fuse: mds0: Client XXX:XXX failingtorespond to capability release

2016-09-14 Thread Burkhard Linke
Hi, On 09/14/2016 12:43 PM, Dennis Kramer (DT) wrote: Hi Goncalo, Thank you. Yes, i have seen that thread, but I have no near full osds and my mds cache size is pretty high. You can use the daemon socket on the mds server to get an overview of the current cache state: ceph daemon mds.XXX
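A sketch of the admin-socket queries meant here (the MDS name is illustrative):

# cache, session and in-flight request state of the active MDS
ceph daemon mds.<name> perf dump mds
ceph daemon mds.<name> session ls
ceph daemon mds.<name> dump_ops_in_flight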

Re: [ceph-users] cephfs/ceph-fuse: mds0: Client XXX:XXXfailingtorespondto capability release

2016-09-14 Thread Burkhard Linke
Hi, My cluster is back to HEALTH_OK, the involved host has been restarted by the user. But I will debug some more on the host when I see this issue again next time. PS: For completeness, I've stated that this issue was often seen in my current Jewel environment; I meant to say that this iss

[ceph-users] CephFS: Upper limit for number of files in a directory?

2016-09-15 Thread Burkhard Linke
Hi, does CephFS impose an upper limit on the number of files in a directory? We currently have one directory with a large number of subdirectories: $ ls | wc -l 158141 Creating a new subdirectory fails: $ touch foo touch: cannot touch 'foo': No space left on device Creating files in a diffe

Re: [ceph-users] CephFS: Upper limit for number of files in adirectory?

2016-09-15 Thread Burkhard Linke
Hi, On 09/15/2016 12:00 PM, John Spray wrote: On Thu, Sep 15, 2016 at 2:20 PM, Burkhard Linke wrote: Hi, does CephFS impose an upper limit on the number of files in a directory? We currently have one directory with a large number of subdirectories: $ ls | wc -l 158141 Creating a new
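The reply is truncated, but the symptom (ENOSPC on create in one specific directory) matches the per-directory-fragment limit; the option named below is general Ceph knowledge rather than a quote from the thread, and the MDS name is illustrative:

# default is 100000 entries per directory fragment
ceph daemon mds.<name> config get mds_bal_fragment_size_max
# raising it (or enabling directory fragmentation) lifts the effective per-directory cap
ceph daemon mds.<name> config set mds_bal_fragment_size_max 500000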

Re: [ceph-users] Ceph full cluster

2016-09-26 Thread Burkhard Linke
Hi, On 09/26/2016 12:58 PM, Dmitriy Lock wrote: Hello all! I need some help with my Ceph cluster. I've installed ceph cluster with two physical servers with osd /data 40G on each. Here is ceph.conf: [global] fsid = 377174ff-f11f-48ec-ad8b-ff450d43391c mon_initial_members = vm35, vm36 mon_host

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Burkhard Linke
Hi, someone correct me if I'm wrong, but removing objects in a cache tier setup results in empty objects which act as markers for deleting the object on the backing store. I've seen the same pattern you have described in the past. As a test you can try to evict all objects from the cache
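The eviction test suggested above, as a single hedged command (pool name illustrative):

# flush dirty objects and evict everything from the cache tier
rados -p cache-pool cache-flush-evict-all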

Re: [ceph-users] ceph write performance issue

2016-09-29 Thread Burkhard Linke
map_header_cache_size=1024 -- Dr. rer. nat. Burkhard Linke Bioinformatics and Systems Biology Justus-Liebig-University Giessen 35392 Giessen, Germany P

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Burkhard Linke
Hi, On 09/29/2016 01:34 PM, Sascha Vogt wrote: *snipsnap* We have a huge amount of short lived VMs which are deleted before they are even flushed to the backing pool. Might this be the reason, that ceph doesn't handle that particular thing well? Eg. when deleting an object / RBD image which h

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Burkhard Linke
Hi, On 09/29/2016 01:46 PM, Sascha Vogt wrote: A quick follow up question: Am 29.09.2016 um 13:34 schrieb Sascha Vogt: Can you check/verify that the deleted objects are actually gone on the backing pool? How do I check that? Aka how to find out on which OSD a particular object in the cache p
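To answer the quoted question directly (object and pool names are illustrative): the placement of any object can be queried from a client, which shows its PG and the acting OSD set:

ceph osd map cache-pool rbd_data.123456789abcdef.0000000000000000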

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Burkhard Linke
Hi, On 09/29/2016 02:52 PM, Sascha Vogt wrote: Hi, Am 29.09.2016 um 13:45 schrieb Burkhard Linke: On 09/29/2016 01:34 PM, Sascha Vogt wrote: We have a huge amount of short lived VMs which are deleted before they are even flushed to the backing pool. Might this be the reason, that ceph

[ceph-users] radosgw backup / staging solutions?

2016-09-30 Thread Burkhard Linke
Hi, we are about to move from internal testing to a first production setup with our object storage based on Ceph RGW. One of the last open problems is a backup / staging solution for S3 buckets. As far as I know many of the life-cycle operations available in Amazon S3 are not implemented in

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-04 Thread Burkhard Linke
Hi, some thoughts about network and disks inline On 10/04/2016 03:43 PM, Denny Fuchs wrote: Hello, *snipsnap* * Storage NIC: 1 x Infiniband MCX314A-BCCT ** I read that the ConnectX-3 Pro is better supported than the X-4, and a bit cheaper ** Switch: 2 x Mellanox SX6012 (56Gb/s) ** Active FC

Re: [ceph-users] Merging CephFS data pools

2016-10-05 Thread Burkhard Linke
Hi, I've managed to move the data from the old pool to the new one using some shell scripts and cp/rsync. A recursive getfattr on the mount point does not reveal any file with a layout referring to the old pool. Nonetheless 486 objects are left in the pool: ... POOLS: NAME I
