Re: [ceph-users] CEPH MON Updates Live

2017-04-25 Thread Henrik Korkuc
On 17-04-24 19:38, Ashley Merrick wrote: Hey, quick question; I have tried a few Google searches but found nothing concrete. I am running KVM VMs using KRBD. If I add and remove Ceph mons, are the running VMs updated with this information, or do I need to reboot the VMs for them to be prov…
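
Running librbd/krbd clients generally learn monmap changes over their existing mon session as long as at least one of the original mons stays reachable, but ceph.conf on the hypervisors should still list the current mons so newly started VMs can find one. A quick hedged check (default paths assumed):

  # what the cluster currently reports
  $ ceph mon dump
  # what clients on this hypervisor were configured with
  $ grep mon_host /etc/ceph/ceph.conf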

[ceph-users] inconsistent of pgs due to attr_value_mismatch

2017-04-25 Thread Lomayani S. Laizer
Hello, I am having inconsistent PGs in my cluster due to attr_value_mismatch. It looks like all PGs with this error are hosting one VM with ID 3fb4c238e1f29. I am using replication of 3 with a min of 2. PG repair is not working. Any suggestions to resolve this issue? More logs ar…
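
A rough sketch of the usual Jewel-era inspection/repair flow for this kind of error (the PG id below is a placeholder):

  # which PGs are flagged inconsistent
  $ ceph health detail | grep inconsistent
  # per-object detail, including attr_value_mismatch (available from Jewel on)
  $ rados list-inconsistent-obj 2.1f --format=json-pretty
  # ask the primary OSD to repair the PG
  $ ceph pg repair 2.1f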

[ceph-users] cephfs not writeable on a few clients

2017-04-25 Thread Steininger, Herbert
Hi, I'm fairly new to CephFS; at my new job there is a CephFS cluster that I have to administer. The problem is that I can't write to the CephFS mount from some clients. When I try from the affected clients, I get this in the logfile: >Apr 24 13:14:00 cuda002 kernel: ceph: mds0 hung >Apr 24 13:14:00 cuda0…
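
A few hedged checks that often help when a single kernel client hangs against the MDS (the MDS name is a placeholder):

  # on the affected client: kernel-side ceph messages (hung requests, session resets)
  $ dmesg | grep -i ceph
  # on the MDS host: list client sessions and their state
  $ ceph daemon mds.<name> session ls
  # see whether the client's address ended up on the blacklist
  $ ceph osd blacklist ls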

[ceph-users] Large META directory within each OSD's directory

2017-04-25 Thread 许雪寒
Hi, everyone. Recently, in one of our clusters, we found that the “META” directory in each OSD’s working directory is getting extremely large, about 17GB each. Why hasn’t the OSD cleared those old osdmaps? How should I deal with this problem? Thank you☺
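
A hedged way to see how far behind osdmap trimming is on one OSD (FileStore layout and default paths assumed; the OSD id is a placeholder):

  # on-disk size of the map directory
  $ du -sh /var/lib/ceph/osd/ceph-0/current/meta
  # oldest_map vs newest_map shows how many old maps are still being kept
  $ ceph daemon osd.0 status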

Re: [ceph-users] v12.0.2 Luminous (dev) released

2017-04-25 Thread Dan van der Ster
Hi, the mons on my test Luminous cluster do not start after upgrading from 12.0.1 to 12.0.2. Here is the backtrace: 0> 2017-04-25 11:06:02.897941 7f467ddd7880 -1 *** Caught signal (Aborted) ** in thread 7f467ddd7880 thread_name:ceph-mon ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935…

Re: [ceph-users] v12.0.2 Luminous (dev) released

2017-04-25 Thread Dan van der Ster
Could this change be the culprit? commit 973829132bf7206eff6c2cf30dd0aa32fb0ce706 Author: Sage Weil Date: Fri Mar 31 09:33:19 2017 -0400 mon/OSDMonitor: spinlock -> std::mutex I think spinlock is dangerous here: we're doing semi-unbounded work (decode). Also seemingly innocuous c

Re: [ceph-users] v12.0.2 Luminous (dev) released

2017-04-25 Thread Dan van der Ster
Created ticket to follow up: http://tracker.ceph.com/issues/19769 On Tue, Apr 25, 2017 at 11:34 AM, Dan van der Ster wrote: > Could this change be the culprit? > > commit 973829132bf7206eff6c2cf30dd0aa32fb0ce706 > Author: Sage Weil > Date: Fri Mar 31 09:33:19 2017 -0400 > > mon/OSDMonito

[ceph-users] Re: cephfs not writeable on a few clients

2017-04-25 Thread Xusangdi
The working client is running in user space (probably ceph-fuse), while the non-working client is using a kernel mount. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of Steininger, Herbert Sent: 25 April 2017 16:44 To: ceph-users@lists.ceph.com Subject: [ceph-users] cephfs not writeable on a…
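
A simple way to confirm which client type a host is using (sample output only; addresses and mount points are made up):

  $ mount | grep -i ceph
  10.0.0.1:6789:/ on /cephfs type ceph (rw,relatime,...)          <- kernel client
  ceph-fuse on /cephfs type fuse.ceph-fuse (rw,nosuid,nodev,...)  <- userspace client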

[ceph-users] Is single MDS data recoverable

2017-04-25 Thread gjprabu
Hi Team, I am running a CephFS setup with a single MDS. In a single-MDS setup, if the MDS goes down, what will happen to the data? Is it advisable to run multiple MDSes? Regards, Prabu GJ

Re: [ceph-users] Ceph built from source, can't start ceph-mon

2017-04-25 Thread Joao Eduardo Luis
On 04/25/2017 03:52 AM, Henry Ngo wrote: Anyone? On Sat, Apr 22, 2017 at 12:33 PM, Henry Ngo <henry@phazr.io> wrote: I followed the install doc; however, after deploying the monitor, the doc states to start the mon using Upstart. I learned through digging around that the Up…
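
On systems without Upstart the monitor can be started directly, or via systemd where units are installed; a hedged sketch with <hostname> as the mon id:

  # run in the foreground first so errors are visible
  $ sudo ceph-mon -i <hostname> -f
  # or, if systemd units were installed with the build
  $ sudo systemctl start ceph-mon@<hostname>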

Re: [ceph-users] Is single MDS data recoverable

2017-04-25 Thread Henrik Korkuc
On 17-04-25 13:43, gjprabu wrote: Hi Team, I am running a CephFS setup with a single MDS. In a single-MDS setup, if the MDS goes down, what will happen to the data? Is it advisable to run multiple MDSes? MDS data is in the Ceph cluster itself. After an MDS failure you can start another MDS…
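
Since the MDS keeps all of its state in RADOS, adding a standby is just a matter of running a second ceph-mds daemon; a minimal sketch assuming a new node called mds2, default paths, and the Jewel-style cap string:

  $ sudo mkdir -p /var/lib/ceph/mds/ceph-mds2
  $ ceph auth get-or-create mds.mds2 mon 'allow profile mds' osd 'allow rwx' mds 'allow' \
      -o /var/lib/ceph/mds/ceph-mds2/keyring
  $ sudo systemctl start ceph-mds@mds2
  # the new daemon shows up as a standby and takes over if the active MDS fails
  $ ceph mds stat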

Re: [ceph-users] Large META directory within each OSD's directory

2017-04-25 Thread David Turner
Which version of Ceph are you running? My guess is Hammer pre-0.94.9. There is an osdmap cache bug that was introduced with Hammer and fixed in 0.94.9. The workaround is to restart all of the OSDs in your cluster. After restarting the OSDs, the cluster will start to clean up osdmaps 20 at a t…
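
If the cluster is indeed on an affected Hammer release, a cautious version of that workaround restarts one OSD at a time and waits for health to settle; a sketch only, assuming systemd-managed OSDs and run per storage host:

  for id in $(ls /var/lib/ceph/osd | sed 's/ceph-//'); do
      sudo systemctl restart ceph-osd@"$id"
      # let the cluster settle before touching the next OSD
      while ! ceph health | grep -q HEALTH_OK; do sleep 30; done
  done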

Re: [ceph-users] v12.0.2 Luminous (dev) released

2017-04-25 Thread Sage Weil
I think this commit just missed 12.0.2: commit 32b1b0476ad0d6a50d84732ce96cda6ee09f6bec Author: Sage Weil Date: Mon Apr 10 17:36:37 2017 -0400 mon/OSDMonitor: tolerate upgrade from post-kraken dev cluster If the 'creating' pgs key is missing, move on without crashing. Si

[ceph-users] best practices in connecting clients to cephfs public network

2017-04-25 Thread Ronny Aasen
Hello, I want to connect 3 servers to CephFS. The servers are normally not in the public network. Is it best practice to connect a second interface on the servers so they are directly connected to the public network, or to route between the networks via their common default gateway? The ma…
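
For the routed option the clients only need a route towards the Ceph public network; a hedged example with made-up addresses (10.10.0.0/24 as the public network, 192.168.1.1 as the shared gateway):

  # on each CephFS client
  $ sudo ip route add 10.10.0.0/24 via 192.168.1.1
  # confirm a monitor is reachable over the routed path
  $ ping -c1 10.10.0.11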

Re: [ceph-users] best practices in connecting clients to cephfs public network

2017-04-25 Thread David Turner
In the past, I've made the "public" network another VLAN that only includes the servers that need to talk to the storage back end. That way you don't open it up to anything that doesn't need it, and if a server that should only be on restricted VLANs needs to talk on it, then you sati…

Re: [ceph-users] inconsistent of pgs due to attr_value_mismatch

2017-04-25 Thread Lomayani S. Laizer
Hello, I managed to resolve the issue. OSD 21 had corrupted data. I removed it from the cluster and formatted the hard drive, then re-added it to the cluster. After the backfill finished I ran repair again and that fixed the problem. -- Lomayani On Tue, Apr 25, 2017 at 11:42 AM, Lomayani S. Laizer wrote: > Hello, > Am…
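
For reference, the usual Jewel-era sequence for pulling a bad OSD and re-adding it on a fresh filesystem looks roughly like this (a sketch; the device is a placeholder and data is drained before the daemon is stopped):

  $ ceph osd out 21                     # let data drain off the OSD
  # ... wait for backfill to finish ...
  $ sudo systemctl stop ceph-osd@21
  $ ceph osd crush remove osd.21
  $ ceph auth del osd.21
  $ ceph osd rm 21
  # wipe and re-create the OSD on the same drive
  $ sudo ceph-disk zap /dev/sdX
  $ sudo ceph-disk prepare /dev/sdX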

Re: [ceph-users] RGW 10.2.5->10.2.7 authentication fail?

2017-04-25 Thread Radoslaw Zarzynski
Hello Ben, Could you provide the full RadosGW log for the failed request? I mean the lines starting from the header listing, through the start marker ("== starting new request...") to the end marker? At the moment we can't see any details related to the signature calculation. Regards, Radek On…
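
Capturing that level of detail usually means raising RGW debug logging around the failing request; a hedged sketch (the gateway's client id is a placeholder):

  # at runtime, if the admin socket is available
  $ ceph daemon client.rgw.<name> config set debug_rgw 20
  $ ceph daemon client.rgw.<name> config set debug_ms 1
  # or persistently in ceph.conf under [client.rgw.<name>]:
  #   debug rgw = 20
  #   debug ms = 1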

Re: [ceph-users] Sharing SSD journals and SSD drive choice

2017-04-25 Thread David
On 19 Apr 2017 18:01, "Adam Carheden" wrote: Does anyone know if XFS uses a single thread to write to its journal? You probably know this but just to avoid any confusion, the journal in this context isn't the metadata journaling in XFS, it's a separate journal written to by the OSD daemons. I…
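
To see which device a FileStore OSD's journal actually lives on (and therefore which device absorbs the journal writes), something like this works with the default paths (OSD id is a placeholder):

  # the journal is a symlink (or plain file) inside the OSD data directory
  $ ls -l /var/lib/ceph/osd/ceph-0/journal
  # resolve the partition it points at
  $ readlink -f /var/lib/ceph/osd/ceph-0/journal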

[ceph-users] ceph packages on stretch from eu.ceph.com

2017-04-25 Thread Ronny Aasen
Hello, I am trying to install Ceph on Debian stretch from http://eu.ceph.com/debian-jewel/dists/ but there is no stretch repo there. Now with stretch being frozen, it is a good time to be testing Ceph on stretch. Is it possible to get packages for stretch on jewel, kraken, and luminous? k…
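
Until stretch builds are published, the apt line has to point at a release that does exist under debian-jewel; a hedged example assuming the jessie packages are reused on stretch:

  # /etc/apt/sources.list.d/ceph.list
  deb http://eu.ceph.com/debian-jewel/ jessie main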

Re: [ceph-users] Sharing SSD journals and SSD drive choice

2017-04-25 Thread Adam Carheden
On 04/25/2017 11:57 AM, David wrote: > On 19 Apr 2017 18:01, "Adam Carheden" > wrote: > > Does anyone know if XFS uses a single thread to write to its journal? > > > You probably know this but just to avoid any confusion, the journal in > this context isn't the me…

[ceph-users] Deepscrub IO impact on Jewel: What is osd_op_queue prio implementation?

2017-04-25 Thread Martin Millnert
Hi, we are experiencing significant impact from deep scrubs on Jewel and have started investigating op priorities. We use default values for the related/relevant OSD priority settings. "osd op queue" on http://docs.ceph.com/docs/master/rados/configuration/osd-config-ref/#operations states: "The normal queue is diff…
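
The scheduler actually in use and the priority knobs feeding it can be read off a running OSD; a quick hedged check (the OSD id is a placeholder):

  # which op queue implementation this OSD was started with (e.g. prio or wpq)
  $ ceph daemon osd.0 config get osd_op_queue
  # the priority settings that feed it
  $ ceph daemon osd.0 config show | grep -E 'osd_client_op_priority|scrub_priority'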

Re: [ceph-users] Deepscrub IO impact on Jewel: What is osd_op_queue prio implementation?

2017-04-25 Thread Gregory Farnum
On Tue, Apr 25, 2017 at 3:04 PM, Martin Millnert wrote: > Hi, > > experiencing significant impact from deep scrubs on Jewel. > Started investigating OP priorities. We use default values on > related/relevant OSD priority settings. > > "osd op queue" on > http://docs.ceph.com/docs/master/rados/conf

Re: [ceph-users] Deepscrub IO impact on Jewel: What is osd_op_queue prio implementation?

2017-04-25 Thread Martin Millnert
On Tue, Apr 25, 2017 at 03:39:42PM -0400, Gregory Farnum wrote: > > I'd like to understand if "prio" in Jewel is as explained, i.e. > > something similar to the following pseudo code: > > > > if len(subqueue) > 0: > > dequeue(subqueue) > > if tokens(global) > some_cost: > > for queue in

[ceph-users] Adding New OSD Problem

2017-04-25 Thread Ramazan Terzi
Hello, I have a Ceph cluster with the specifications below: 3 x monitor node; 6 x storage node (6 disks per storage node, 6TB SATA disks, all disks have SSD journals); distributed public and private networks; all NICs are 10 Gbit/s; osd pool default size = 3; osd pool default min size = 2. The Ceph version is…

Re: [ceph-users] Ceph built from source gives Rados import error

2017-04-25 Thread Henry Ngo
Mine is at /usr/local/lib/x86_64-linux-gnu/librados.so.2. I moved the libraries to /usr/lib/x86_64-linux-gnu, however I'm still getting the error when running ceph -v: $ ceph -v Traceback (most recent call last): File "/usr/local/bin/ceph", line 106, in <module> import rados ImportError: No module…
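
With a source install into /usr/local, the Python binding typically also lands outside the default search path; a hedged check (the site-packages path is an assumption for this particular build):

  # make the runtime linker aware of the locally installed librados
  $ echo /usr/local/lib/x86_64-linux-gnu | sudo tee /etc/ld.so.conf.d/ceph-local.conf && sudo ldconfig
  # point Python at the installed bindings
  $ PYTHONPATH=/usr/local/lib/python2.7/site-packages python -c 'import rados; print(rados.__file__)'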

Re: [ceph-users] Adding New OSD Problem

2017-04-25 Thread Reed Dier
Others will likely be able to provide some better responses, but I’ll take a shot to see if anything makes sense. With 10.2.6 you should be able to set 'osd scrub during recovery’ to false to prevent any new scrubs from occurring during a recovery event. Current scrubs will complete, but future
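
A hedged sketch of those knobs applied at runtime:

  # stop new scrubs from starting while recovery/backfill is in progress
  $ ceph tell osd.* injectargs '--osd_scrub_during_recovery=false'
  # or, more bluntly, pause all scrubbing until the cluster has recovered
  $ ceph osd set noscrub
  $ ceph osd set nodeep-scrub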

[ceph-users] Race Condition(?) in CephFS

2017-04-25 Thread Adam Tygart
cat.ksu.edu/~mozes/ceph-20170425/ # uname -a Linux eunomia 3.10.0-514.16.1.el7.x86_64 #1 SMP Wed Apr 12 15:04:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux We're currently running Ceph Jewel (10.2.5). We're looking to update soon, but we wanted a clean backup of everything in CephFS first.

Re: [ceph-users] Race Condition(?) in CephFS

2017-04-25 Thread Patrick Donnelly
Hello Adam, On Tue, Apr 25, 2017 at 5:32 PM, Adam Tygart wrote: > I'm using CephFS, on CentOS 7. We're currently migrating away from > using a catch-all cephx key to mount the filesystem (with the kernel > module), to a much more restricted key. > > In my tests, I've come across an issue, extract
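
For context, a path-restricted key of the kind being migrated to looks roughly like this on Jewel (client name, path, and pool are placeholders):

  $ ceph auth get-or-create client.restricted \
      mon 'allow r' \
      mds 'allow rw path=/homes' \
      osd 'allow rw pool=cephfs_data'
  # kernel mount using that key
  $ sudo mount -t ceph mon1:6789:/homes /mnt/homes \
      -o name=restricted,secretfile=/etc/ceph/restricted.secret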

Re: [ceph-users] rbd kernel client fencing

2017-04-25 Thread Kjetil Jørgensen
Hi, On Wed, Apr 19, 2017 at 9:08 PM, Chaofan Yu wrote: > Thank you so much. > > The blacklist entries are stored in osd map, which is supposed to be tiny and > clean. > So we are doing similar cleanups after reboot. In the face of churn - this won't necessarily matter as I believe there's some
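
The blacklist can be inspected and trimmed by hand, which is usually what such post-reboot cleanups amount to; a short sketch (the address/nonce is a placeholder):

  # entries carry an expiry time and drop out of the osdmap on their own
  $ ceph osd blacklist ls
  # remove a stale entry explicitly once the host is known to be back and clean
  $ ceph osd blacklist rm 10.0.0.5:0/3710147553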