We have a single VM that is acting odd. We had 7 SSD OSDs (out of 40) go
down over a period of about 12 hours. These are a cache tier and have size
4, min_size 2. I'm not able to make heads or tails of the error and hoped
someone here could help.
2016-01-14 23:09:54.559121 osd.136 [ERR] 13.503 cop
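For context, the pool settings described above can be double-checked with the stock CLI; a minimal sketch, assuming the cache-tier pool is named "hotpool" (the name is a placeholder):

  ceph osd pool get hotpool size        # should report size: 4
  ceph osd pool get hotpool min_size    # should report min_size: 2
  ceph health detail                    # lists the PGs and OSDs behind the [ERR] lines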
It looks like the gateway is experiencing a similar race condition to
what we reported before.
The rados object has a size of 0 bytes, but the bucket index still lists the
object and the object metadata shows a size of
7147520 bytes.
I have a lot of logs but I don't think any of them have the
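One rough way to compare RGW's view of such an object with what RADOS actually stores is sketched below; the bucket, object, pool, and prefixed rados object names are placeholders, not taken from the report above:

  radosgw-admin object stat --bucket=mybucket --object=myobject   # size per bucket index / metadata
  rados -p .rgw.buckets stat default.12345.1_myobject             # size of the backing rados object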
Hey ceph-users,
I wanted to follow up: Zheng's patch did the trick. We re-added the removed
mds, and it all came back. We're syncing our data off to a backup server.
Thanks for all of the help, Ceph has a great community to work with!
Mike C
On Thu, Jan 14, 2016 at 4:46 PM, Yan, Zheng wrote:
On 12/01/16 01:22, Stillwell, Bryan wrote:
>> Well, it seems I spoke too soon. Not sure what logic the udev rules use
>> to identify ceph journals, but it doesn't seem to pick up on the
>> journals in our case, as after a reboot those partitions are owned by
>> root:disk with permissions 0660.
I am not sure why this is happening. Someone used s3cmd to upload around
130,000 7 MB objects to a single bucket. Now we are tearing down the
cluster to rebuild it better, stronger, and hopefully faster. Before we
destroy it we need to download all of the data. I am running through all
of the key
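Not from the thread, but a sketch of one way to mirror a whole bucket to local disk with s3cmd before tearing the cluster down; the bucket name and destination directory are placeholders:

  s3cmd ls s3://mybucket | wc -l                                # sanity-check the object count
  s3cmd sync s3://mybucket/ ./mybucket-backup/ --skip-existing  # resumable bulk download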
Do I apply this against the v9.2.0 git tag?
On Thu, Jan 14, 2016 at 4:48 PM, Dyweni - Ceph-Users <
6exbab4fy...@dyweni.com> wrote:
> Your patch lists the command as "addfailed" but the email lists the
> command as "add failed". (Note the space).
>
> On 2016-01-14 18:46, Yan, Zheng wrote:
Your patch lists the command as "addfailed" but the email lists the
command as "add failed". (Note the space).
On 2016-01-14 18:46, Yan, Zheng wrote:
Here is a patch for v9.2.0. After installing the modified version of
ceph-mon, run “ceph mds add failed 1”
On Jan 15, 2016, at 08:20, Mike
Here is a patch for v9.2.0. After installing the modified version of ceph-mon, run
“ceph mds add failed 1”
mds_addfailed.patch
Description: Binary data
> On Jan 15, 2016, at 08:20, Mike Carlson wrote:
>
> okay, that sounds really good.
>
> Would it help if you had access to our cluster?
>
>
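For anyone following along later, a rough sketch of applying the attached mds_addfailed.patch against the v9.2.0 tag and rebuilding; the build steps are assumptions based on the autotools build of that release, not instructions from the thread:

  git clone https://github.com/ceph/ceph.git && cd ceph
  git checkout v9.2.0
  git submodule update --init --recursive
  patch -p1 < /path/to/mds_addfailed.patch
  ./install-deps.sh && ./autogen.sh && ./configure && make -j"$(nproc)"
  # swap in the freshly built ceph-mon binary, restart the mon, then run:
  ceph mds add failed 1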
On Fri, Jan 15, 2016 at 12:23 AM, Sage Weil wrote:
> On Fri, 15 Jan 2016, Yan, Zheng wrote:
>> > On Jan 15, 2016, at 08:16, Mike Carlson wrote:
>> >
>> > Did I just lose all of my data?
>> >
>> > If we were able to export the journal, could we create a brand new mds out
>> > of that and retrieve our data?
On Fri, 15 Jan 2016, Yan, Zheng wrote:
> > On Jan 15, 2016, at 08:16, Mike Carlson wrote:
> >
> > Did I just lose all of my data?
> >
> > If we were able to export the journal, could we create a brand new mds out
> > of that and retrieve our data?
>
> No. It’s easy to fix, but you need to re-compile ceph-mon from source code.
okay, that sounds really good.
Would it help if you had access to our cluster?
On Thu, Jan 14, 2016 at 4:19 PM, Yan, Zheng wrote:
>
> > On Jan 15, 2016, at 08:16, Mike Carlson wrote:
> >
> > Did I just lose all of my data?
> >
> > If we were able to export the journal, could we create a brand
> On Jan 15, 2016, at 08:16, Mike Carlson wrote:
>
> Did I just lose all of my data?
>
> If we were able to export the journal, could we create a brand new mds out of
> that and retrieve our data?
No. It’s easy to fix, but you need to re-compile ceph-mon from source code.
I’m writing the patch.
Did I just lose all of my data?
If we were able to export the journal, could we create a brand new mds out
of that and retrieve our data?
On Thu, Jan 14, 2016 at 4:15 PM, Yan, Zheng wrote:
>
> > On Jan 15, 2016, at 08:01, Gregory Farnum wrote:
> >
> > On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote:
> On Jan 15, 2016, at 08:01, Gregory Farnum wrote:
>
> On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote:
>> Hey Zheng,
>>
>> I've been in the #ceph irc channel all day about this.
>>
>> We did that; we set max_mds back to 1, but instead of stopping mds 1, we
>> did a "ceph mds rmfailed 1"
> On Jan 15, 2016, at 08:01, Gregory Farnum wrote:
>
> On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote:
>> Hey Zheng,
>>
>> I've been in the #ceph irc channel all day about this.
>>
>> We did that; we set max_mds back to 1, but instead of stopping mds 1, we
>> did a "ceph mds rmfailed 1"
On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote:
> Hey Zheng,
>
> I've been in the #ceph irc channel all day about this.
>
> We did that; we set max_mds back to 1, but instead of stopping mds 1, we
> did a "ceph mds rmfailed 1". Running ceph mds stop 1 produces:
>
> # ceph mds stop 1
> Error
Hey Zheng,
I've been in the #ceph irc channel all day about this.
We did that; we set max_mds back to 1, but instead of stopping mds 1, we
did a "ceph mds rmfailed 1". Running ceph mds stop 1 produces:
# ceph mds stop 1
Error EEXIST: mds.1 not active (???)
Our mds is in a state of resolve, and w
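For the archives: the rollback being described is, roughly, to shrink back to one active MDS and then deactivate rank 1 rather than rmfailed it. A sketch using the command names of that era (verify against your release):

  ceph mds set_max_mds 1    # target a single active MDS again
  ceph mds stop 1           # deactivate rank 1 (not "ceph mds rmfailed 1")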
On Fri, Jan 15, 2016 at 3:28 AM, Mike Carlson wrote:
> Thank you for the reply Zheng
>
> We tried setting mds bal frag to true, but the end result was less than
> desirable. All NFS and SMB clients could no longer browse the share; they
> would hang on a directory with anything more than a few hundred
rbd-nbd uses librbd directly -- it runs as a user-space daemon process and
interacts with the kernel NBD commands via a UNIX socket. As a result, it
supports all image features supported by librbd. You can use the rbd CLI to
map/unmap RBD-based NBDs [1] similar to how you map/unmap images via
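A minimal usage sketch, assuming an image named rbd/myimage (the exact CLI integration may differ between releases):

  rbd-nbd map rbd/myimage    # attaches the image and prints the device, e.g. /dev/nbd0
  rbd-nbd list-mapped        # show current image-to-device mappings
  rbd-nbd unmap /dev/nbd0    # detach the device when finished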
This month’s Ceph Advisory Board meeting notes have been added to the Ceph wiki:
wiki.ceph.com/Ceph_Advisory_Board
Please let me know if you have any questions or concerns. Thanks.
--
Best Regards,
Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com || http://community.redha
Hey cephers,
It has been quite a while since I distilled the highlights of what is
going on in the community into a single post, so I figured it was long
overdue. Please check out the latest Ceph.com blog and some of the
many great things that are on our short-term radar at the moment:
http://cep
Does this support rbd images with stripe count > 1?
If yes, then this is also a solution for this problem:
http://tracker.ceph.com/issues/3837
Thanks,
Dyweni
On 2016-01-14 13:27, Bill Sanders wrote:
Is there some information about rbd-nbd somewhere? If it has feature
parity with librbd
Thank you for the reply Zheng
We tried setting mds bal frag to true, but the end result was less than
desirable. All NFS and SMB clients could no longer browse the share; they
would hang on a directory with anything more than a few hundred files.
We then tried to back out the active/active mds change
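For reference, the option being discussed is an MDS-side ceph.conf setting; a minimal sketch, assuming it is set through the config file on the MDS hosts:

  [mds]
      mds bal frag = true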
Is there some information about rbd-nbd somewhere? If it has feature
parity with librbd and is easier to maintain, will this eventually
deprecate krbd? We're using the RBD kernel client right now, and so
this looks like something we might want to explore at my employer.
Bill
On Thu, Jan 14, 201
Probably worth filing a bug. Make sure to include the usual stuff:
1) version
2) logs from a crashing osd
For this one, it would also be handy if you used gdb to dump the
thread backtraces for an osd which is experiencing "an increase of
approximately 230-260 threads for every other OSD node"
-Sa
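A sketch of grabbing those backtraces, assuming gdb and debug symbols are installed; the OSD id and pid are placeholders:

  # find the OSD's pid first, e.g. with "ps aux | grep 'ceph-osd -i 12'", then:
  gdb -p <pid> --batch -ex 'thread apply all bt' > osd.12-threads.txt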
We went to 3 copies because 2 isn't safe enough for the default. With 3
copies and a properly configured system your data is approximately as safe
as the data center it's in. With 2 copies the durability is a lot lower
than that (two 9s versus four 9s or something). The actual safety numbers
did no
There's not a great unified tracking solution, but newer MDS code has admin
socket commands to dump client sessions. Look for those.
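On MDS versions that have them, that would look roughly like the admin socket call below; the daemon name is a placeholder and the exact command name may vary by release:

  ceph daemon mds.a session ls    # dump the client sessions this MDS currently holds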
This question is good for the user list, but if you can't send mail to the dev
list you're probably using HTML email or something. vger.kernel.org has
some pretty strict
Try using "id=client.my_user". It's not taking daemonize arguments because
auto-mount in fstab requires the use of CLI arguments (of which daemonize
isn't a member), IIRC.
-Greg
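A sketch of the corresponding fstab line, using the id suggested above; the mount point is a placeholder and the exact option syntax may vary by version:

  id=client.my_user  /mnt/cephfs  fuse.ceph  defaults,_netdev  0 0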
On Wednesday, January 6, 2016, Florent B wrote:
> Hi everyone,
>
> I have a problem with ceph-fuse on Debian Jessie.
>
It sounds like you *didn't* change the fsid for the existing osd/mon
daemons, since you say they're getting refused. So I think you created a new
"cluster" of just the one monitor, and your client is choosing to connect
to it first. If that's the case, killing that monitor and creating it
properly will
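A quick way to compare the fsid each daemon is actually running with against what the client's ceph.conf expects; daemon names are placeholders, and each command is run on the host where that daemon lives:

  grep fsid /etc/ceph/ceph.conf                            # what the config file says
  ceph daemon mon.$(hostname -s) config show | grep fsid   # what the local mon is running with
  ceph daemon osd.0 config show | grep fsid                # compare with an existing OSD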
On Thu, Jan 14, 2016 at 7:37 AM, Sage Weil wrote:
> This development release includes a raft of changes and improvements for
> Jewel. Key additions include CephFS scrub/repair improvements, an AIX and
> Solaris port of librados, many librbd journaling additions and fixes,
> extended per-pool optio
Thank you very much, Jason!
I've updated the ticket with new data, but I'm not sure if I attached
logs correctly. Please let me know if anything more is needed.
2016-01-14 23:29 GMT+08:00 Jason Dillaman :
> I would need to see the log from the point where you've frozen the disks
> until the poin
This development release includes a raft of changes and improvements for
Jewel. Key additions include CephFS scrub/repair improvements, an AIX and
Solaris port of librados, many librbd journaling additions and fixes,
extended per-pool options, and an NBD driver for RBD (rbd-nbd) that allows
librbd
I would need to see the log from the point where you've frozen the disks until
the point when you attempt to create a snapshot. The logs below just show
normal IO.
I've opened a new ticket [1] where you can attach the logs.
[1] http://tracker.ceph.com/issues/14373
--
Jason Dillaman
-
On Thu, Jan 14, 2016 at 12:50 AM, Kostis Fardelas wrote:
> Hello cephers,
> after being on 0.80.10 for a while, we upgraded to 0.80.11 and we
> noticed the following things:
> a. ~13% paxos refresh latency increase (from about 0.015 to 0.017 on average)
> b. ~15% paxos commit latency increase (from 0.019
2016-01-14 11:25 GMT+02:00 Magnus Hagdorn :
> On 13/01/16 13:32, Andy Allan wrote:
>
>> On 13 January 2016 at 12:26, Magnus Hagdorn
>> wrote:
>>
>>> Hi there,
>>> we recently had a problem with two OSDs failing because of I/O errors of
>>> the
>>> underlying disks. We run a small ceph cluster wit
On 13/01/16 13:32, Andy Allan wrote:
On 13 January 2016 at 12:26, Magnus Hagdorn wrote:
Hi there,
we recently had a problem with two OSDs failing because of I/O errors of the
underlying disks. We run a small ceph cluster with 3 nodes and 18 OSDs in
total. All 3 nodes are Dell PowerEdge R515 ser
Hello cephers,
after being on 0.80.10 for a while, we upgraded to 0.80.11 and we
noticed the following things:
a. ~13% paxos refresh latency increase (from about 0.015 to 0.017 on average)
b. ~15% paxos commit latency increase (from 0.019 to 0.022 on average)
c. osd commitcycle latencies were decreased and
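For anyone wanting to pull the same numbers: the paxos latencies are exposed as monitor perf counters; each latency is an avgcount/sum pair, so the average is sum divided by avgcount. A rough sketch, with the mon id as a placeholder and counter names assumed from firefly-era builds:

  ceph daemon mon.a perf dump | python -m json.tool | less
  # look under "paxos" for refresh_latency and commit_latency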