Re: [ceph-users] rbd: add failed: (34) Numerical result out of range

2014-06-10 Thread Chris
> > When I used CentOS osds, and tried to rbd map from arch linux or fedora, > > I would get "rbd: add failed: (34) Numerical result out of range". It > > seemed to happen when the tool was writing to /sys/bus/rbd/add_single_major. > > If I rebuild the osds using fedora (20 in this case), everythi

[ceph-users] PG Scrub Error / active+clean+inconsistent

2014-06-10 Thread Christian Eichelmann
Hi all, after coming back from a long weekend, I found my production cluster in an error state, reporting 6 scrub errors and 6 PGs in active+clean+inconsistent state. Strangely, my pre-live cluster, running on different hardware, is also showing 1 scrub error and 1 inconsistent PG... pg d

[ceph-users] Problem installing ceph from package manager / ceph repositories

2014-06-10 Thread Karan Singh
Hello Cephers, first of all, this problem is not related to ceph-deploy; ceph-deploy 1.5.4 works like a charm :-), thanks to Alfredo. Problem: 1. When installing Ceph using the package manager (# yum install ceph or # yum update ceph) that uses the Ceph repositories (ceph.repo), the package manage

Re: [ceph-users] Problem installing ceph from package manager / ceph repositories

2014-06-10 Thread Dan Van Der Ster
Hi, On 10 Jun 2014, at 10:30, Karan Singh <karan.si...@csc.fi> wrote: Hello Cephers, first of all, this problem is not related to ceph-deploy; ceph-deploy 1.5.4 works like a charm :-), thanks to Alfredo. Problem: 1. When installing Ceph using the package manager (# yum install ceph or #

Re: [ceph-users] PG Scrub Error / active+clean+inconsistent

2014-06-10 Thread Christian Eichelmann
Hi again, just found the ceph pg repair command :) Now both clusters are OK again. Anyway, I'm really interested in the cause of the problem. Regards, Christian On 10.06.2014 10:28, Christian Eichelmann wrote: > Hi all, > > after coming back from a long weekend, I found my production cluster
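For readers following this thread, the repair step mentioned above can be sketched roughly as follows. This is only an illustration: the PG id `2.5` is a made-up placeholder, the function name `repair_pg_cmds` is hypothetical, and the helper prints the commands instead of running them so they can be reviewed first. Repairing a PG does not explain why the scrub errors appeared, which is the open question in the thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the commands one would
# typically run to locate and repair an inconsistent placement group.
# Nothing is executed against a cluster; the output is meant for review.
repair_pg_cmds() {
    pgid="$1"                       # placeholder PG id, e.g. 2.5
    echo "ceph health detail"       # shows which PGs are flagged inconsistent
    echo "ceph pg repair $pgid"     # asks Ceph to repair that PG
}

repair_pg_cmds 2.5
```

Printing rather than executing keeps the sketch safe to run on a machine without a cluster; the printed lines can be pasted into a shell once the PG id has been confirmed.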

[ceph-users] Fwd: CEPH Multitenancy and Data Isolation

2014-06-10 Thread Vilobh Meshram
How does CEPH guarantee data isolation for volumes which are not meant to be shared in an OpenStack tenant? When used with OpenStack the data isolation is provided at the OpenStack level so that all users who are part of the same tenant will be able to access/share the volumes created by users in

Re: [ceph-users] Cannot attach volumes

2014-06-10 Thread Karan Singh
Hi Kumar “Clock skew” is just a warning and should not be related to this problem. But it's pretty easy to fix this warning either by setting up NTP on all Ceph cluster nodes or by adding mon clock drift warn backoff = into ceph.conf (do not do this in production) WRT your second problem , t

Re: [ceph-users] Minimal io block in rbd

2014-06-10 Thread Alexandre DERUMIER
>> For every 1 KB read, will rbd read 4 MB from the HDD? And for writes? RADOS supports partial reads and writes. Note that with erasure coding, a write needs to rewrite the full object (so 4 MB). I think that with a key-value store backend (like leveldb), reads and writes are full-object too. Some interesting notes here: http://eu

Re: [ceph-users] Fail to Block Devices and OpenStack

2014-06-10 Thread Karan Singh
Hi Yamashita, first try to create a Cinder volume, which should be stored on the Ceph backend, then proceed with a Glance image on Ceph. I assume you have done all the steps correctly as mentioned in the Ceph documentation for Ceph and OpenStack integration; if you are still not able to create volumes

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-10 Thread Dan Van Der Ster
Hi, I’m just starting to get interested in this topic, since today we’ve found that a weekly peak in latency correlates with a batch (~30) of deep-scrubbing PGs. One idea I had was to check the behaviour under different disk io schedulers, trying to exploit thread io priorities with cfq. So I have a

Re: [ceph-users] failed assertion on AuthMonitor

2014-06-10 Thread Mohammad Salehe
Hi Greg, Thanks for your suggestion and information. I've installed the cluster over again. I just wanted to investigate a little more based on your information. I can see that auth/paxos values in monitor K/V store are these: 'authfirst_commited': 251 'authlast_commited': 329 and I have all the

Re: [ceph-users] Problem installing ceph from package manager / ceph repositories

2014-06-10 Thread Alfredo Deza
On Tue, Jun 10, 2014 at 4:34 AM, Dan Van Der Ster wrote: > Hi, > > On 10 Jun 2014, at 10:30, Karan Singh wrote: > > Hello Cephers > > First of all this problem is not related to ceph-deploy , ceph-deploy 1.5.4 > works like a charm :-) , thanks to Alfredo > > Problem : > > 1. When installing Ceph u

Re: [ceph-users] Problem installing ceph from package manager / ceph repositories

2014-06-10 Thread Karan Singh
Thanks Dan / Alfredo Yep priority=1 helped a little bit and yum-plugin-priority fixed all. - karan - On 10 Jun 2014, at 15:22, Alfredo Deza wrote: > On Tue, Jun 10, 2014 at 4:34 AM, Dan Van Der Ster > wrote: >> Hi, >> >> On 10 Jun 2014, at 10:30, Karan Singh wrote: >> >> Hello Cephers >>

[ceph-users] Ceph performance profiling with perf

2014-06-10 Thread Mark Nelson
Hi All, For those of you that are interested in performance data, Brendan Gregg wrote a really useful cheat sheet for perf that can give you some idea of the things it can do beyond just profiling and performance counters. The static and dynamic tracing capabilities are especially interesting

[ceph-users] some PGs degraded after upgrading size to 3

2014-06-10 Thread Marc
Hi, I'm running a Ceph cluster on version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74). Here's roughly what happened: cluster has been created, pg and pgp num have been increased step by step to (now) 1024 to have a fitting number for the amount of OSDs and size. Then (days later) the size o

Re: [ceph-users] PG Scrub Error / active+clean+inconsistent

2014-06-10 Thread Peter Howell
Hi I am interested in this too! We have two Ceph arrays too and we have found that we get these same errors about once a fortnight on each array. One array is in a colo site and the other in the office. Both arrays have 3 replications, version 0.80.1. The osd's are running on zfs which sh

[ceph-users] Updating Ubuntu/ceph from 12.04/cuttlefish to 13.04/firefly

2014-06-10 Thread Davide Fanciola
Hello, we have a Ceph deployment still running on Ubuntu 12.04 with Ceph 0.61. The docs suggest that we need to upgrade Ceph to 0.67/Dumpling before upgrading to later versions. But the command "apt-cache show ceph | grep Version" gives the following output : On Ubuntu 13.04/Raring: Ver

Re: [ceph-users] Updating Ubuntu/ceph from 12.04/cuttlefish to 13.04/firefly

2014-06-10 Thread Davide Fanciola
Sorry, I think I found the answer just by scrolling to the RIGHT chapter of the "Upgrading Ceph" guide! Should I mention that the version naming really confuses me? :) BR Davide On Tue, Jun 10, 2014 at 5:09 PM, Davide Fanciola wrote: > Hello, > > we have a Ceph deployment still running on Ubuntu 1

[ceph-users] FAILED assert(_size >= 0) during recovery - need to understand what's going on

2014-06-10 Thread Christian Kauhaus
Hi list, over the last few days our Ceph cluster did not handle a recovery correctly. Several VM images on RBD have been corrupted. I'm currently trying to understand what has happened and how to avoid such problems in the future. I'll describe the course of events. 1. We are running 32 OSDs on 4

[ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-10 Thread Sebastien Han
Hi all, A couple of years ago, I heard that it wasn’t safe to map a krbd block device on an OSD host. It was more or less like mounting an NFS export on the NFS server itself: we can potentially end up with some deadlocks. At least, I tried again recently and didn’t encounter any problem. What do you think?

Re: [ceph-users] question about feature set mismatch

2014-06-10 Thread Sebastien Han
FYI I encountered the same problem for krbd, removing the ec pool didn’t solve my problem. I’m running 3.13 Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008

Re: [ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-10 Thread Jean-Charles LOPEZ
Hi Sébastien, still the case. Depending on what you do, the OSD process can hang and will then suicide. Regards JC On Jun 10, 2014, at 09:46, Sebastien Han wrote: > Hi all, > > A couple of years ago, I heard that it wasn’t safe to map a krbd block on an > OSD host. > It was more or les

Re: [ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-10 Thread John Wilkins
Sebastian, It's actually not an issue with Ceph, but with the Linux kernel itself. If you want to do this and avoid a deadlock, just use a VM on the same host to mount the block device. Regards, John On Tue, Jun 10, 2014 at 9:51 AM, Jean-Charles LOPEZ wrote: > Hi Sébastien, > > still the ca

Re: [ceph-users] Ceph networks, to bond or not to bond?

2014-06-10 Thread Sven Budde
Hi Josef, all, it’s never too late to join a party ;) The cheap switches don’t support mlag either. I did some testing today with the balance-alb mode which works fine so far in this setup. I’m able to have my links placed redundantly on both switches; utilize up to 2 Gb/s when talking to

[ceph-users] I have PGs that I can't deep-scrub

2014-06-10 Thread Craig Lewis
Every time I deep-scrub one PG, all of the OSDs responsible get kicked out of the cluster. I've deep-scrubbed this PG 4 times now, and it fails the same way every time. OSD logs are linked at the bottom. What can I do to get this deep-scrub to complete cleanly? This is the first time I've deep-

Re: [ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-10 Thread Sebastien Han
Thanks for your answers :) Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 10 Jun 2014, at 20:49, Jo

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-10 Thread Craig Lewis
After doing this, I've found that I'm having problems with a few specific PGs. If I set nodeep-scrub, then manually deep-scrub one specific PG, the responsible OSDs get kicked out. I'm starting a new discussion, subject: "I have PGs that I can't deep-scrub" I'll re-test this correlation after I
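The sequence described in this message (set nodeep-scrub, then deep-scrub one PG by hand) can be sketched as below. The PG id `11.2f` is a made-up placeholder and the function name `deep_scrub_one_pg_cmds` is hypothetical; the helper only prints the commands, so it runs safely without a cluster. That a manual deep-scrub still proceeds while the flag is set is taken from the message above, not verified independently here.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the command sequence for
# deep-scrubbing a single PG while scheduled deep-scrubs are paused.
deep_scrub_one_pg_cmds() {
    pgid="$1"                              # placeholder PG id, e.g. 11.2f
    echo "ceph osd set nodeep-scrub"       # pause scheduled deep-scrubs cluster-wide
    echo "ceph pg deep-scrub $pgid"        # request a deep-scrub of this one PG
    echo "ceph osd unset nodeep-scrub"     # re-enable once the manual scrub has finished
}

deep_scrub_one_pg_cmds 11.2f
```

The scrub request returns before the scrub itself completes, so the last command should only be run after the PG has finished scrubbing.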

Re: [ceph-users] PG Selection Criteria for Deep-Scrub

2014-06-10 Thread Gregory Farnum
Hey Mike, has your manual scheduling resolved this? I think I saw another similar-sounding report, so a feature request to improve scrub scheduling would be welcome. :) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, May 20, 2014 at 5:46 PM, Mike Dawson wrote: > I tend

Re: [ceph-users] How to implement a rados plugin to encode/decode data while r/w

2014-06-10 Thread Gregory Farnum
On Tue, May 27, 2014 at 7:44 PM, Plato wrote: > For certain security reasons, I need to make sure the data finally saved to > disk is encrypted. > So, I'm trying to write a rados class, which would be triggered in the reading > and writing process. > That is, before data is written, encrypting method of

Re: [ceph-users] about rgw region and zone

2014-06-10 Thread Craig Lewis
The idea of regions and zones is to replicate Amazon's S3 storage. Here are some links from Amazon describing EC2 regions and zones (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) and S3 Regions (http://docs.aws.amazon.com/AmazonS3/latest/dev/LocationSelect

Re: [ceph-users] RBD Export-Diff With Children Snapshots

2014-06-10 Thread Josh Durgin
On Fri, 6 Jun 2014 17:34:56 -0700 Tyler Wilson wrote: > Hey All, > > Simple question, does 'rbd export-diff' work with children snapshot > aka; > > root:~# rbd children images/03cb46f7-64ab-4f47-bd41-e01ced45f0b4@snap > compute/2b65c0b9-51c3-4ab1-bc3c-6b734cc796b8_disk > compute/54f3b23c-facf-4

Re: [ceph-users] Fwd: CEPH Multitenancy and Data Isolation

2014-06-10 Thread Josh Durgin
On 06/10/2014 01:56 AM, Vilobh Meshram wrote: How does CEPH guarantee data isolation for volumes which are not meant to be shared in an OpenStack tenant? When used with OpenStack the data isolation is provided at the OpenStack level so that all users who are part of the same tenant will be able to ac

Re: [ceph-users] failed assertion on AuthMonitor

2014-06-10 Thread Gregory Farnum
I'd have to look for details, but I don't think the auth monitor ever removes those keys, so if there are some missing, it sounds like some data got lost out from underneath it. That could have happened if the filesystem dropped a file, which we have seen on some kernels. -Greg Software Engineer #4

[ceph-users] unsubscribe

2014-06-10 Thread Punit Dambiwal
unsubscribe ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Swift API Authentication Failure

2014-06-10 Thread Yehuda Sadeh
Can you verify that the subuser object actually exists? Try doing: $ rados ls -p .users.swift (unless you have non-default pools set) Yehuda On Tue, Jun 10, 2014 at 6:44 PM, David Curtiss wrote: > No good. In fact, for some reason when I tried to load up my cluster VMs > today, I couldn't get

Re: [ceph-users] perplexed by unmapped groups on fresh firefly install

2014-06-10 Thread Miki Habryn
Thanks, that did the trick! I think there's some puzzling things that change depending on timing of commands during setup, and at some point I noticed that the script output said "Installing stable release Emperor" or the equivalent, so possibly I have no idea what my own commands are doing. But, f

Re: [ceph-users] Cannot attach volumes

2014-06-10 Thread yalla.gnan.kumar
Hi Karan, I have checked the cinder logs but could not find anything suspicious. Thanks Kumar From: Karan Singh [mailto:karan.si...@csc.fi] Sent: Tuesday, June 10, 2014 3:14 PM To: Gnan Kumar, Yalla Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Cannot attach volumes Hi Kumar "Clock