> > When I used CentOS osds, and tried to rbd map from arch linux or fedora,
> > I would get "rbd: add failed: (34) Numerical result out of range". It
> > seemed to happen when the tool was writing to /sys/bus/rbd/add_single_major.
> > If I rebuild the osds using fedora (20 in this case), everythi
Hi all,
after coming back from a long weekend, I found my production cluster in
an error state, mentioning 6 scrub errors and 6 pg's in
active+clean+inconsistent state.
Strangely, my pre-live cluster, running on different hardware, is
also showing 1 scrub error and 1 inconsistent pg...
pg d
Hello Cephers
First of all, this problem is not related to ceph-deploy; ceph-deploy 1.5.4
works like a charm :-), thanks to Alfredo.
Problem:
1. When installing Ceph using the package manager (# yum install ceph or # yum
update ceph) that uses the Ceph repositories (ceph.repo), the package manage
Hi,
On 10 Jun 2014, at 10:30, Karan Singh <karan.si...@csc.fi> wrote:
Hello Cephers
First of all, this problem is not related to ceph-deploy; ceph-deploy 1.5.4
works like a charm :-), thanks to Alfredo.
Problem:
1. When installing Ceph using the package manager (# yum install ceph or #
Hi again,
just found the ceph pg repair command :) Now both clusters are OK again.
Anyway, I'm really interested in the cause of the problem.
Regards,
Christian
Am 10.06.2014 10:28, schrieb Christian Eichelmann:
> Hi all,
>
> after coming back from a long weekend, I found my production cluster
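For readers landing on this thread, here is a minimal sketch of how the repair step mentioned above could be scripted. It assumes that `ceph health detail` reports inconsistent placement groups on lines of the form `pg <pgid> is active+clean+inconsistent, ...`; that line format and the helper name `list_repair_commands` are assumptions made for illustration, not something stated in the thread. The helper only prints the commands so they can be reviewed before anything is run, and it says nothing about the cause of the scrub errors.

```shell
# Hypothetical helper: read `ceph health detail`-style text on stdin and
# print one "ceph pg repair <pgid>" line per PG reported as inconsistent.
# The input line format is an assumption; nothing is executed on the cluster.
list_repair_commands() {
    awk '$1 == "pg" && $0 ~ /inconsistent/ { print "ceph pg repair " $2 }'
}

# Intended usage on a cluster node (left as a comment on purpose):
#   ceph health detail | list_repair_commands
```

Printing instead of executing keeps the decision to repair each PG with the operator.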
How does Ceph guarantee data isolation for volumes which are not meant to be
shared in an OpenStack tenant?
When used with OpenStack, the data isolation is provided at the OpenStack level,
so that all users who are part of the same tenant will be able to access/share the
volumes created by users in
Hi Kumar
“Clock skew” is just a warning and should not be related to this problem. But it's
pretty easy to fix this warning, either by setting up NTP on all Ceph cluster
nodes or by adding mon clock drift warn backoff = into ceph.conf
(do not do this in production).
With regard to your second problem, t
>>Do for every read 1 Kb rbd will read 4MB from hdd? for write?
RADOS supports partial reads and writes.
Note that with erasure coding, a write needs to rewrite the full object (so 4MB).
I think that with a key-value-store backend (like leveldb), reads and writes are
full-object too.
some interesting notes here :
http://eu
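To make the partial-read point concrete, here is a small arithmetic sketch. It assumes the 4MB figure from the question means 4 MiB objects and that the image uses plain (non-fancy) striping; the function name `rbd_object_for_offset` is made up for this example. It shows which object a given byte offset falls into and where inside that object the read starts, so a 1 KB read touches a 1 KB range of one object rather than the whole 4 MiB.

```shell
# Hypothetical helper: map a byte offset in an image to
# "<object index> <offset inside that object>".
# Assumes 4 MiB objects and no custom striping.
rbd_object_for_offset() {
    offset="$1"
    object_size=$((4 * 1024 * 1024))
    echo "$((offset / object_size)) $((offset % object_size))"
}

# Example: a read starting 1 KiB into the second object
rbd_object_for_offset 4195328   # prints "1 1024"
```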
Hi Yamashita
First try to create a Cinder volume, which should be stored on the Ceph backend,
then proceed with a Glance image on Ceph.
I assume you have done all the steps correctly as described in the Ceph
documentation for Ceph and OpenStack integration. If you are still not able to
create volumes
Hi,
I’m just starting to get interested in this topic, since today we’ve found that
a weekly peak in latency correlates with a bulk (~30) of deep scrubbing PGs.
One idea I had was to check the behaviour under different disk io schedulers,
trying to exploit thread io priorities with cfq. So I have a
Hi Greg,
Thanks for your suggestion and information. I've installed the cluster over
again.
I just wanted to investigate a little more based on your information. I can
see that auth/paxos values in monitor K/V store are these:
'authfirst_commited': 251
'authlast_commited': 329
and I have all the
On Tue, Jun 10, 2014 at 4:34 AM, Dan Van Der Ster
wrote:
> Hi,
>
> On 10 Jun 2014, at 10:30, Karan Singh wrote:
>
> Hello Cephers
>
> First of all, this problem is not related to ceph-deploy; ceph-deploy 1.5.4
> works like a charm :-), thanks to Alfredo.
>
> Problem :
>
> 1. When installing Ceph u
Thanks Dan / Alfredo
Yep priority=1 helped a little bit and yum-plugin-priority fixed all.
- karan -
On 10 Jun 2014, at 15:22, Alfredo Deza wrote:
> On Tue, Jun 10, 2014 at 4:34 AM, Dan Van Der Ster
> wrote:
>> Hi,
>>
>> On 10 Jun 2014, at 10:30, Karan Singh wrote:
>>
>> Hello Cephers
>>
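As a sketch of the fix reported above: yum honours a `priority=` line in a repo file only when the priorities plugin is installed (on CentOS/RHEL of that era the package is normally named yum-plugin-priorities), and a lower number means higher priority. The helper below is hypothetical, not something posted in the thread: it rewrites a .repo file to stdout with `priority=1` under every section header, dropping any priority line already present. The file path in the usage comment is an assumption.

```shell
# Hypothetical helper: print a copy of a yum .repo file in which every
# [section] carries "priority=1". Existing priority lines are dropped.
# Output goes to stdout; the original file is not modified.
set_repo_priority() {
    awk '
        /^priority *=/ { next }                  # drop any existing priority line
        { print }                                # keep every other line unchanged
        /^\[.*\]$/     { print "priority=1" }    # add priority right after each header
    ' "$1"
}

# Intended usage (path is an assumption; review the result before replacing the file):
#   set_repo_priority /etc/yum.repos.d/ceph.repo > /tmp/ceph.repo.new
```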
Hi All,
For those of you that are interested in performance data, Brendan Gregg
wrote a really useful cheat sheet for perf that can give you some idea
of the things it can do beyond just profiling and performance counters.
The static and dynamic tracing capabilities are especially interesting
Hi,
I'm running a Ceph cluster on version 0.80.1
(a38fe1169b6d2ac98b427334c12d7cf81f809b74).
Here's roughly what happened: cluster has been created, pg and pgp num
have been increased step by step to (now) 1024 to have a fitting number
for the amount of OSDs and size. Then (days later) the size o
Hi
I am interested in this too!
We have two Ceph arrays too and we have found that we get these same
errors about once a fortnight on each array.
One array is in a colo site and the other in the office. Both arrays
have 3 replications, version 0.80.1. The osd's are running on zfs
which sh
Hello,
we have a Ceph deployment still running on Ubuntu 12.04 with Ceph 0.61.
The docs are suggesting that we need to upgrade Ceph to 0.67/Dumpling
before upgrading to next versions.
But the command "apt-cache show ceph | grep Version" gives the following
output :
On Ubuntu 13.04/Raring:
Ver
Sorry, I think I found the answer just by scrolling to the RIGHT chapter of
the "Upgrading Ceph" guide!
Should I mention that the version naming really confuses me? :)
BR
Davide
On Tue, Jun 10, 2014 at 5:09 PM, Davide Fanciola
wrote:
> Hello,
>
> we have a Ceph deployment still running on Ubuntu 1
Hi list,
during the last days our Ceph cluster did not handle a recovery correctly.
Several VM images on RBD have been corrupted. I'm currently trying to
understand what has happened and how to avoid such problems in the future.
I'll describe the course of events.
1. We are running 32 OSDs on 4
Hi all,
A couple of years ago, I heard that it wasn’t safe to map a krbd block on an
OSD host.
It was more or less like mounting an NFS export on the NFS server itself: we can
potentially end up with some deadlocks.
At least, I tried again recently and didn’t encounter any problem.
What do you think?
FYI I encountered the same problem for krbd, removing the ec pool didn’t solve
my problem.
I’m running 3.13
Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008
Hi Sébastien,
still the case. Depending on what you do, the OSD process will get to a hang
and will suicide.
Regards
JC
On Jun 10, 2014, at 09:46, Sebastien Han wrote:
> Hi all,
>
> A couple of years ago, I heard that it wasn’t safe to map a krbd block on an
> OSD host.
> It was more or les
Sebastian,
It's actually not an issue with Ceph, but with the Linux kernel itself. If
you want to do this and avoid a deadlock, just use a VM on the same host to
mount the block device.
Regards,
John
On Tue, Jun 10, 2014 at 9:51 AM, Jean-Charles LOPEZ
wrote:
> Hi Sébastien,
>
> still the ca
Hi Josef, all,
it's never too late to join a party ;) The cheap switches don't support mlag
either.
I did some testing today with the balance-alb mode, which works fine so far
in this setup.
I'm able to have my links placed redundantly on both switches and utilize up to 2
gb/s when talking to
Every time I deep-scrub one PG, all of the OSDs responsible get kicked
out of the cluster. I've deep-scrubbed this PG 4 times now, and it
fails the same way every time. OSD logs are linked at the bottom.
What can I do to get this deep-scrub to complete cleanly?
This is the first time I've deep-
Thanks for your answers :)
Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance
On 10 Jun 2014, at 20:49, Jo
After doing this, I've found that I'm having problems with a few
specific PGs. If I set nodeep-scrub, then manually deep-scrub one
specific PG, the responsible OSDs get kicked out. I'm starting a new
discussion, subject: "I have PGs that I can't deep-scrub"
I'll re-test this correlation after I
Hey Mike, has your manual scheduling resolved this? I think I saw
another similar-sounding report, so a feature request to improve scrub
scheduling would be welcome. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, May 20, 2014 at 5:46 PM, Mike Dawson wrote:
> I tend
On Tue, May 27, 2014 at 7:44 PM, Plato wrote:
> For certain security issue, I need to make sure the data finally saved to
> disk is encrypted.
> So, I'm trying to write a rados class, which would be triggered to reading
> and writing process.
> That is, before data is written, encrypting method of
The idea of regions and zones is to replicate Amazon's S3 storage.
Here are some links from Amazon describing EC2 regions and zones
(http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html)
and S3 Regions
(http://docs.aws.amazon.com/AmazonS3/latest/dev/LocationSelect
On Fri, 6 Jun 2014 17:34:56 -0700
Tyler Wilson wrote:
> Hey All,
>
> Simple question, does 'rbd export-diff' work with children snapshot
> aka;
>
> root:~# rbd children images/03cb46f7-64ab-4f47-bd41-e01ced45f0b4@snap
> compute/2b65c0b9-51c3-4ab1-bc3c-6b734cc796b8_disk
> compute/54f3b23c-facf-4
On 06/10/2014 01:56 AM, Vilobh Meshram wrote:
How does Ceph guarantee data isolation for volumes which are not meant
to be shared in an OpenStack tenant?
When used with OpenStack, the data isolation is provided at the
OpenStack level, so that all users who are part of the same tenant will be
able to ac
I'd have to look for details, but I don't think the auth monitor ever
removes those keys, so if there are some missing, it sounds like some
data got lost out from underneath it. That could have happened if the
filesystem dropped a file, which we have seen on some kernels.
-Greg
Software Engineer #4
unsubscribe
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Can you verify that the subuser object actually exist? Try doing:
$ rados ls -p .users.swift
(unless you have non default pools set)
Yehuda
On Tue, Jun 10, 2014 at 6:44 PM, David Curtiss
wrote:
> No good. In fact, for some reason when I tried to load up my cluster VMs
> today, I couldn't get
Thanks, that did the trick! I think there's some puzzling things that
change depending on timing of commands during setup, and at some point I
noticed that the script output said "Installing stable release Emperor" or
the equivalent, so possibly I have no idea what my own commands are doing.
But, f
Hi Karan,
I have checked the cinder logs but could not find anything suspicious.
Thanks
Kumar
From: Karan Singh [mailto:karan.si...@csc.fi]
Sent: Tuesday, June 10, 2014 3:14 PM
To: Gnan Kumar, Yalla
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Cannot attach volumes
Hi Kumar
"Clock