Hello,
On Mon, 30 Nov 2015 07:55:24 + MATHIAS, Bryn (Bryn) wrote:
> Hi Christian,
>
> I’ll give you a much better dump of detail :)
>
> Running RHEL 7.1,
> ceph version 0.94.5
>
> all ceph disks are xfs, with journals on a partition on the disk
> Disks: 6TB spinners.
>
OK, I was guessing
Hi!
Götz Reinicke wrote:
>>What if one of the networks fails? e.g. just on one host, or the whole
>>network for all nodes?
>>Is there some sort of auto failover to use the other network for all traffic
>>then?
>>How does that work in real life? :) Or do I have to intervene by hand
Alex Gorbachev
On 29-11-15 20:20, misa-c...@hudrydum.cz wrote:
> Hi everyone,
>
> for my pet project I needed a python3 rados library. So I took the
> existing python2 rados code and cleaned it up a little bit to fit my needs. The
> lib contains a basic interface, asynchronous operations and also asyncio
>
Btw, in my configuration "mon osd downout subtree limit" is set to "host".
Does it influence things?
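For reference, the value the running monitor actually loaded can be checked over the admin socket (the mon name below is only an example):

  ceph daemon mon.node1 config show | grep mon_osd_downout_subtree_limit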
2015-11-29 14:38 GMT+08:00 Vasiliy Angapov :
> Bob,
> Thanks for the explanation, sounds reasonable! But how could it happen that
> a host is down and its OSDs are still IN the cluster?
> I mean the NOOUT flag is
Hi,
> On 30 Nov 2015, at 13:44, Christian Balzer wrote:
>
>
> Hello,
>
> On Mon, 30 Nov 2015 07:55:24 + MATHIAS, Bryn (Bryn) wrote:
>
>> Hi Christian,
>>
>> I’ll give you a much better dump of detail :)
>>
>> Running RHEL 7.1,
>> ceph version 0.94.5
>>
>> all ceph disks are xfs, with jo
Hi all,
I'm running ceph version 0.94.5 and I need to downsize my servers
because of insufficient RAM.
So I want to remove OSDs from the cluster and according to the manual
it's a pretty straightforward process:
I'm beginning with "ceph osd out {osd-num}" and the cluster starts
rebalancing i
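For reference, the full sequence from the manual looks roughly like this (OSD id and init commands are examples; adapt to your setup):

  ceph osd out 12                  # start draining data off the OSD
  # wait for "ceph -w" to report all PGs active+clean again
  sudo service ceph stop osd.12    # or: systemctl stop ceph-osd@12
  ceph osd crush remove osd.12     # remove it from the CRUSH map
  ceph auth del osd.12             # drop its cephx key
  ceph osd rm 12                   # finally remove it from the cluster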
On 30-11-15 10:08, Carsten Schmitt wrote:
> Hi all,
>
> I'm running ceph version 0.94.5 and I need to downsize my servers
> because of insufficient RAM.
>
> So I want to remove OSDs from the cluster and according to the manual
> it's a pretty straightforward process:
> I'm beginning with "ceph
Hi Carsten,
On 11/30/2015 10:08 AM, Carsten Schmitt wrote:
Hi all,
I'm running ceph version 0.94.5 and I need to downsize my servers
because of insufficient RAM.
So I want to remove OSDs from the cluster and according to the manual
it's a pretty straightforward process:
I'm beginning with "
On Sun, Nov 29, 2015 at 7:20 PM, wrote:
> Hi everyone,
>
> for my pet project I needed a python3 rados library. So I took the
> existing python2 rados code and cleaned it up a little bit to fit my needs. The
> lib contains a basic interface, asynchronous operations and also an asyncio wrapper
> for
The mons in my production cluster (0.80.7) have a very high CPU usage (100%).
I added leveldb_compression = false to ceph.conf to disable leveldb
compression and restarted all the mons with --compact. But the mons still
have high CPU usage and respond to ceph commands very slowly.
Here is the pe
On 11/30/2015 09:51 AM, Yujian Peng wrote:
> The mons in my production cluster(0.80.7) have a very high cpu usage 100%.
> I added leveldb_compression = false to the ceph.conf to disable leveldb
> compression and restarted all the mons with --compact. But the mons still
> have a high cpu usages, and
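For what it's worth, besides the compression flag the monitor store can also be compacted on demand or at every start; a sketch with option names as I know them from firefly/hammer:

  ceph tell mon.a compact          # one-off compaction of mon "a" (placeholder name)
  # or in ceph.conf:
  [mon]
      mon compact on start = true  # compact the leveldb store on every mon restart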
Hi all,
Does anyone know how to enable clog debug?
-
wukongming ID: 12019
Tel:0571-86760239
Dept:2014 UIS2 OneStor
---
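If it helps, debug levels can usually be raised on a running daemon with injectargs; whether that covers what "clog" refers to here is an assumption on my part:

  ceph tell mon.a injectargs '--debug-mon 10/10'   # mon "a" is a placeholder
  # the cluster log written by the monitors also has its own level, e.g. in ceph.conf:
  # [mon]
  #     mon cluster log file level = debug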
On Nov 27, 2015 3:34 AM, "NEVEU Stephane"
wrote:
>
> Ok, I think I got it. It seems to come from here:
>
> tracker.ceph.com/issues/6047
>
> I'm trying to snapshot an image after having previously made a snapshot of my
pool… whereas it works just fine when using a brand new pool. I'm using
ceph v0.
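In case anyone wants to reproduce it, something along these lines should do (pool and image names are made up):

  ceph osd pool create snaptest 64
  rbd create snaptest/img1 --size 1024
  rados -p snaptest mksnap poolsnap   # pool-level (rados) snapshot
  rbd snap create snaptest/img1@s1    # self-managed rbd snapshot is now refused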
Hi list,
AFAIK, fiemap is disabled by default because it caused rbd corruption.
Has anyone already tested it with recent kernels?
Thanks
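For reference, the switch being discussed is, as far as I know, the filestore one, e.g. in ceph.conf:

  [osd]
      filestore fiemap = true   # off by default; test on a throwaway pool first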
Hello Robert,
OK. I already tried this but, as you said, performance decreases. I just
built the 10.0.0 version and it seems that there are some regressions in
there: I now get 3.5 kIOPS instead of the 21 kIOPS I had with 9.2.0 :-/
Thanks.
Rémi
On 2015-11-25 18:54, Robert LeBlanc wrote:
... and once you create a pool-level snapshot on a pool, there is no way to
convert that pool back to being compatible with RBD self-managed snapshots.
As for the RBD image feature bits, they are defined within rbd.py. On master,
they currently are as follows:
RBD_FEATURE_LAYERING = 1
RBD_FEAT
Vasiliy,
I don't think that's the cause. Can you paste other tuning options from
your ceph.conf?
Also, have you fixed the problems with cephx auth?
Bob
On Mon, Nov 30, 2015 at 12:56 AM, Vasiliy Angapov wrote:
> Btw, in my configuration "mon osd downout subtree limit" is set to "host".
> Does
Hi list,
Short:
I just want to ask: why can't I do
echo 129 > /sys/class/block/rbdX/queue/nr_requests
i.e. why can't I set a value greater than 128?
Why such a restriction?
Long:
Usage example:
I have slow Ceph HDD-based storage and I want to export it via an iSCSI proxy
machine for an ESXi cluster.
If I have
On Mon, Nov 30, 2015 at 7:17 PM, Timofey Titovets wrote:
> Hi list,
> Short:
> i just want ask, why i can't do:
> echo 129 > /sys/class/block/rbdX/queue/nr_requests
>
> i.e. why i can't set value greater then 128?
> Why such a restriction?
>
> Long:
> Usage example:
> i have slow CEPH HDD based st
Hi John,
thanks for the info. It seems that patch adds python3 compatibility support
but leaves the ugly thread spawning intact. No idea if it makes sense to try
to merge some of my changes back to the ceph source.
Cheers
On Monday 30 of November 2015 09:46:18 John Spray wrote:
> On Sun, No
On 30 Nov 2015 21:19, "Ilya Dryomov" wrote:
>
> On Mon, Nov 30, 2015 at 7:17 PM, Timofey Titovets
wrote:
> > Hi list,
> > Short:
> > i just want ask, why i can't do:
> > echo 129 > /sys/class/block/rbdX/queue/nr_requests
> >
> > i.e. why i can't set value greater then 128?
> > Why such a restrict
On Mon, Nov 30, 2015 at 7:47 PM, Timofey Titovets wrote:
>
> On 30 Nov 2015 21:19, "Ilya Dryomov" wrote:
>>
>> On Mon, Nov 30, 2015 at 7:17 PM, Timofey Titovets
>> wrote:
>> > Hi list,
>> > Short:
>> > i just want ask, why i can't do:
>> > echo 129 > /sys/class/block/rbdX/queue/nr_requests
>> >
We recently upgraded to 0.94.3 from firefly and now for the last week have
had intermittent slow requests and flapping OSDs. We have been unable to
nail down the cause, but it feels like it may be related to our osdmaps
not getting deleted properly. Most of our osds are now storing over 100GB
Hi!
On
http://docs.ceph.com/docs/master/rados/operations/user-management/#namespace
I read about auth namespaces. According to the most recent
documentation it is still not supported by any of the client libraries,
especially rbd.
I have a client asking to get access to rbd volumes for Kuber
Big thanks, Ilya,
for the explanation.
2015-11-30 22:15 GMT+03:00 Ilya Dryomov :
> On Mon, Nov 30, 2015 at 7:47 PM, Timofey Titovets
> wrote:
>>
>> On 30 Nov 2015 21:19, "Ilya Dryomov" wrote:
>>>
>>> On Mon, Nov 30, 2015 at 7:17 PM, Timofey Titovets
>>> wrote:
>>> > Hi list,
>>> > Short:
>>> > i jus
Hi,
I was wondering what hash function the CRUSH algorithm uses; is there any
way I can access the code for it? Or is it a commonly used one such as
MD5 or SHA-1? Essentially, I'm just looking to read more information about
it, as I'm interested in how this is used in order to look up objects
ind
The code is in ceph/src/crush of the git repo, but it's pretty opaque. If
you go to the Ceph site and look through the pages there's one about
"publications" (or maybe just documentation? I think publications) that
hosts a paper on how CRUSH works.
IIRC it's using the jenkins hash on the object na
On 11/30/2015 08:56 PM, Tom Christensen wrote:
> We recently upgraded to 0.94.3 from firefly and now for the last week
> have had intermittent slow requests and flapping OSDs. We have been
> unable to nail down the cause, but its feeling like it may be related to
> our osdmaps not getting deleted
Hi,
I've read the recommendation from CERN about the number of OSD maps (
https://cds.cern.ch/record/2015206/files/CephScaleTestMarch2015.pdf, page
3) and I would like to know if there is any negative impact from these
changes:
[global]
osd map message max = 10
[osd]
osd map cache size = 20
osd
It's probably worth noting that if you're planning on removing multiple
OSDs in this manner, you should make sure they are not in the same
failure domain, per your CRUSH rules. For example, if you keep one
replica per node and three copies (as in the default) and remove OSDs
from multiple nodes wit
No, CPU and memory look normal. We haven't been fast/lucky enough with
iostat to see if we're just slamming the disk itself; I keep trying to
catch one, get logged into the node, find the disk and get iostat
running before the OSD comes back up. We haven't flapped that many OSDs,
and most
The trick with debugging heartbeat errors is to grep back through the log
to find the last thing the affected thread was doing, e.g. is
0x7f5affe72700 stuck in messaging, writing to the disk, reading through the
omap, etc.
I agree this doesn't look to be network related, but if you want to rule i
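Something along these lines, with the thread id taken from the heartbeat warning (log path and OSD id are only examples):

  grep 7f5affe72700 /var/log/ceph/ceph-osd.31.log | tail -n 200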
I wouldn't run with those settings in production. That was a test to
squeeze too many OSDs into too little RAM.
Check the values from infernalis/master. Those should be safe.
--
Dan
On 30 Nov 2015 21:45, "George Mihaiescu" wrote:
> Hi,
>
> I've read the recommendation from CERN about the number
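For comparison, the values an OSD is actually running with can be read from its admin socket (the OSD id is a placeholder):

  ceph daemon osd.0 config show | egrep 'osd_map_cache_size|osd_map_message_max'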
On 11/30/2015 10:26 AM, misa-c...@hudrydum.cz wrote:
Hi John,
thanks for the info. It seems that patch adds a python3 compatibility support
but leaves the ugly thread spawning intact. No idea if it makes sense to try
to merge some of my changes back to the ceph source.
Yeah, like Wido mentione
What counts as ancient? Concurrently with our hammer upgrade we went from
3.16->3.19 on Ubuntu 14.04. We are looking to revert to the 3.16 kernel
we'd been running because we're also seeing an intermittent (it's happened
twice in 2 weeks) massive load spike that completely hangs the osd node
(we're ta
Hi,
We lost a disk today in our ceph cluster so we added a new machine with
4 disks to replace the capacity, and we activated the straw1 tunable too
(we also tried straw2 but quickly backed out that change).
During recovery OSDs started crashing on all of our machines,
the issue being OSD RAM usage th
Hi Laurent,
Wow, that's excessive! I'd see if anyone else has any tricks first, but
if nothing else helps, running an OSD through valgrind with massif will
probably help pinpoint what's going on. Have you tweaked the recovery
tunables at all?
Mark
On 11/30/2015 06:52 PM, Laurent GUERBY wr
Oh, forgot to ask, any core dumps?
Mark
On 11/30/2015 06:58 PM, Mark Nelson wrote:
Hi Laurent,
Wow, that's excessive! I'd see if anyone else has any tricks first, but
if nothing else helps, running an OSD through valgrind with massif will
probably help pinpoint what's going on. Have you twea
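A massif run would look roughly like this (OSD id, paths and the init command are examples, not a tested recipe):

  sudo service ceph stop osd.12                    # or: systemctl stop ceph-osd@12
  sudo valgrind --tool=massif --massif-out-file=/tmp/massif.osd.12 \
      /usr/bin/ceph-osd -f -i 12 --cluster ceph
  ms_print /tmp/massif.osd.12 | less               # inspect the heap profile afterwards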
Thanks for your response!
I have the following two questions now; please help me:
1) There is one node in each cluster. Ceph and radosgw are deployed on
every node. One acts as the master zone, the other as the slave zone. I write data
to the master zone with a bash script, and run the radosgw agent at the slave zone
Hello,
So I run systems using Gentoo's OpenRC. Ceph is interesting, but
in the long term will it be mandatory to use systemd to keep
using ceph?
Will there continue to be a supported branch that works with OpenRC?
Long-range guidance is keenly appreciated.
James