Hi,
Multi-MDS is not recommended, so currently I'm testing with
Active/Standby, but there is also a situation where an MDS could be in
Standby Replay by enabling 'mds_standby_replay' in the config.
How stable is that? I know the answer would be: Test it! But just
wondering what the recommendation
Hello Christian,
Thanks again for all of your help! I started a bonnie test using the
following:
bonnie -d /mnt/rbd/scratch2/ -m $(hostname) -f -b
Hopefully it completes in the next hour or so. A reboot of the slow OSDs
clears the slow marker for now
kh10-9$ ceph -w
cluster 9ea4d9d9-0
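For reference, the same invocation with its flags annotated (assuming this is bonnie++; the mount point is taken from the command above):
# bonnie++ flags used above:
#   -d /mnt/rbd/scratch2/   test directory (the RBD-backed mount)
#   -m $(hostname)          label the report with this machine's name
#   -f                      fast mode: skip the per-character I/O tests
#   -b                      no write buffering; fsync() after every write
bonnie -d /mnt/rbd/scratch2/ -m $(hostname) -f -b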
Standby replay is about as stable as normal standby -- it's covered in
some of the nightly test suites. The code running in standby replay
is almost all the same as what is run in one go at startup on a normal
standby.
John
On Fri, Dec 19, 2014 at 8:05 AM, Wido den Hollander wrote:
> Hi,
>
> Mu
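For what it's worth, a minimal ceph.conf sketch of how standby replay is typically enabled for a named standby (the daemon names mds.a and mds.b are placeholders):
[mds.b]
mds standby replay = true
mds standby for name = a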
Hello,
we have had some trouble with OSDs running full,
even after rebalancing. So with the disks at 100% usage and the ceph-osd
daemons not starting anymore, we decided to delete some PG directories,
after which rebalancing finished.
However, after this we have the situation that one PG is not
becoming clean anymore.
We t
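A hedged sketch of how one might inspect such a stuck PG (the PG id 3.7a is a placeholder):
ceph health detail          # lists which PGs are not active+clean and why
ceph pg dump_stuck unclean  # show PGs stuck in an unclean state
ceph pg 3.7a query          # per-PG details, including which OSDs it is waiting on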
Will this make its way into the debian repo eventually?
http://ceph.com/debian-giant
--
Lindsay
Hi Lindsay,
On 19/12/2014 15:12, Lindsay Mathieson wrote:
> Will this make its way into the debian repo eventually?
This is a development release that is not meant to be published in
distributions such as Debian, CentOS etc.
Cheers
On Fri, 19 Dec 2014 03:27:53 PM you wrote:
> On 19/12/2014 15:12, Lindsay Mathieson wrote:
> > Will this make its way into the debian repo eventually?
>
> This is a development release that is not meant to be published in
> distributions such as Debian, CentOS etc.
Ah, thanks.
It's not clear from
On 19/12/2014 15:35, Lindsay Mathieson wrote:
> On Fri, 19 Dec 2014 03:27:53 PM you wrote:
>> On 19/12/2014 15:12, Lindsay Mathieson wrote:
>>> Will this make its way into the debian repo eventually?
>>
>> This is a development release that is not meant to be published in
>> distributions such as
On Fri, 19 Dec 2014 03:57:42 PM you wrote:
> The stable releases have real names; that is what makes them different from
> development releases (dumpling, emperor, firefly, giant, hammer).
Ah, so we had two named firefly releases (Firefly 0.86 & Firefly 0.87) - they
were both production and we have
On 19/12/2014 16:10, Lindsay Mathieson wrote:
> On Fri, 19 Dec 2014 03:57:42 PM you wrote:
>> The stable releases have real names; that is what makes them different from
>> development releases (dumpling, emperor, firefly, giant, hammer).
>
> Ah, so we had two named firefly releases (Firefly 0.86
Hello,
I'm a newbie to Ceph, gaining some familiarity by hosting some virtual
machines on a test cluster. I'm using a virtualisation product called
Proxmox Virtual Environment, which conveniently handles cluster setup,
pool setup, OSD creation etc.
During the attempted removal of an OSD, my pool
Hi,
On 19/12/2014 15:57, Loic Dachary wrote:
> The stable releases have real names; that is what makes them different from
> development releases (dumpling, emperor, firefly, giant, hammer).
And I would add that, from what I understand, every other release
is an LTS (Long Term Support) release. Firefl
Hello,
another issue we have experienced with qemu VMs
(qemu 2.0.0) with ceph-0.80 on Ubuntu 14.04
managed by opennebula 4.10.1:
The VMs are completely frozen when rebalancing takes place,
they do not even respond to ping anymore.
Looking at the qemu processes they are in state "Sl".
Is this a
On Thu, Dec 18, 2014 at 10:47 PM, Francois Lafont
wrote:
>
> On 19/12/2014 02:18, Craig Lewis wrote:
> > The daemons bind to *,
>
> Yes but *only* for the OSD daemon. Am I wrong?
>
> Personally I must provide IP addresses for the monitors
> in the /etc/ceph/ceph.conf, like this:
>
> [global]
>
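Presumably the snippet continues along these lines; a sketch with placeholder host names and addresses:
[global]
mon initial members = ceph-node1, ceph-node2, ceph-node3
mon host = 10.0.1.1, 10.0.1.2, 10.0.1.3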
Do you know whether, if this value is not set, it uses 4 MB or 4096 bytes?
Thanks,
Robert LeBlanc
On Thu, Dec 18, 2014 at 6:51 PM, Tyler Wilson wrote:
>
> Okay, this is rather unrelated to Ceph but I might as well mention how
> this is fixed. When using the Juno-Release OpenStack pages the
> 'rbd_store_
I'm still pretty new at troubleshooting Ceph, but since no one has responded
yet I'll give it a stab.
What is the size of your pool?
'ceph osd pool get <poolname> size'
Based on the number of incomplete PGs, it seems like it was '1'. I
understand that if you are able to bring osd 7 back in, it would clear
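For example (the pool name 'rbd' and the returned value are only placeholders):
ceph osd pool get rbd size
size: 3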
I've done single nodes. I have a couple VMs for RadosGW Federation
testing. It has a single virtual network, with both "clusters" on the same
network.
Because I'm only using a single OSD on a single host, I had to update the
crushmap to handle that. My Chef recipe runs:
ceph osd getcrushmap -o
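A hedged sketch of the rest of that cycle (file names are placeholders; the chooseleaf edit is the usual single-host tweak):
ceph osd getcrushmap -o crushmap.bin       # dump the compiled CRUSH map
crushtool -d crushmap.bin -o crushmap.txt  # decompile it to text
# edit crushmap.txt, e.g. change "step chooseleaf firstn 0 type host"
#                              to "step chooseleaf firstn 0 type osd"
crushtool -c crushmap.txt -o crushmap.new  # recompile
ceph osd setcrushmap -i crushmap.new       # load the modified map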
Why did you remove osd.7?
Something else appears to be wrong. With all 11 OSDs up, you shouldn't
have any PGs stuck in stale or peering.
How badly are the clocks skewed between nodes? If it's bad enough, it can
cause communication problems between nodes. Ceph will complain if the
clocks are m
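A hedged sketch of how to check for that:
ceph health detail | grep -i skew   # monitors report "clock skew detected on mon.X"
ntpq -p                             # run on each node to verify NTP is actually syncing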
I think smaller clusters get choked up with the default backfill settings. I've
seen latency on a four node cluster with 10 OSDs each improve by setting
osd_max_backfills to 2. I would try lowering it and see if it helps.
Also, if you are running both cluster and VM traffic on the same network,
you could g
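A hedged sketch of lowering it at runtime and persisting it (the value 2 is just a starting point):
ceph tell osd.* injectargs '--osd-max-backfills 2'
# and in ceph.conf, so it survives restarts:
[osd]
osd max backfills = 2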
That seems odd. So you have 3 nodes, with 3 OSDs each. You should've been
able to mark osd.0 down and out, then stop the daemon without having those
issues.
It's generally best to mark an osd down, then out, and wait until the
cluster has recovered completely before stopping the daemon and remov
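A hedged sketch of that sequence for osd.0 (the service command varies by distro):
ceph osd out 0                 # start rebalancing while the daemon is still running
ceph -w                        # wait until all PGs are active+clean again
sudo service ceph stop osd.0   # only now stop the daemon
ceph osd crush remove osd.0    # then remove it from the CRUSH map,
ceph auth del osd.0            # delete its key,
ceph osd rm 0                  # and remove it from the OSD map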
On Thu, Dec 18, 2014 at 8:44 PM, Sean Sullivan wrote:
> Thanks for the reply, Gregory,
>
> Sorry if this is in the wrong direction or something. Maybe I do not
> understand
>
> To test uploads I use bash time and either python-swiftclient or boto
> key.set_contents_from_filename to the radosg
Interesting indeed, those tuneables were suggested on the pve-user mailing list
too, and they certainly sound like they’ll ease the pressure during the
recovery operation. What I might not have explained very well though is that
the VMs hung indefinitely and past the end of the recovery process,
> The more I think about this problem, the less I think there'll be an easy
> answer, and it's more likely that I'll have to reproduce the scenario and
> actually pause myself next time in order to troubleshoot it?
It is even possible to simulate those CRUSH problems. I reported a few examples
long
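A hedged sketch of such a simulation with crushtool (the rule number, replica count, and input range are placeholders):
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --show-bad-mappings --rule 0 --num-rep 3 --min-x 0 --max-x 1023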
This is the last development release before Christmas. There are some API
cleanups for librados and librbd, and lots of bug fixes across the board
for the OSD, MDS, RGW, and CRUSH. The OSD also gets support for discard
(potentially helpful on SSDs, although it is off by default), and there
ar
With only one OSD down and size = 3, you shouldn't have had any PGs
inactive. At worst, they should've been active+degraded.
The only thought I have is that some of your PGs aren't mapping to the
correct number of OSDs. That's not supposed to be able to happen unless
you've messed up your crush ru
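A hedged way to check that for one of the affected PGs (the PG id and pool name are placeholders):
ceph pg map 3.1f             # prints the up and acting OSD sets for that PG
ceph osd pool get rbd size   # confirm the pool really is size = 3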
[Oh, sorry Craig for my mistake: I sent my response to your
personal address instead of sending it to the list. Sorry for
the duplicate; I am resending my message to the list.]
Hello,
On 19/12/2014 19:17, Craig Lewis wrote:
> I'm not using mon addr lines, and my ceph-mon daemons are bound to 0.0.0.0:*.
A
On Fri, Dec 19, 2014 at 4:03 PM, Francois Lafont wrote:
>
> On 19/12/2014 19:17, Craig Lewis wrote:
>
> > I'm not using mon addr lines, and my ceph-mon daemons are bound to
> 0.0.0.0:*.
>
> And do you have several IP addresses on your server?
> Can you contact the *same* monitor process with di
I would like to react to this point.
On 20/12/2014 02:14, Francois Lafont wrote:
> when I create my cluster with the
> first monitor, I have to generate a monitor map with this
> command:
>
> monmaptool --create --add {hostname} {ip-address} --fsid {uuid}
> /tmp/monmap
>
Hello Sean,
On Fri, 19 Dec 2014 02:47:41 -0600 Sean Sullivan wrote:
> Hello Christian,
>
> Thanks again for all of your help! I started a bonnie test using the
> following:
> bonnie -d /mnt/rbd/scratch2/ -m $(hostname) -f -b
>
While that gives you a decent idea of what the limitations of ker
On 20/12/2014 02:18, Craig Lewis wrote:
>> And do you have several IP addresses on your server?
>> Can you contact the *same* monitor process with different IP addresses?
>> For instance:
>> telnet -e ']' ip_addr1 6789
>> telnet -e ']' ip_addr2 6789
>>
>
> Oh. The second one fails, ev
On Fri, Dec 19, 2014 at 6:19 PM, Francois Lafont wrote:
>
>
> So, indeed, I have to use routing *or* maybe create 2 monitors
> by server like this:
>
> [mon.node1-public1]
> host = ceph-node1
> mon addr = 10.0.1.1
>
> [mon.node1-public2]
> host = ceph-node1
> mon addr = 10.
Hi Sage,
Has the repo metadata been regenerated?
One of my reposync jobs can only see up to 0.89, using
http://ceph.com/rpm-testing.
Thanks
Anthony
On Sat, Dec 20, 2014 at 6:22 AM, Sage Weil wrote:
> This is the last development release before Christmas. There are some API
> cleanups for l