Shouldn't it just be:
step take default
step chooseleaf firstn 0 type rack
step emit
Like he has for data and metadata?
--
Dan
On Thu, Mar 28, 2013 at 2:51 AM, Martin Mailand wrote:
> Hi John,
>
> I still think this part in the crushmap is wrong.
>
> step take d
In the current kernel (3.2.0-39-generic) it's not fixed. It's quite
annoying as we have to reboot the host every couple of days ...
On Sat, 2013-03-16 at 12:29 +0100, Léon Keijser wrote:
> Thanks for the response. Currently we're running kernel 3.2.0-35
> (3.2.0-35.55 on 12.04.2).
>
>
> Léo
Thank you both.
Well, no more error messages :)
If I may, I still have some confusion about how to start operating on Ceph!
How can I start storing files that are on the client at the server? Correct me
if I'm wrong: on the client side I'll save the files I need to store at the
mount point, then the se
Hi,
It depends, what do you want to use from Ceph? Object store? Block device? Distributed filesystem?
Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
PHONE : +33 (0)1 49 70 99 72 – MOBILE : +33 (0)6 52 84 44 70
EMAIL : sebastien@enovance.com – SKYPE : han.sbastien
ADDRESS
On Wed, Mar 27, 2013 at 10:43:36PM -0700, Matthieu Patou wrote:
> On 03/27/2013 10:41 AM, Marco Aroldi wrote:
> >Hi list,
> >I'm trying to create an active/active Samba cluster on top of CephFS.
> >I would ask if Ceph fully supports CTDB at this time.
> If I'm not wrong, Ceph (even CephFS) does not supp
Hello everybody,
Quite recently François Charlier and I worked together on the Puppet modules for Ceph on behalf of our employer eNovance. In fact, François started to work on them last summer; back then he completed the Monitor manifests. So basically, we worked on the OSD manifest. Modules are in p
Hi Dan,
so I changed the crushmap to:
rule rbd {
ruleset 2
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type rack
step emit
}
Then I look at one pg:
2.33d 1 0 0 0 4194304 0
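For reference, the usual round trip for editing a CRUSH map looks roughly like
this (a sketch using the standard crushtool/ceph commands; the file names are
placeholders):

# fetch the compiled map from the cluster
ceph osd getcrushmap -o crushmap.bin
# decompile it, edit the rules in a text editor, then recompile
crushtool -d crushmap.bin -o crushmap.txt
crushtool -c crushmap.txt -o crushmap.new
# inject the edited map back into the cluster
ceph osd setcrushmap -i crushmap.new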
On 2013-03-28 09:16, Volker Lendecke wrote:
> On Wed, Mar 27, 2013 at 10:43:36PM -0700, Matthieu Patou wrote:
>> On 03/27/2013 10:41 AM, Marco Aroldi wrote:
> >>> Hi list, I'm trying to create an active/active Samba cluster on
> >>> top of CephFS. I would as
Object Store.
On Thu, Mar 28, 2013 at 11:51 AM, Sebastien Han
wrote:
> Hi,
>
> It depends, what do you want to use from Ceph? Object store? Block device?
> Distributed filesystem?
>
>
> Sébastien Han
> Cloud Engineer
>
> "Always give 100%. Unless you're giving blood."
>
> PHONE :
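For an object-store workload, a minimal sketch of storing and reading back a
file with the rados CLI (assuming a pool named "data"; the object and file
names are just examples):

# upload a local file as an object called "myobject" in pool "data"
rados -p data put myobject ./localfile.txt
# list the objects in the pool
rados -p data ls
# read the object back into a local file
rados -p data get myobject ./copy.txt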
On 03/28/2013 04:34 AM, Sebastien Han wrote:
Hello everybody,
Quite recently François Charlier and I worked together on the Puppet
modules for Ceph on behalf of our employer eNovance. In fact, François
started to work on them last summer; back then he completed the Monitor
manifests. So basically
On Wed, 27 Mar 2013, Matthieu Patou wrote:
> On 03/27/2013 10:41 AM, Marco Aroldi wrote:
> > Hi list,
> > I'm trying to create an active/active Samba cluster on top of CephFS.
> > I would ask if Ceph fully supports CTDB at this time.
>
> If I'm not wrong, Ceph (even CephFS) does not support exporting a
On 03/28/2013 07:41 AM, Sage Weil wrote:
On Wed, 27 Mar 2013, Matthieu Patou wrote:
On 03/27/2013 10:41 AM, Marco Aroldi wrote:
Hi list,
I'm trying to create an active/active Samba cluster on top of CephFS.
I would ask if Ceph fully supports CTDB at this time.
If I'm not wrong Ceph (even CephFS)
This is the perfectly normal distinction between "down" and "out". The
OSD has been marked down but there's a timeout period (default: 5
minutes) before it's marked "out" and the data gets reshuffled (to
avoid starting replication on a simple reboot, for instance).
-Greg
Software Engineer #42 @ htt
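The timeout Greg mentions is controlled by "mon osd down out interval"
(300 seconds by default); a minimal ceph.conf sketch, if you want to make the
default explicit:

[global]
    # seconds a down OSD waits before being marked out automatically
    mon osd down out interval = 300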
Thanks for the answer,
I haven't yet looked at the samba.git clone, sorry. I will.
Just a quick report on my test environment:
* cephfs mounted with kernel driver re-exported from 2 samba nodes
* If "node B" goes down, everything works like a charm: "node A" does
IP takeover and brings up the "nod
Not sure how much of a difference it makes at this point, but I also tend to use -i size=2048.
Well, while running through the Ceph and XFS ML, I came across those options several times.
Looks like you guys are using AGPL V3. I don't actually know too much about that license other than that it's fa
Hi Mark,
The introduction of http://en.wikipedia.org/wiki/Affero_General_Public_License
gives you the reason for using AGPL instead of GPL. In the case of a puppet
module, publishing under the GPL instead of AGPL would not make a practical
difference. A puppetmaster distributes the puppet modul
In my opinion eNovance's choice of a copyleft license (i.e. share and share alike in the Creative Commons jargon) is sensible: they are willing to share with potential competitors, as long as they are in the same mindset.
Indeed, that's the main idea behind it. This is not really applicable for the
Hi Greg,
/etc/init.d/ceph stop osd.1
=== osd.1 ===
Stopping Ceph osd.1 on store1...kill 13413...done
root@store1:~# date -R
Thu, 28 Mar 2013 18:22:05 +0100
root@store1:~# ceph -s
health HEALTH_WARN 378 pgs degraded; 378 pgs stuck unclean; recovery
39/904 degraded (4.314%); recovering 15E o/s,
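A quick way to see whether an OSD is merely down or also out is the OSD tree
(a sketch; output abridged):

# lists every OSD with its up/down state; an OSD that is still "in"
# keeps its place in the map and no data movement starts
ceph osd tree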
Looks like you either have a custom config, or have specified
somewhere that OSDs shouldn't be marked out (i.e., setting the 'noout'
flag). There can also be a bit of flux if your OSDs are reporting an
unusual number of failures, but you'd have seen failure reports if
that were going on.
-Greg
On T
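One way to check whether such a flag is set (a sketch using standard commands):

# cluster-wide flags like "noout" appear on the flags line of the osdmap
ceph osd dump | grep flags
# they are also shown in the status output when set
ceph -s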
Hi Greg,
I have a custom crush map, which I attached below.
My goal is to have two racks; each rack should be a failure domain.
That means for the rbd pool, which I use with a replication level of
two, I want one replica in one rack and the other replica in the other
rack, so that I could lose
Your crush map looks fine to me. I'm saying that your ceph -s output
showed the OSD still hadn't been marked out. No data will be migrated
until it's marked out.
After ten minutes it should have been marked out, but that's based on
a number of factors you have some control over. If you just want a
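For reference, marking the OSD out by hand triggers the rebalance immediately
(as Martin confirms below); a sketch with osd.1 from this thread:

# mark the stopped OSD out now instead of waiting for the timeout
ceph osd out 1
# once it is back up, return it to the data placement
ceph osd in 1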
Hi Greg,
setting the osd manually out triggered the recovery.
But now the question is: why is the osd not marked out after 300
seconds? That's a default cluster, I use the 0.59 build from your site.
And I didn't change any value, except for the crushmap.
That's my ceph.conf.
-martin
[global]
Martin,
Greg is talking about noout. With Ceph, you can specifically preclude
OSDs from being marked out when down to prevent rebalancing--e.g.,
during upgrades, short-term maintenance, etc.
http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#stopping-w-out-rebalancing
On Thu, Mar
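The procedure on that page boils down to wrapping the maintenance window in
the flag; a sketch using osd.1 from this thread:

# keep down OSDs from being marked out during the maintenance
ceph osd set noout
/etc/init.d/ceph stop osd.1
# ... do the maintenance, then bring it back ...
/etc/init.d/ceph start osd.1
# restore normal behaviour
ceph osd unset noout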
Hi John,
I did not set noout on the cluster.
-martin
On 28.03.2013 19:48, John Wilkins wrote:
> Martin,
>
> Greg is talking about noout. With Ceph, you can specifically preclude
> OSDs from being marked out when down to prevent rebalancing--e.g.,
> during upgrades, short-term maintenance, etc.
On 03/28/2013 01:03 AM, Martin Mailand wrote:
Hi,
today one of my mons crashed, the log is here.
http://pastebin.com/ugr1fMJR
I think the most important part is:
2013-03-28 01:57:48.564647 7fac6c0ea700 -1
auth/none/AuthNoneServiceHandler.h: In function 'virtual int
AuthNoneServiceHandler::handl
Hi Joao,
thanks for catching that.
-martin
On 28.03.2013 20:03, Joao Eduardo Luis wrote:
>
> Hi Martin,
>
> As John said in his reply, these should be reported to ceph-devel (CC'ing).
>
> Anyway, this is bug #4519 [1]. It was introduced after 0.58, released
> under 0.59 and is already fix
Hi,
I get the same behavior on a newly created cluster as well, no changes to
the cluster config at all.
I stopped osd.1; after 20 seconds it got marked down. But it never got
marked out.
ceph version 0.59 (cbae6a435c62899f857775f66659de052fb0e759)
-martin
On 28.03.2013 19:48, John Wilkins wrote:
Hmm. The monitor code for checking this all looks good to me. Can you
go to one of your monitor nodes and dump the config?
(http://ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=admin%20socket#viewing-a-configuration-at-runtime)
-Greg
On Thu, Mar 28, 2013 at 12:33 PM, Martin Mailand
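For a monitor, the admin-socket dump looks roughly like this (the socket path
assumes the default layout and a monitor named "a"):

ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show
# or grep a single value, e.g. the down/out timeout
ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show | grep mon_osd_down_out_interval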
Hi Greg,
the dump from mon.a is attached.
-martin
On 28.03.2013 20:55, Gregory Farnum wrote:
> Hmm. The monitor code for checking this all looks good to me. Can you
> go to one of your monitor nodes and dump the config?
> (http://ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=admi
Martin,
Would you mind posting your Ceph configuration file too? I don't see
any value set for "mon_host": ""
On Thu, Mar 28, 2013 at 1:04 PM, Martin Mailand wrote:
> Hi Greg,
>
> the dump from mon.a is attached.
>
> -martin
>
> On 28.03.2013 20:55, Gregory Farnum wrote:
>> Hmm. The monitor cod
Hi John,
my ceph.conf is a bit further down in this email.
-martin
Am 28.03.2013 23:21, schrieb John Wilkins:
Martin,
Would you mind posting your Ceph configuration file too? I don't see
any value set for "mon_host": ""
On Thu, Mar 28, 2013 at 1:04 PM, Martin Mailand wrote:
Hi Greg,
the
Martin,
I'm just speculating: since I just rewrote the networking section and
there is an empty mon_host value, and I do recall a chat last week
where mon_host was considered a different setting now, you might
try specifying:
[mon.a]
mon host = store1
mon addr = 192.168.195.
Hi John,
I did the changes and restarted the cluster, nothing changed.
ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show|grep mon_host
"mon_host": "",
-martin
On 28.03.2013 23:45, John Wilkins wrote:
> Martin,
>
> I'm just speculating: since I just rewrote the networking section
I have a large bucket (about a million objects) and it takes a few days to delete
it. Watching ceph -w, I only see 8 to 30 op/s. What's going on? Thanks.
The command:
radosgw-admin bucket rm --bucket=testbucket --purge-objects
I create an empty 150G volume, then copy it to a second pool:
# rbd -p pool0 create --size 153750 steve150
# /usr/bin/time rbd cp pool0/steve150 pool1/steve150
Image copy: 100% complete...done.
303.44user 233.40system 1:52:10elapsed 7%CPU (0avgtext+0avgdata
248832maxresident)k
Notice there is
Disable the recovery lock file from ctdb completely.
And disable fcntl locking from samba.
To be blunt, unless your cluster filesystem is called GPFS,
locking is probably completely broken and should be avoided.
On Thu, Mar 28, 2013 at 8:46 AM, Marco Aroldi wrote:
> Thanks for the answer,
>
>
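In config terms that would look something like the following (file locations
vary by distribution, and "posix locking" is one common way to turn off
Samba's fcntl byte-range locking; treat this as a sketch):

# /etc/sysconfig/ctdb (or /etc/default/ctdb): leave the recovery lock unset
# CTDB_RECOVERY_LOCK=/clusterfs/.ctdb_recovery_lock

# smb.conf
[global]
    posix locking = no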
On Thu, 28 Mar 2013, ronnie sahlberg wrote:
> Disable the recovery lock file from ctdb completely.
> And disable fcntl locking from samba.
>
> To be blunt, unless your cluster filesystem is called GPFS,
> locking is probably completely broken and should be avoided.
Ha!
> On Thu, Mar 28, 2013 at
On Thu, Mar 28, 2013 at 6:09 PM, Sage Weil wrote:
> On Thu, 28 Mar 2013, ronnie sahlberg wrote:
>> Disable the recovery lock file from ctdb completely.
>> And disable fcntl locking from samba.
>>
>> To be blunt, unless your cluster filesystem is called GPFS,
>> locking is probably completely broke
The ctdb package comes with a tool, "ping_pong", that is used to test
and exercise fcntl() locking.
I think a good test is using this tool and then randomly power-cycling
nodes in your fs cluster, making sure that:
1. fcntl() locking is still coherent and correct
2. it always recovers within 20 seconds for
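Usage is roughly: run it on every node against the same file on the cluster
filesystem, with the lock count set to the number of nodes plus one, and watch
the reported locks/second stay sane while you power-cycle nodes (a sketch; the
path is a placeholder):

# on each samba/ctdb node, against the same file on CephFS
# (3 = number of nodes + 1 for a two-node cluster)
ping_pong /mnt/cephfs/ping_pong_test 3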
This is pretty cool, Sébastien.
On 03/28/2013 02:34 AM, Sebastien Han wrote:
Hello everybody,
Quite recently François Charlier and I worked together on the Puppet
modules for Ceph on behalf of our employer eNovance. In fact, François
started to work on them last summer, back then he achieved th