Re: [ceph-users] RDB

2013-06-17 Thread Gary Bruce
Hi All,

I finally got around to progressing with this but immediately got this
message. Any thoughts?

alphaceph@cephadmin1:~$ rbd create fooimage --size 1024 --pool barpool -m
cephserver1.zion.bt.co.uk -k /etc/ceph/ceph.client.admin.keyring
2013-06-17 08:38:43.955683 7f76a6b72780 -1 did not load config file, using
default settings.
2013-06-17 08:38:43.962518 7f76a6b72780 -1 monclient(hunting): authenticate
NOTE: no keyring found; disabled cephx authentication
2013-06-17 08:38:43.962541 7f76a6b72780  0 librados: client.admin
authentication error (95) Operation not supported
rbd: couldn't connect to the cluster!

FYI...
alphaceph@cephserver1:~$ sudo more /etc/ceph/ceph.client.admin.keyring

[client.admin]
key = AQCp5rNRkBLCHRAAOqfY/24mkYCQZ7sNy/8BDA==
alphaceph@cephserver1:~$ sudo more /etc/ceph/ceph.conf
[global]
fsid = 5e29db66-a1f1-4220-aa19-ab82020adc78
mon_initial_members = cephserver1
mon_host = 10.255.40.22
auth_supported = cephx
osd_journal_size = 1024
filestore_xattr_use_omap = true
osd_crush_chooseleaf_type = 0

Thanks in advance.
Gary


On Tue, Jun 11, 2013 at 8:14 PM, John Wilkins wrote:

> Gary,
>
> I've added that instruction to the docs. It should be up shortly. Let
> me know if you have other feedback for the docs.
>
> Regards,
>
> John
>
> On Mon, Jun 10, 2013 at 9:13 AM, Gary Bruce 
> wrote:
> > Hi again,
> >
> > I don't see anything in http://ceph.com/docs/master/start/ that mentions
> > installing ceph-common or a package that would have it as a dependency on
> > the admin server. If there's a gap in the documentation, I'd like to help
> > address it.
> >
> > If I need to install ceph-common on my admin node, how should I go about
> > doing it as this is not clear from the documentation. Some possible
> > approaches are to run one of these commands from my admin node,
> cephadmin1:
> >
> > *** sudo apt-get install ceph-common
> > *** sudo apt-get install ceph
> > *** ceph-deploy install --stable cuttlefish cephadmin1// I used
> > "ceph-deploy install --stable cuttlefish cephserver1" to install ceph on
> my
> > server node from my admin node.
> >
> > Any thoughts on the most appropriate way to install ceph-common (and
> other
> > required packages) on cephadmin?
> >
> > Thanks
> > Gary
> >
> >
> > On Sun, Jun 9, 2013 at 10:03 AM, Smart Weblications GmbH
> >  wrote:
> >>
> >> Hi,
> >>
> >> Am 09.06.2013 10:42, schrieb Gary Bruce:
> >> > Hi,
> >> >
> >> > I'm trying to run this from my admin node, have I missed a step?
> >> >
> >> >
> >> > alphaceph@cephadmin1:~/ceph-deploy/my-cluster$ rbd create fooimage
> >> > --size 1024
> >> > --pool barpool -m cephserver1.zion.bt.co.uk
> >> > 
> >> > -k /etc/ceph/ceph.client.admin.keyring
> >>
> >>
> >> Look:
> >>
> >> > The program 'rbd' is currently not installed. To run 'rbd' please ask
> >> > your
> >> > administrator to install the package 'ceph-common'
> >> >
> >>
> >> Maybe you missed installing ceph-common on your host cephadmin1
> >>
> >>
> >>
> >> --
> >>
> >> Mit freundlichen Grüßen,
> >>
> >>
> >> Smart Weblications GmbH
> >> Martinsberger Str. 1
> >> D-95119 Naila
> >>
> >> fon.: +49 9282 9638 200
> >> fax.: +49 9282 9638 205
> >> 24/7: +49 900 144 000 00 - 0,99 EUR/Min*
> >> http://www.smart-weblications.de
> >>
> >> --
> >> Sitz der Gesellschaft: Naila
> >> Geschäftsführer: Florian Wiessner
> >> HRB-Nr.: HRB 3840 Amtsgericht Hof
> >> *aus dem dt. Festnetz, ggf. abweichende Preise aus dem Mobilfunknetz
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
>
> --
> John Wilkins
> Senior Technical Writer
> Intank
> john.wilk...@inktank.com
> (415) 425-9599
> http://inktank.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RDB

2013-06-17 Thread Gary Bruce
Hi All,

A bit of an update... I should have run the command from my-cluster
directory. I am now receiving this error:

alphaceph@cephadmin1:~/ceph-deploy/my-cluster$ rbd create fooimage --size
1024 --pool barpool -m cephserver1.zion.bt.co.uk -k
/etc/ceph/ceph.client.admin.keyring
2013-06-17 08:55:14.751204 7f1b28950780 -1 monclient(hunting): ERROR:
missing keyring, cannot use cephx for authentication
2013-06-17 08:55:14.751212 7f1b28950780  0 librados: client.admin
initialization error (2) No such file or directory
rbd: couldn't connect to the cluster!

More FYI
alphaceph@cephadmin1:~/ceph-deploy/my-cluster$ more
ceph.client.admin.keyring

[client.admin]
key = AQCp5rNRkBLCHRAAOqfY/24mkYCQZ7sNy/8BDA==

alphaceph@cephadmin1:~/ceph-deploy/my-cluster$ more ceph.conf
[global]
fsid = 5e29db66-a1f1-4220-aa19-ab82020adc78
mon initial members = cephserver1
mon host = 10.255.40.22
auth supported = cephx
osd journal size = 1024
filestore xattr use omap = true
osd crush chooseleaf type = 0


Thanks
Gary


On Mon, Jun 17, 2013 at 8:37 AM, Gary Bruce wrote:

> Hi All,
>
> I finally got around to progressing with this but immediately got this
> message. Any thoughts?
>
> alphaceph@cephadmin1:~$ rbd create fooimage --size 1024 --pool barpool -m
> cephserver1.zion.bt.co.uk -k /etc/ceph/ceph.client.admin.keyring
> 2013-06-17 08:38:43.955683 7f76a6b72780 -1 did not load config file, using
> default settings.
> 2013-06-17 08:38:43.962518 7f76a6b72780 -1 monclient(hunting):
> authenticate NOTE: no keyring found; disabled cephx authentication
> 2013-06-17 08:38:43.962541 7f76a6b72780  0 librados: client.admin
> authentication error (95) Operation not supported
> rbd: couldn't connect to the cluster!
>
> FYI...
> alphaceph@cephserver1:~$ sudo more /etc/ceph/ceph.client.admin.keyring
>
> [client.admin]
> key = AQCp5rNRkBLCHRAAOqfY/24mkYCQZ7sNy/8BDA==
> alphaceph@cephserver1:~$ sudo more /etc/ceph/ceph.conf
> [global]
> fsid = 5e29db66-a1f1-4220-aa19-ab82020adc78
> mon_initial_members = cephserver1
> mon_host = 10.255.40.22
> auth_supported = cephx
> osd_journal_size = 1024
> filestore_xattr_use_omap = true
> osd_crush_chooseleaf_type = 0
>
> Thanks in advance.
> Gary
>
>
> On Tue, Jun 11, 2013 at 8:14 PM, John Wilkins wrote:
>
>> Gary,
>>
>> I've added that instruction to the docs. It should be up shortly. Let
>> me know if you have other feedback for the docs.
>>
>> Regards,
>>
>> John
>>
>> On Mon, Jun 10, 2013 at 9:13 AM, Gary Bruce 
>> wrote:
>> > Hi again,
>> >
>> > I don't see anything in http://ceph.com/docs/master/start/ that
>> mentions
>> > installing ceph-common or a package that would have it as a dependency
>> on
>> > the admin server. If there's a gap in the documentation, I'd like to
>> help
>> > address it.
>> >
>> > If I need to install ceph-common on my admin node, how should I go about
>> > doing it as this is not clear from the documentation. Some possible
>> > approaches are to run one of these commands from my admin node,
>> cephadmin1:
>> >
>> > *** sudo apt-get install ceph-common
>> > *** sudo apt-get install ceph
>> > *** ceph-deploy install --stable cuttlefish cephadmin1// I used
>> > "ceph-deploy install --stable cuttlefish cephserver1" to install ceph
>> on my
>> > server node from my admin node.
>> >
>> > Any thoughts on the most appropriate way to install ceph-common (and
>> other
>> > required packages) on cephadmin?
>> >
>> > Thanks
>> > Gary
>> >
>> >
>> > On Sun, Jun 9, 2013 at 10:03 AM, Smart Weblications GmbH
>> >  wrote:
>> >>
>> >> Hi,
>> >>
>> >> Am 09.06.2013 10:42, schrieb Gary Bruce:
>> >> > Hi,
>> >> >
>> >> > I'm trying to run this from my admin node, have I missed a step?
>> >> >
>> >> >
>> >> > alphaceph@cephadmin1:~/ceph-deploy/my-cluster$ rbd create fooimage
>> >> > --size 1024
>> >> > --pool barpool -m cephserver1.zion.bt.co.uk
>> >> > 
>> >> > -k /etc/ceph/ceph.client.admin.keyring
>> >>
>> >>
>> >> Look:
>> >>
>> >> > The program 'rbd' is currently not installed. To run 'rbd' please ask
>> >> > your
>> >> > administrator to install the package 'ceph-common'
>> >> >
>> >>
>> >> Maybe you missed installing ceph-common on your host cephadmin1
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Mit freundlichen Grüßen,
>> >>
>> >>
>> >> Smart Weblications GmbH
>> >> Martinsberger Str. 1
>> >> D-95119 Naila
>> >>
>> >> fon.: +49 9282 9638 200
>> >> fax.: +49 9282 9638 205
>> >> 24/7: +49 900 144 000 00 - 0,99 EUR/Min*
>> >> http://www.smart-weblications.de
>> >>
>> >> --
>> >> Sitz der Gesellschaft: Naila
>> >> Geschäftsführer: Florian Wiessner
>> >> HRB-Nr.: HRB 3840 Amtsgericht Hof
>> >> *aus dem dt. Festnetz, ggf. abweichende Preise aus dem Mobilfunknetz
>> >
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>>
>>
>> --
>> John Wilkins
>> Senior Technical Writer
>> Intank
>> john.wilk...@inktank.c

Re: [ceph-users] Disaster recovery of monitor

2013-06-17 Thread peter

On 2013-06-14 19:59, Joao Eduardo Luis wrote:

On 06/14/2013 02:39 PM, pe...@2force.nl wrote:

On 2013-06-13 20:10, pe...@2force.nl wrote:

On 2013-06-13 18:57, Joao Eduardo Luis wrote:

On 06/13/2013 05:25 PM, pe...@2force.nl wrote:

On 2013-06-13 18:06, Gregory Farnum wrote:

On Thursday, June 13, 2013, wrote:


Hello,
We ran into a problem with our test cluster after adding 
monitors. It
now seems that our main monitor doesn't want to start anymore. 
The

logs are flooded with:
2013-06-13 11:41:05.316982 7f7689ca4780  7 mon.a@0(leader).osd 
e2809

update_from_paxos  applying incremental 2810
2013-06-13 11:41:05.317043 7f7689ca4780  1 mon.a@0(leader).osd 
e2809

e2809: 9 osds: 9 up, 9 in
2013-06-13 11:41:05.317064 7f7689ca4780  7 mon.a@0(leader).osd 
e2809

update_from_paxos  applying incremental 2810
Is this accurate? It's applying the *same* incremental I've and 
over

again?

Yes, this is the current state:

Peter,
Can you point me to the full log of the monitor caught in this
apparent loop?
-Joao



Hi Joao,

Here it is:

http://www.2force.nl/ceph/ceph-mon.a.log.gz

Thanks,

Peter



Hi Joao,

Did you happen to figure out what is going on? If you need more log
files let me know.


Peter,

You can find all the updates on #5343 [1].

It is my understanding that you are running a test cluster; is this
correct?  If so, our suggestion is to start your monitor fresh.  We've
been able to figure out all the caused for this issue (thanks for your
help!):





- Injecting a monmap with a wrong fsid was the main culprit.  Given
you are on a version suffering from a bug that won't kill the monitor
if some sanity checks fail when the monitor is started, the monitor
was started even though said fsid mismatch was present.  A fix for
that will be hitting 0.61.4 soon, and has already hit master a few
days back.

- There was a bug in OSDMonitor::update_from_paxos() that would
ignore the return from OSDMap::apply_incremental(), thus leading to
the infinite loop in case the incremental failed to be applied.  That
should go into master soon.


However, with regard to getting the monitor running back again,
there's little we can do at the moment.  We don't believe the fix to
correct the incremental's fsid is necessary, as it should never happen
again once the patches are in and shouldn't even have happened in the
first place were the fsid in the injected monmap to be correct.  So,
if this is indeed a test cluster, it would be better to just start off
fresh; otherwise, let me know and we may be able to put a quick and
dirty fix to get your cluster back again.

Thanks!

  -Joao


[1] - http://tracker.ceph.com/issues/5343


Hi Joao,

You're welcome! Happy that we could help. I was at first hesitant to 
post to the mailinglist because I thought it was just user error. In 
this case it seems that due to our user error we uncovered a bug or at 
least something that should have never happened :) So if anyone out 
there is having the same feeling, just post. You never know what comes 
out.


Are there any other tips you might have for us and other users? Is it 
possible to have a backup of your monitor directory? Or is ensuring you 
have enough monitors enough? Is it possible for errors like this to be 
propagated to other monitors?


It would be really nice of there will be tools that can help with 
disaster recovery and some more documentation on this. I'm sure nobody 
would play around like we did with their live cluster but strange things 
do tend to happen (and bugs) and it is always nice to know if there is a 
way out. You don't want to end up with those petabytes sitting there :)


Thanks!

Peter

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Live Migrations with cephFS

2013-06-17 Thread Sebastien Han
Thank you, Sebastien Han. I am sure many are thankful you've published your thoughts and experiences with Ceph and even OpemStack.Thanks Bo! :)If I may, I would like to reword my question/statement with greater clarity: To force all instances to always boot from RBD volumes, would a person would have to make changes to something more than Horizon (demonstration GUI)? If the changes need only be in Horizon, the provider would then likely need to restrict or deny their customers access to their unmodified APIs. If they do not, then the unchanged APIs would allow for behavior the provider does not want.Thoughts? Corrections? Feel free to teach.This is correct. Forcing the boot from volume requires a modified version of the API which kinda tricky and GUI modifications. There are 2 cases:1. you're an ISP (public provider), you should forget about the idea unless you want to provide a _really_ close service.2. you're the only one managing your platform (private cloud) this might be doable but even so you'll encounter a lot of problems while upgrading. At the end it's up to you, if you're 100% sure that you have the complete control of your infra and that you know when, who and how new instances are booted (and occasionally don't care about update and compatibility). You can always hack the dashboard but it's more than that you have to automate the action that each time someone is booting a VM you have to create a volume from an image for this. This will prolong the process. At this point, I'll recommend you to push this blueprint, it'll run all the VM through ceph even the one not using the boot-from-volume option. https://blueprints.launchpad.net/nova/+spec/bring-rbd-support-libvirt-images-typeAn article is coming next week and will cover the entire subject.Cheers!Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."Phone : +33 (0)1 49 70 99 72 – Mobile : +33 (0)6 52 84 44 70Email : sebastien@enovance.com – Skype : han.sbastienAddress : 10, rue de la Victoire – 75009 ParisWeb : www.enovance.com – Twitter : @enovance
On Jun 17, 2013, at 8:00 AM, Wolfgang Hennerbichler  wrote:On 06/14/2013 08:00 PM, Ilja Maslov wrote:Hi,Is live migration supported with RBD and KVM/OpenStack?Always wanted to know but was afraid to ask :)totally works in my productive setup. but we don't use openstack in thisinstallation, just KVM/RBD.Pardon brevity and formatting, replying from the phone.Cheers,IljaRobert Sander  wrote:On 14.06.2013 12:55, Alvaro Izquierdo Jimeno wrote:By default, openstack uses NFS but… other options are available….can weuse cephFS instead of NFS?Wouldn't you use qemu-rbd for your virtual guests in OpenStack?AFAIK CephFS is not needed for KVM/qemu virtual machines.Regards--Robert SanderHeinlein Support GmbHSchwedter Str. 8/9b, 10119 Berlinhttp://www.heinlein-support.deTel: 030 / 405051-43Fax: 030 / 405051-19Zwangsangaben lt. §35a GmbHG:HRB 93818 B / Amtsgericht Berlin-Charlottenburg,Geschäftsführer: Peer Heinlein -- Sitz: Berlin___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.comThis email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you are not the intended recipient, please note that any review, dissemination, disclosure, alteration, printing, circulation, retention or transmission of this e-mail and/or any file or attachment transmitted with it, is prohibited and may be unlawful. If you have received this e-mail or any file or attachment transmitted with it in error please notify postmas...@openet.com. Although Openet has taken reasonable precautions to ensure no viruses are present in this email, we cannot accept responsibility for any loss or damage arising from the use of this email or attachments.___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com-- DI (FH) Wolfgang HennerbichlerSoftware DevelopmentUnit Advanced Computing TechnologiesRISC Software GmbHA company of the Johannes Kepler University LinzIT-CenterSoftwarepark 354232 HagenbergAustriaPhone: +43 7236 3343 245Fax: +43 7236 3343 250wolfgang.hennerbich...@risc-software.athttp://www.risc-software.at___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Jaime Melis
Hi Jens,

with regard to OpenNebula I would like to point out a couple of things.
OpenNebula has official support not just for CentOS but for three other
distros, among which there's Ubuntu, which as far as I know has Ceph and
rbd supported libvirt and qemu-kvm versions.

Also, as far as I know there's a current effort at CentOS to provide an rpm
repo with a full Ceph stack with updated libvirt and qemu-kvm versions
which should be ready by the end of the month. So setting up a working
cluster with Ceph (that works as of now) and deploying newer libvirt and
qemu-kvm version compatible with rbd will be very easy in just a matter of
weeks.

However, from the point-of-view of OpenNebula all of this is pretty much
transparent, the Ceph support is working and as long as you have the
requirements stated above you should be good to go.

regards,
Jaime


On Sun, Jun 16, 2013 at 7:48 PM, Jens Kristian Søgaard <
j...@mermaidconsulting.dk> wrote:

> Hi guys,
>
> I'm looking to setup an open source cloud IaaS system that will work well
> together with Ceph. I'm looking for a system that will handle running KVM
> virtual servers with persistent storage on a number of physical servers
> with a multi-tenant dashboard.
>
> I have now tried a number of systems, but having difficulties finding
> something that will work with Ceph in an optimal way. Or at least, having
> difficulties finding hints on how to achieve that.
>
> By optimal I mean:
>
> a) To have Ceph as the only storage, so that I don't have a NFS SPoF nor
> have to wait for images to be copied from server to server.
>
> b) To run KVM with the async flush feature in 1.4.2 (or backported) and
> with the librbd cache.
>
>
> Any of you guys are doing this? - have hints to offer?
>
> I have tried CloudStack, but found that it was not possible to rely fully
> on Ceph storage. I learnt that it would potentially be possible with the
> upcoming 4.2 release, so I tried installed CloudStack from the development
> source code tree. I wasn't able to get this working because of various bugs
> (to be expected when running a development version ofcourse).
>
> I also tried OpenNebula, but found that it was very hard to get working on
> the recommended CentOS 6.4 distribution. By upgrading all sorts of systems
> and manually patching parts of the system I was able to get it "almost
> working". However in the end, I ended up in a dilemma where OpenNebula
> needed a newer qemu version to support RBDs and that newer qemu didn't work
> well with the older libvirt. On the other hand if I upgraded libvirt, I
> couldn't get it to work with the older qemu versions with backported RBD
> support, as the newer libvirt where setting an auth_supported=none option
> that stopped it from working. It didn't seem possible to convince
> OpenNebula to store a secret for Ceph with libvirt.
>
> I have been looking at OpenStack, but by reading the documentation and
> googling it seems that it is not possible to configure OpenStack to use the
> librbd cache with Ceph. Could this be right?
>
> Or is it merely the case, that you cannot configure it on a per-VM basis,
> so that you have to rely on the default settings in ceph.conf? (which
> wouldn't be a problem for me)
>
> Any advice you could give would be greatly appreciated!
>
> Thanks,
> --
> Jens Kristian Søgaard, Mermaid Consulting ApS,
> j...@mermaidconsulting.dk,
> http://www.mermaidconsulting.**com/ 
> __**_
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/**listinfo.cgi/ceph-users-ceph.**com
>



-- 
Join us at OpenNebulaConf2013  in Berlin, 24-26
September, 2013
--
Jaime Melis
Project Engineer
OpenNebula - The Open Source Toolkit for Cloud Computing
www.OpenNebula.org | jme...@opennebula.org
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Jens Kristian Søgaard

Hi Jaime,

We spoke on IRC when I was trying to setup OpenNebula. Thanks for all 
the help and hints there!


It is right that I found that my primary problem was that I choose 
CentOS 6.4 from the list of supported distributions, as that is the one 
I'm most comfortable with.


If I had chosen Ubuntu from the get go, I would have have run into far 
fewer problems.


However, I don't think OpenNebula currently fulfills the goals I set up. 
If it indeed does, it would be really nice - and I would start over 
setting up OpenNebula on Ubuntu instead.


My problems with OpenNebula as far as goals go are:

Reg. goal a) From my initial experience, it seems I cannot rely solely 
on Ceph storage. Images have to be copied back and forth between the 
servers.


Reg. goal b) The qemu-kvm binary in the supported Ubuntu 12.10 
distribution does not include async flush. I don't know if this is 
available as a backport from somewhere else, as my attempts to simply 
upgrade qemu didn't go well.


--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Wolfgang Hennerbichler


On 06/17/2013 12:51 PM, Jens Kristian Søgaard wrote:

> Reg. goal b) The qemu-kvm binary in the supported Ubuntu 12.10
> distribution does not include async flush. I don't know if this is
> available as a backport from somewhere else, as my attempts to simply
> upgrade qemu didn't go well.

I've packaged those for ubuntu 12.04 amd64, and you can download them here:
http://www.wogri.at/Qemu-Ceph-Packages.343.0.html

Wolfgang
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Jens Kristian Søgaard

Hi Wolfgang,


I've packaged those for ubuntu 12.04 amd64, and you can download them here:


Thanks for the link!

I'm not that familiar with Ubuntu, so sorry for the stupid question.

Will this .dev be compatible with 12.10?

OpenNebula doesn't list 12.04 as a supported distribution, so I'm more 
inclined to 12.10.


--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Wolfgang Hennerbichler
On 06/17/2013 01:03 PM, Jens Kristian Søgaard wrote:
> Hi Wolfgang,
> 
>> I've packaged those for ubuntu 12.04 amd64, and you can download them
>> here:
> 
> Thanks for the link!

no problem.

> I'm not that familiar with Ubuntu, so sorry for the stupid question.
> 
> Will this .dev be compatible with 12.10?

hm. I don't think so. But I haven't tried. The debian packages define
dependencies (just like RPM), and if those dependencies aren't met it
won't install. So the worst that can happen to you is that you have to
build qemu by hand, which wasn't really too hard (and I'm not a big fan
of do-it-yourself-compiling or makefiles, too)

> OpenNebula doesn't list 12.04 as a supported distribution, so I'm more
> inclined to 12.10.

it seems you're doomed :)


-- 
DI (FH) Wolfgang Hennerbichler
Software Development
Unit Advanced Computing Technologies
RISC Software GmbH
A company of the Johannes Kepler University Linz

IT-Center
Softwarepark 35
4232 Hagenberg
Austria

Phone: +43 7236 3343 245
Fax: +43 7236 3343 250
wolfgang.hennerbich...@risc-software.at
http://www.risc-software.at
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Jens Kristian Søgaard

Hi Wolfgang,


won't install. So the worst that can happen to you is that you have to
build qemu by hand, which wasn't really too hard (and I'm not a big fan
of do-it-yourself-compiling or makefiles, too)


Well, as a veteran C-programmer I have no problems compiling things or 
tweaking Makefiles - that's not the issue.


My experience with OpenNebula on CentOS 6.4 was that when I manually 
compiled and installed qemu 1.4.2 it broke compatibility with 
OpenNebula/libvirt. Upgrading libvirt to the newest version only seemed 
to make matters worse.


However, on CentOS I found the RPMs on ceph.com with the patches 
backported to the existing qemu version on CentOS - and that worked fine 
as far as qemu-kvm goes.


--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Stratos Psomadakis
On 06/16/2013 08:48 PM, Jens Kristian Søgaard wrote:
> Hi guys,
>
> I'm looking to setup an open source cloud IaaS system that will work
> well together with Ceph. I'm looking for a system that will handle
> running KVM virtual servers with persistent storage on a number of
> physical servers with a multi-tenant dashboard.
>
> I have now tried a number of systems, but having difficulties finding
> something that will work with Ceph in an optimal way. Or at least,
> having difficulties finding hints on how to achieve that.
>
> By optimal I mean:
>
> a) To have Ceph as the only storage, so that I don't have a NFS SPoF
> nor have to wait for images to be copied from server to server.
>
> b) To run KVM with the async flush feature in 1.4.2 (or backported)
> and with the librbd cache.
>
>
> Any of you guys are doing this? - have hints to offer?
>
> I have tried CloudStack, but found that it was not possible to rely
> fully on Ceph storage. I learnt that it would potentially be possible
> with the upcoming 4.2 release, so I tried installed CloudStack from
> the development source code tree. I wasn't able to get this working
> because of various bugs (to be expected when running a development
> version ofcourse).
>
> I also tried OpenNebula, but found that it was very hard to get
> working on the recommended CentOS 6.4 distribution. By upgrading all
> sorts of systems and manually patching parts of the system I was able
> to get it "almost working". However in the end, I ended up in a
> dilemma where OpenNebula needed a newer qemu version to support RBDs
> and that newer qemu didn't work well with the older libvirt. On the
> other hand if I upgraded libvirt, I couldn't get it to work with the
> older qemu versions with backported RBD support, as the newer libvirt
> where setting an auth_supported=none option that stopped it from
> working. It didn't seem possible to convince OpenNebula to store a
> secret for Ceph with libvirt.
>
> I have been looking at OpenStack, but by reading the documentation and
> googling it seems that it is not possible to configure OpenStack to
> use the librbd cache with Ceph. Could this be right?
>
> Or is it merely the case, that you cannot configure it on a per-VM
> basis, so that you have to rely on the default settings in ceph.conf?
> (which wouldn't be a problem for me)
>
> Any advice you could give would be greatly appreciated!
>
> Thanks,

Hi,

you might want to take a look at Synnefo. [1]

Synnefo is a complete open source cloud IaaS platform, which uses Google
Ganeti [2] for the VM cluster management at the backend and implements /
exposes OpenStack APIs at the frontend. Synnefo supports Ceph / RBD on
the API layer, as a 'disk template' when creating VMs, and passes that
information to Ganeti, which actually does the RBD device handling.

At the moment Ganeti only supports the in-kernel RBD driver, although
support for the qemu-rbd driver should be implemented soon. Using the
in-kernel RBD driver means that you should probably run a relatively
modern kernel, but it also means that caching and flushing is handled by
the kernel mechanisms (page cache, block layer etc), without the need to
rely on specific qemu / libvirt versions to support them. Ganeti does
*not* use libvirt in the backend and supports out-of-the-box both KVM
and Xen.

You can also read this blog post [3] for more information, to see how we
use Synnefo + Ganeti + Ceph to power a large scale public cloud service.

[1] http://www.synnefo.org
[2] https://code.google.com/p/ganeti/
[3]
http://synnefo-software.blogspot.gr/2013/02/we-are-happy-to-announce-that-synnefo_11.html

Thanks,
Stratos

-- 
Stratos Psomadakis





signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph and open source cloud software: Path of least resistance

2013-06-17 Thread Jens Kristian Søgaard

Hi Stratos,


you might want to take a look at Synnefo. [1]


I did take a look at it earlier, but decided not to test it.

Mainly I was deterred because I found the documentation a bit lacking. I 
opened up the section on File Storage and found that there were only 
chapter titles, but no actual content. Perhaps I was too quick to 
dismiss it.


A bit more practical problem for me was that my test equipment consists 
of a single server (besides the Ceph cluster). As far as I understood 
the docs, there was a bug that makes it impossible to run Synnefo on a 
single server (to be fixed in the next version)?


Regarding my goals, I read through the installation guide and it 
recommends setting up an NFS server on one of the servers to serve 
images to the rest. This is what I wanted to avoid. Is that optional 
and/or could be replaced with Ceph?



At the moment Ganeti only supports the in-kernel RBD driver, although
support for the qemu-rbd driver should be implemented soon. Using the


Hmm, I wanted to avoid using the in-kernel RBD driver, as I figured it 
lead to various problems. Is it not a problem in practice?


I was thinking it would be wisest to stay with the distribution kernel, 
but I guess you swap it out for a later version?


The rbds for all my existing VMs would probably have to be converted 
back from format 2 to format 1, right?


--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Upgrade from bobtail

2013-06-17 Thread Wolfgang Hennerbichler
Hi, i'm planning to Upgrade my bobtail (latest) cluster to cuttlefish. Are 
there any outstanding issues that I should be aware of? Anything that could 
brake my productive setup?

Wolfgang

-- 
Sent from my mobile device
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade from bobtail

2013-06-17 Thread Stefan Priebe - Profihost AG
Hi,

http://tracker.ceph.com/issues/5232

http://tracker.ceph.com/issues/5238

http://tracker.ceph.com/issues/5375

Stefan

This mail was sent from my iPhone.

Am 17.06.2013 um 17:06 schrieb Wolfgang Hennerbichler 
:

> Hi, i'm planning to Upgrade my bobtail (latest) cluster to cuttlefish. Are 
> there any outstanding issues that I should be aware of? Anything that could 
> brake my productive setup?
> 
> Wolfgang
> 
> -- 
> Sent from my mobile device
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade from bobtail

2013-06-17 Thread Sage Weil
On Mon, 17 Jun 2013, Wolfgang Hennerbichler wrote:
> Hi, i'm planning to Upgrade my bobtail (latest) cluster to cuttlefish. 
> Are there any outstanding issues that I should be aware of? Anything 
> that could brake my productive setup?

There will be another point release out in the next day or two that 
resolves a rare sequence of errors during the upgrade that can be 
problematic (see the 0.61.3 release notes).  There are also several fixes 
for udev/ceph-disk/ceph-deploy on rpm-based distros that will be included.  
If you can wait a couple days I would suggest that.

sage

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Need help with Ceph error

2013-06-17 Thread Gregory Farnum
Yep, you can't connect to your monitors so nothing else is going to
work either. There's a wealth of conversations about debugging monitor
connection issues in the mailing list and irc archives (and I think
some in the docs), but as a quick start list:
1) make sure the monitor processes are actually running in top
2) connect to them using the admin socket and see what state they
think they're in
3) see if you can connect to them from their host instead of a
different one (different keys might be present).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sun, Jun 16, 2013 at 11:48 PM, Sreejith Keeriyattil
 wrote:
> Hi
> The issue it hangs when I type any ceph commands.. :(
> ===
> root@xtream:~# ceph -s
> ^C
> root@xtream:~# service ceph start
> === mds.a ===
> Starting Ceph mds.a on xtream...already running
> === osd.0 ===
> Mounting xfs on xtream:/var/lib/ceph/osd/ceph-0
> ^C
> root@xtream:~#
>
> ==
> Thanks and regards
> Sreejith KJ
>
> -Original Message-
> From: Gregory Farnum [mailto:g...@inktank.com]
> Sent: Friday, June 14, 2013 9:07 PM
> To: Sreejith Keeriyattil
> Cc: ceph-us...@ceph.com
> Subject: Re: [ceph-users] Need help with Ceph error
>
> On Fri, Jun 14, 2013 at 12:20 AM, Sreejith Keeriyattil 
>  wrote:
>> Hi
>>
>> To keep it simple I disabled cephx authentication but after that am
>> getting the below error.
>>
>> ==
>> ==
>>
>> root@xtream:/etc/ceph# service ceph  -v start
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.a "user"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "user"
>>
>> === mds.a ===
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "pid file"
>>
>> --- xtream# mkdir -p /var/run/ceph
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "log dir"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "auto start"
>>
>> --- xtream# [ -e /var/run/ceph/mds.a.pid ] || exit 1   # no pid, presumably
>> not
>>
>> pid=`cat /var/run/ceph/mds.a.pid`
>>
>> [ -e /proc/$pid ] && grep -q ceph-mds /proc/$pid/cmdline &&
>> grep -qwe -i
>>
>> exit 1  # pid is something else
>>
>> Starting Ceph mds.a on xtream...already running
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "user"
>>
>> === osd.0 ===
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "pid file"
>>
>> --- xtream# mkdir -p /var/run/ceph
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "log dir"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "auto start"
>>
>> --- xtream# [ -e /var/run/ceph/osd.0.pid ] || exit 1   # no pid, presumably
>> not
>>
>> pid=`cat /var/run/ceph/osd.0.pid`
>>
>> [ -e /proc/$pid ] && grep -q ceph-osd /proc/$pid/cmdline &&
>> grep -qwe -i
>>
>> exit 1  # pid is something else
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "copy executable to"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "osd data"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "fs path"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "devs"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "lock file"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "admin socket"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "max open files"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "restart on core dump"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "valgrind"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "pre mount command"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "osd mkfs type"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "osd mount options xfs"
>>
>> --- xtream# true
>>
>> Mounting xfs on xtream:/var/lib/ceph/osd/ceph-0
>>
>> --- xtream# modprobe xfs ; egrep -q '^[^ ]+ /var/lib/ceph/osd/ceph-0'
>> /proc/moun
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "osd crush update
>> on start"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "osd crush location"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "osd crush initial
>> weight"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "keyring"
>>
>>
>>
>> It hangs after this
>>
>> My ceph.conf file looks like this
>
> I think you're still having issues connecting to your monitors. Can you run 
> "ceph -s" and provide the output?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> 
> Happiest Minds Disclaimer
>
> This message is for the sole use of the intended recipient(s) and may contain 
> confidential, proprietary or legally privileged information. Any unauthorized 
> review, use, disclosure or distribution is prohibited. If you are not the 
> original intended recipient of the message, please contact the sender by 
> reply email and destroy all copies of the original message.
>
>

Re: [ceph-users] Upgrade from bobtail

2013-06-17 Thread Travis Rhoden
I'm actually planning this same upgrade on Saturday.  Is the memory
leak from Bobtail during deep-scrub known to be squashed?  I've been
seeing that a lot lately.

I know Bobtail->Cuttlefish is only one way, due to the mon
re-architecting.  But in general, whenever we do upgrades we usually
have a fall-back/reversion plan in case things go wrong.  Is that ever
going to be possible with Ceph?

 - Travis

On Mon, Jun 17, 2013 at 12:27 PM, Sage Weil  wrote:
> On Mon, 17 Jun 2013, Wolfgang Hennerbichler wrote:
>> Hi, i'm planning to Upgrade my bobtail (latest) cluster to cuttlefish.
>> Are there any outstanding issues that I should be aware of? Anything
>> that could brake my productive setup?
>
> There will be another point release out in the next day or two that
> resolves a rare sequence of errors during the upgrade that can be
> problematic (see the 0.61.3 release notes).  There are also several fixes
> for udev/ceph-disk/ceph-deploy on rpm-based distros that will be included.
> If you can wait a couple days I would suggest that.
>
> sage
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Influencing reads/writes

2013-06-17 Thread Gregory Farnum
On Sun, Jun 16, 2013 at 11:10 PM, Wolfgang Hennerbichler
 wrote:
>
>
> On 06/16/2013 01:27 AM, Matthew Walster wrote:
>> In the same way that we have CRUSH maps for determining placement
>> groups, I was wondering if anyone had stumbled across a way to influence
>> a *client* (be it CephFS or RBD) as to where they should read/write data
>> from/to.
>
> I think the concept of CRUSH doesn't really involve your wish, to write
> to specific locations (i was wishing for the same some time in the past,
> and then I RTFM'ed more, and in the end found out that this wish is not
> very trivial to implement). Although reading locally is possible, as
> michael lowe stated in his other e-mail.

This is correct. What you can do is set up pools with different rules,
such that you have a "West" and "East" pool, you split up your OSDs
into "west" and "east" groups, and then the pool selects a primary
from the matching set of OSDs and a secondary/tertiary from the other.
Then put the RBD images in the appropriate pool.

On Sat, Jun 15, 2013 at 6:25 PM, Michael Lowe  wrote:
> My read of http://ceph.com/releases/v0-63-released/ has this for rbd reads
> in the dev branch.
FYI, right now this really is just *local*, ie you're on the same
host. We've had low-intensity discussions around enhancing this core
functionality to be a "read-from-closest" model for a while now, but
have yet to implement anything because it's much more complex than the
simple hack we're currently using and requires stuff like interpreting
a closeness model (probably based on CRUSH, but we don't have any
similar functionality to borrow from).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] subscribe

2013-06-17 Thread StClair, John

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph auth get-or-create

2013-06-17 Thread Amit Vijairania
Hello!

How do you add new pool access to existing Ceph Client?

e.g.
At first create a new user -- openstack-volumes:

ceph auth get-or-create client.openstack-volumes mon 'allow r' osd 'allow
class-read object_prefix rbd_children, allow rwx *pool=openstack-volumes*,
allow rx pool=openstack-images'

Add another pool for this user to access -- openstack-volumes:

ceph auth get-or-create client.openstack-volumes mon 'allow r' osd 'allow
class-read object_prefix rbd_children, allow rwx *pool=openstack-volumes*,
allow rwx *pool=openstack-volumes-2*, allow rx pool=openstack-images'

Thanks!

Amit Vijairania  |  978.319.3684
--*--
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to remove /var/lib/ceph/osd/ceph-2?

2013-06-17 Thread Craig Lewis
If you followed the standard setup, each OSD is it's own disk +
filesystem.  /var/lib/ceph/osd/ceph-2 is in use, as the mount point for
the OSD.2 filesystem.  Double check by examining the output of the
`mount` command.

I get the same error when I try to rename a directory that's used as a
mount point.

Try `umount /var/lib/ceph/osd/ceph-2` instead of the mv and rm.  The
fuser command is telling you that the kernel has a filesystem mounted in
that directory.  Nothing else appears to be using it, so the umount
should complete successfully.


Also, you should fix that time skew on mon.ceph-node5.  The mailing list
archives have several posts with good answers.


On 6/15/2013 2:14 AM, Da Chun wrote:
> Hi all,
> On Ubuntu 13.04 with ceph 0.61.3.
> I want to remove osd.2 from my cluster. The following steps were
> performed:
> root@ceph-node6:~# ceph osd out osd.2
> marked out osd.2.
> root@ceph-node6:~# ceph -w
>health HEALTH_WARN clock skew detected on mon.ceph-node5
>monmap e1: 3 mons at
> {ceph-node4=172.18.46.34:6789/0,ceph-node5=172.18.46.35:6789/0,ceph-node6=172.18.46.36:6789/0},
> election epoch 124, quorum 0,1,2 ceph-node4,ceph-node5,ceph-node6
>osdmap e414: 6 osds: 5 up, 5 in
> pgmap v10540: 456 pgs: 456 active+clean; 12171 MB data, 24325 MB
> used, 50360 MB / 74685 MB avail
>mdsmap e102: 1/1/1 up {0=ceph-node4=up:active}
>
> 2013-06-15 16:55:22.096059 mon.0 [INF] pgmap v10540: 456 pgs: 456
> active+clean; 12171 MB data, 24325 MB used, 50360 MB / 74685 MB avail
> ^C
> root@ceph-node6:~# stop ceph-osd id=2
> ceph-osd stop/waiting
> root@ceph-node6:~# ceph osd crush remove osd.2
> removed item id 2 name 'osd.2' from crush map
> root@ceph-node6:~# ceph auth del osd.2
> updated
> root@ceph-node6:~# ceph osd rm 2
> removed osd.2
> root@ceph-node6:~# mv /var/lib/ceph/osd/ceph-2
> /var/lib/ceph/osd/ceph-2.bak
> mv: cannot move '/var/lib/ceph/osd/ceph-2' to
> '/var/lib/ceph/osd/ceph-2.bak': Device or resource busy
>
> Everything was working OK until the last step to remove the osd.2
> directory /var/lib/ceph/osd/ceph-2.
> root@ceph-node6:~# fuser -v /var/lib/ceph/osd/ceph-2
>  USERPID ACCESS COMMAND
> /var/lib/ceph/osd/ceph-2:
>  root kernel mount /var/lib/ceph/osd/ceph-2  
> // What does this mean?
> root@ceph-node6:~# lsof +D /var/lib/ceph/osd/ceph-2
> root@ceph-node6:~#
>
> I restarted the system, and found that the osd.2 daemon was still running:
> root@ceph-node6:~# ps aux | grep osd
> root  1264  1.4 12.3 550940 125732 ?   Ssl  16:41   0:20
> /usr/bin/ceph-osd --cluster=ceph -i 2 -f
> root  2876  0.0  0.0   4440   628 ?Ss   16:44   0:00
> /bin/sh -e -c /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id"
> -f /bin/sh
> root  2877  4.9 18.2 613780 185676 ?   Sl   16:44   1:04
> /usr/bin/ceph-osd --cluster=ceph -i 5 -f
>
> I have to take this workaround:
> root@ceph-node6:~# rm -rf /var/lib/ceph/osd/ceph-2
> rm: cannot remove '/var/lib/ceph/osd/ceph-2': Device or resource busy
> root@ceph-node6:~# ls /var/lib/ceph/osd/ceph-2
> root@ceph-node6:~# shutdown -r now
> 
> root@ceph-node6:~# ps aux | grep osd
> root  1416  0.0  0.0   4440   628 ?Ss   17:10   0:00
> /bin/sh -e -c /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id"
> -f /bin/sh
> root  1417  8.9  5.8 468052 59868 ?Sl   17:10   0:02
> /usr/bin/ceph-osd --cluster=ceph -i 5 -f
> root@ceph-node6:~# rm -r /var/lib/ceph/osd/ceph-2
> root@ceph-node6:~#
>
> Any idea? HELP!
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Simulating Disk Failure

2013-06-17 Thread Craig Lewis

Thanks.  I'll have to get more creative.  :-)


On 6/14/13 18:19 , Gregory Farnum wrote:
Yeah. You've picked up on some warty bits of Ceph's error handling 
here for sure, but it's exacerbated by the fact that you're not 
simulating what you think. In a real disk error situation the 
filesystem would be returning EIO or something, but here it's 
returning ENOENT. Since the OSD is authoritative for that key space 
and the filesystem says there is no such object, presto! It doesn't 
exist.
If you restart the OSD it does a scan of the PGs on-disk as well as 
what it should have, and can pick up on the data not being there and 
recover. But "correctly" handling data that has been (from the local 
FS' perspective) properly deleted under a running process would 
require huge and expensive contortions on the part of the daemon (in 
any distributed system that I can think of).

-Greg

On Friday, June 14, 2013, Craig Lewis wrote:

So I'm trying to break my test cluster, and figure out how to put
it back together again.  I'm able to fix this, but the behavior
seems strange to me, so I wanted to run it past more experienced
people.

I'm doing these tests using RadosGW.  I currently have 2 nodes,
with replication=2.  (I haven't gotten to the cluster expansion
testing yet).

I'm going to upload a file, then simulate a disk failure by
deleting some PGs on one of the OSDs.  I have seen this mentioned
as the way to fix OSDs that filled up during recovery/backfill.  I
expected the cluster to detect the error, change the cluster
health to warn, then return the data from another copy.  Instead,
I got a 404 error.



me@client ~ $ s3cmd ls
2013-06-12 00:02  s3://bucket1

me@client ~ $ s3cmd ls s3://bucket1
2013-06-12 00:0213 8ddd8be4b179a529afa5f2ffae4b9858 
s3://bucket1/hello.txt


me@client ~ $ s3cmd put Object1 s3://bucket1
Object1 -> s3://bucket1/Object1  [1 of 1]
 4 of 4   100% in   62s 6.13 MB/s  done

 me@client ~ $ s3cmd ls s3://bucket1
 2013-06-13 01:10   381M 15bdad3e014ca5f5c9e5c706e17d65f3 
s3://bucket1/Object1
 2013-06-12 00:0213 8ddd8be4b179a529afa5f2ffae4b9858 
s3://bucket1/hello.txt






So at this point, the cluster is healthy, and we can download
objects from RGW.


me@dev-ceph0:/var/lib/ceph/osd/ceph-0/current$ ceph status
   health HEALTH_OK
   monmap e2: 2 mons at
{dev-ceph0=192.168.18.24:6789/0,dev-ceph1=192.168.18.25:6789/0
},
election epoch 12, quorum 0,1 dev-ceph0,dev-ceph1
   osdmap e44: 2 osds: 2 up, 2 in
pgmap v4055: 248 pgs: 248 active+clean; 2852 MB data, 7941 MB
used, 94406 MB / 102347 MB avail; 17B/s rd, 0op/s
   mdsmap e1: 0/0/1 up

me@client ~ $ s3cmd get s3://bucket1/Object1 ./Object.Download1
s3://bucket1/Object1 -> ./Object.Download1  [1 of 1]
 4 of 4   100% in 13s27.63 MB/s  done






Time to simulate a failure.  Let's delete all the PGs used by
.rgw.buckets on OSD.0.

me@dev-ceph0:~$ ceph osd tree

# idweighttype nameup/downreweight
-10.09998root default
-20.04999host dev-ceph0
00.04999osd.0up1
-30.04999host dev-ceph1
10.04999osd.1up1


me@dev-ceph0:~$ ceph osd dump | grep .rgw.buckets
pool 9 '.rgw.buckets' rep size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 21 owner
18446744073709551615

me@dev-ceph0:~$ cd /var/lib/ceph/osd/ceph-0/current
me@dev-ceph0:/var/lib/ceph/osd/ceph-0/current$ du -sh 9.*
321M9.0_head
289M9.1_head
425M9.2_head
357M9.3_head
358M9.4_head
309M9.5_head
401M9.6_head
397M9.7_head

me@dev-ceph0:/var/lib/ceph/osd/ceph-0/current$ sudo rm -rf 9.*




The cluster is still healthy

me@dev-ceph0:/var/lib/ceph/osd/ceph-0/current$ ceph status
   health HEALTH_OK
   monmap e2: 2 mons at
{dev-ceph0=192.168.18.24:6789/0,dev-ceph1=192.168.18.25:6789/0
},
election epoch 12, quorum 0,1 dev-ceph0,dev-ceph1
   osdmap e44: 2 osds: 2 up, 2 in
pgmap v4059: 248 pgs: 248 active+clean; 2852 MB data, 7941 MB
used, 94406 MB / 102347 MB avail; 16071KB/s rd, 3op/s
   mdsmap e1: 0/0/1 up




It probably hasn't noticed the damage yet, there's no I/O on this
test cluster unless I generate it.  Lets retrieve some data,
that'll make the cluster notice.

me@client ~ $ s3cmd get s3://bucket1/Object1 ./Object.Download2
s3://bucket1/Object1 -> ./Object.Download2  [1 of 1]
ERROR: S3 error: 404 (Not Found):

me@client ~ $ s3cmd ls s3://bucket1
ERROR: S3 error: 404 (NoSuchKey):



I wasn't expecting that.  I expected my obj

Re: [ceph-users] rbd rm results in osd marked down wrongly with 0.61.3

2013-06-17 Thread Sage Weil
Hi Florian,

If you can trigger this with logs, we're very eager to see what they say 
about this!  The http://tracker.ceph.com/issues/5336 bug is open to track 
this issue.

Thanks!
sage


On Thu, 13 Jun 2013, Smart Weblications GmbH - Florian Wiessner wrote:

> Hi,
> 
> Is really no one on the list interrested in fixing this? Or am i the only one
> having this kind of bug/problem?
> 
> Am 11.06.2013 16:19, schrieb Smart Weblications GmbH - Florian Wiessner:
> > Hi List,
> > 
> > i observed that an rbd rm  results in some osds mark one osd as down
> > wrongly in cuttlefish.
> > 
> > The situation gets even worse if there are more than one rbd rm  
> > running
> > in parallel.
> > 
> > Please see attached logfiles. The rbd rm command was issued on 20:24:00 via
> > cronjob, 40 seconds later the osd 6 got marked down...
> > 
> > 
> > ceph osd tree
> > 
> > # idweight  type name   up/down reweight
> > -1  7   pool default
> > -3  7   rack unknownrack
> > -2  1   host node01
> > 0   1   osd.0   up  1
> > -4  1   host node02
> > 1   1   osd.1   up  1
> > -5  1   host node03
> > 2   1   osd.2   up  1
> > -6  1   host node04
> > 3   1   osd.3   up  1
> > -7  1   host node06
> > 5   1   osd.5   up  1
> > -8  1   host node05
> > 4   1   osd.4   up  1
> > -9  1   host node07
> > 6   1   osd.6   up  1
> > 
> > 
> > I have seen some patches to parallelize rbd rm, but i think there must be 
> > some
> > other issue, as my clients seem to not be able to do IO when ceph is
> > recovering... I think this has worked better in 0.56.x - there was IO while
> > recovering.
> > 
> > I also observed in the log of osd.6 that after heartbeat_map reset_timeout, 
> > the
> > osd tries to connect to the other osds, but it retries so fast that you 
> > could
> > think this is a DoS attack...
> > 
> > 
> > Please advise..
> > 
> 
> 
> -- 
> 
> Mit freundlichen Gr??en,
> 
> Florian Wiessner
> 
> Smart Weblications GmbH
> Martinsberger Str. 1
> D-95119 Naila
> 
> fon.: +49 9282 9638 200
> fax.: +49 9282 9638 205
> 24/7: +49 900 144 000 00 - 0,99 EUR/Min*
> http://www.smart-weblications.de
> 
> --
> Sitz der Gesellschaft: Naila
> Gesch?ftsf?hrer: Florian Wiessner
> HRB-Nr.: HRB 3840 Amtsgericht Hof
> *aus dem dt. Festnetz, ggf. abweichende Preise aus dem Mobilfunknetz
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade from bobtail

2013-06-17 Thread Wolfgang Hennerbichler
On Mon, Jun 17, 2013 at 02:10:27PM -0400, Travis Rhoden wrote:
> I'm actually planning this same upgrade on Saturday.  Is the memory
> leak from Bobtail during deep-scrub known to be squashed?  I've been
> seeing that a lot lately.

this is actually the reason why we're planning to upgrade, too. one of the 
OSD's went nuts yesterday, and ate up all the memory. Ceph exploded, but - and 
this is the good news - it recovered smoothly. 

> I know Bobtail->Cuttlefish is only one way, due to the mon
> re-architecting.  But in general, whenever we do upgrades we usually
> have a fall-back/reversion plan in case things go wrong.  Is that ever
> going to be possible with Ceph?

just from my guts i guess this will stabilize when the mon architecture 
stabilizes. but ceph is young, and young means going forward only. 
 
>  - Travis
> 
> On Mon, Jun 17, 2013 at 12:27 PM, Sage Weil  wrote:
> > On Mon, 17 Jun 2013, Wolfgang Hennerbichler wrote:
> >> Hi, i'm planning to Upgrade my bobtail (latest) cluster to cuttlefish.
> >> Are there any outstanding issues that I should be aware of? Anything
> >> that could brake my productive setup?
> >
> > There will be another point release out in the next day or two that
> > resolves a rare sequence of errors during the upgrade that can be
> > problematic (see the 0.61.3 release notes).  There are also several fixes
> > for udev/ceph-disk/ceph-deploy on rpm-based distros that will be included.
> > If you can wait a couple days I would suggest that.
> >
> > sage
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
http://www.wogri.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph iscsi questions

2013-06-17 Thread Da Chun
Hi List,

I want to deploy a ceph cluster with latest cuttlefish, and export it with 
iscsi interface to my applications.
Some questions here:
1. Which Linux distro and release would you recommend? I used Ubuntu 13.04 for 
testing purpose before.
2. Which iscsi target is better? LIO, SCST, or others?
3. The system for the iscsi target will be a single point of failure. How to 
eliminate it and make good use of ceph's nature of distribution?


Thanks!___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy issues rhel6

2013-06-17 Thread Gary Lowell
Hi Derek -

If you are still having problems with ceph-deploy, please forward the ceph.log 
file to me, I can start trying to figure out what's gone wrong. 

Thanks,
Gary


On Jun 12, 2013, at 7:09 PM, Derek Yarnell  wrote:

> Hi,
> 
> I am trying to run ceph-deploy on a very basic 1 node configuration.
> But this is causing an exception that is confusing me,
> 
> ceph-deploy mon create cbcbobj00.umiacs.umd.edu
> ceph-mon: mon.noname-a 192.168.7.235:6789/0 is local, renaming to
> mon.cbcbobj00
> ceph-mon: set fsid to a602d0c8-5c6e-442c-bad4-5d9801924a60
> ceph-mon: created monfs at /var/lib/ceph/mon/ceph-cbcbobj00 for
> mon.cbcbobj00
> Traceback (most recent call last):
>  File "/usr/bin/ceph-deploy", line 21, in 
>main()
>  File "/usr/lib/python2.6/site-packages/ceph_deploy/cli.py", line 112,
> in main
>return args.func(args)
>  File "/usr/lib/python2.6/site-packages/ceph_deploy/mon.py", line 234,
> in mon
>mon_create(args)
>  File "/usr/lib/python2.6/site-packages/ceph_deploy/mon.py", line 138,
> in mon_create
>init=init,
>  File "/usr/lib/python2.6/site-packages/pushy/protocol/proxy.py", line
> 255, in 
>(conn.operator(type_, self, args, kwargs))
>  File "/usr/lib/python2.6/site-packages/pushy/protocol/connection.py",
> line 66, in operator
>return self.send_request(type_, (object, args, kwargs))
>  File
> "/usr/lib/python2.6/site-packages/pushy/protocol/baseconnection.py",
> line 323, in send_request
>return self.__handle(m)
>  File
> "/usr/lib/python2.6/site-packages/pushy/protocol/baseconnection.py",
> line 639, in __handle
>raise e
> pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory
> 
> Thanks,
> derek
> 
> 
> -- 
> ---
> Derek T. Yarnell
> University of Maryland
> Institute for Advanced Computer Studies
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Need help with Ceph error

2013-06-17 Thread Sreejith Keeriyattil
Hi
==
root@xtream:~# service ceph start
=== mds.a ===
Starting Ceph mds.a on xtream...already running
=== osd.0 ===
Mounting xfs on xtream:/var/lib/ceph/osd/ceph-0
2013-06-18 04:26:16.373075 7f30ffd2d700  0 -- :/27212 >> 10.16.23.44:6789/0 
pipe(0x1136490 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-06-18 04:26:19.373812 7f3106459700  0 -- :/27212 >> 10.16.23.44:6789/0 
pipe(0x7f30f4000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-06-18 04:26:22.374725 7f30ffd2d700  0 -- :/27212 >> 10.16.23.44:6789/0 
pipe(0x7f30f4003010 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-06-18 04:26:25.375248 7f3106459700  0 -- :/27212 >> 10.16.23.44:6789/0 
pipe(0x7f30f4003ad0 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
^C
root@xtream:~# ceph -s
2013-06-18 04:27:24.288603 7f0ead78e700  0 -- :/27292 >> 10.16.23.44:6789/0 
pipe(0x2df7460 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-06-18 04:27:27.289759 7f0eb3eba700  0 -- :/27292 >> 10.16.23.44:6789/0 
pipe(0x7f0ea4000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
^C
=

I did some research and tweaks and now am running into this error .Is there any 
command to check connectivity  to monitor?.

Thanks and Regards
Sreejith KJ

-Original Message-
From: Gregory Farnum [mailto:g...@inktank.com]
Sent: Monday, June 17, 2013 11:38 PM
To: Sreejith Keeriyattil
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] Need help with Ceph error

Yep, you can't connect to your monitors so nothing else is going to work 
either. There's a wealth of conversations about debugging monitor connection 
issues in the mailing list and irc archives (and I think some in the docs), but 
as a quick start list:
1) make sure the monitor processes are actually running in top
2) connect to them using the admin socket and see what state they think they're 
in
3) see if you can connect to them from their host instead of a different one 
(different keys might be present).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sun, Jun 16, 2013 at 11:48 PM, Sreejith Keeriyattil 
 wrote:
> Hi
> The issue it hangs when I type any ceph commands.. :(
> ===
> root@xtream:~# ceph -s
> ^C
> root@xtream:~# service ceph start
> === mds.a ===
> Starting Ceph mds.a on xtream...already running === osd.0 === Mounting
> xfs on xtream:/var/lib/ceph/osd/ceph-0 ^C root@xtream:~#
>
> ==
> Thanks and regards
> Sreejith KJ
>
> -Original Message-
> From: Gregory Farnum [mailto:g...@inktank.com]
> Sent: Friday, June 14, 2013 9:07 PM
> To: Sreejith Keeriyattil
> Cc: ceph-us...@ceph.com
> Subject: Re: [ceph-users] Need help with Ceph error
>
> On Fri, Jun 14, 2013 at 12:20 AM, Sreejith Keeriyattil 
>  wrote:
>> Hi
>>
>> To keep it simple I disabled cephx authentication but after that am
>> getting the below error.
>>
>> =
>> =
>> ==
>>
>> root@xtream:/etc/ceph# service ceph  -v start
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mon.a "user"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "user"
>>
>> === mds.a ===
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "pid file"
>>
>> --- xtream# mkdir -p /var/run/ceph
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "log dir"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n mds.a "auto start"
>>
>> --- xtream# [ -e /var/run/ceph/mds.a.pid ] || exit 1   # no pid, presumably
>> not
>>
>> pid=`cat /var/run/ceph/mds.a.pid`
>>
>> [ -e /proc/$pid ] && grep -q ceph-mds /proc/$pid/cmdline &&
>> grep -qwe -i
>>
>> exit 1  # pid is something else
>>
>> Starting Ceph mds.a on xtream...already running
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "user"
>>
>> === osd.0 ===
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "pid file"
>>
>> --- xtream# mkdir -p /var/run/ceph
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "log dir"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "auto start"
>>
>> --- xtream# [ -e /var/run/ceph/osd.0.pid ] || exit 1   # no pid, presumably
>> not
>>
>> pid=`cat /var/run/ceph/osd.0.pid`
>>
>> [ -e /proc/$pid ] && grep -q ceph-osd /proc/$pid/cmdline &&
>> grep -qwe -i
>>
>> exit 1  # pid is something else
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "copy executable to"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "osd data"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "fs path"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "devs"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "lock file"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "admin socket"
>>
>> /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 "max open files"
>>
>> /usr/bin/ceph-conf -