[ceph-users] osd down question

2014-11-03 Thread ??
Hello, I have been running ceph v0.87 for one week. This week many OSDs were marked down, but when I run "ps -ef | grep osd" I can still see the osd processes, so the OSDs are not really down. I then checked the osd logs and see many lines like "osd.XX from dead osd.YY, marking down". Does 0.87 check other osd processes?
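
A minimal sketch of commands for investigating peer-reported OSD failures like this (osd.12 and the grace value of 40 are placeholders, and log paths assume a default install):

    ceph osd tree | grep down                 # which OSDs the cluster considers down
    ceph health detail                        # shows which peers reported the failures
    grep -i heartbeat /var/log/ceph/ceph-osd.12.log | tail -20
    # Heartbeat grace can be raised temporarily while debugging a flaky network:
    ceph tell osd.* injectargs '--osd-heartbeat-grace 40'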

Re: [ceph-users] giant release osd down

2014-11-03 Thread Mark Kirkwood
On 04/11/14 03:02, Sage Weil wrote: On Mon, 3 Nov 2014, Mark Kirkwood wrote: Ah, I missed that thread. Sounds like three separate bugs: - pool defaults not used for initial pools - osd_mkfs_type not respected by ceph-disk - osd_* settings not working The last one is a real shock; I would ex

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Samuel Just
If you have osds that are close to full, you may be hitting bug #9626. I pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626. -Sam On Mon, Nov 3, 2014 at 2:09 PM, Chad Seys wrote: >> >> No, it is a change, I just want to make sure I understand the >> scenario. So you're reducing CRUSH wei
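
For readers wanting to check whether their own OSDs are near full, a sketch for firefly (ceph osd df does not exist until hammer, so per-OSD usage comes from the pg dump; column layout varies by release):

    ceph health detail | grep -i full     # lists nearfull/full OSDs, if any
    ceph pg dump osds                     # per-OSD kb_used / kb_avail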

Re: [ceph-users] cephfs survey results

2014-11-03 Thread Blair Bethwaite
On 4 November 2014 01:50, Sage Weil wrote: > In the Ceph session at the OpenStack summit someone asked what the CephFS > survey results looked like. Thanks Sage, that was me! > Here's the link: > > https://www.surveymonkey.com/results/SM-L5JV7WXL/ > > In short, people want > > fsck > mu

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
> > No, it is a change, I just want to make sure I understand the > scenario. So you're reducing CRUSH weights on full OSDs, and then > *other* OSDs are crashing on these bad state machine events? That is right. The other OSDs shut down sometime later. (Not immediately.) I really haven't tested

Re: [ceph-users] Ceph Giant not fixed ReplicatedPG:NotTrimming?

2014-11-03 Thread Samuel Just
Can you reproduce with debug osd = 20, debug filestore = 20, debug ms = 1 in the [osd] section of that osd's ceph.conf? -Sam On Sun, Nov 2, 2014 at 9:10 PM, Ta Ba Tuan wrote: > Hi Sage, Samuel & All, > > I upgraded to GIANT, but those errors still appear: > I'm trying to delete the related obj
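
A minimal sketch of the settings being requested, appended to ceph.conf on the affected node (if the file already has an [osd] section, add the keys there instead, then restart the osd):

    cat >> /etc/ceph/ceph.conf <<'EOF'
    [osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
    EOF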

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 12:28 PM, Chad Seys wrote: > On Monday, November 03, 2014 13:50:05 you wrote: >> On Mon, Nov 3, 2014 at 11:41 AM, Chad Seys wrote: >> > On Monday, November 03, 2014 13:22:47 you wrote: >> >> Okay, assuming this is semi-predictable, can you start up one of the >> >> OSDs tha

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
On Monday, November 03, 2014 13:50:05 you wrote: > On Mon, Nov 3, 2014 at 11:41 AM, Chad Seys wrote: > > On Monday, November 03, 2014 13:22:47 you wrote: > >> Okay, assuming this is semi-predictable, can you start up one of the > >> OSDs that is going to fail with "debug osd = 20", "debug filestor

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 11:41 AM, Chad Seys wrote: > On Monday, November 03, 2014 13:22:47 you wrote: >> Okay, assuming this is semi-predictable, can you start up one of the >> OSDs that is going to fail with "debug osd = 20", "debug filestore = >> 20", and "debug ms = 1" in the config file and the

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
On Monday, November 03, 2014 13:22:47 you wrote: > Okay, assuming this is semi-predictable, can you start up one of the > OSDs that is going to fail with "debug osd = 20", "debug filestore = > 20", and "debug ms = 1" in the config file and then put the OSD log > somewhere accessible after it's cras

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
Okay, assuming this is semi-predictable, can you start up one of the OSDs that is going to fail with "debug osd = 20", "debug filestore = 20", and "debug ms = 1" in the config file and then put the OSD log somewhere accessible after it's crashed? Can you also verify that all of your monitors are r

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
> There's a "ceph osd metadata" command, but I don't recall if it's in > Firefly or only Giant. :) It's in firefly. Thanks, very handy. All the OSDs are running 0.80.7 at the moment. What next? Thanks again, Chad.
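
A sketch of using the metadata command to confirm versions across all OSDs (osd IDs come from ceph osd ls; osd 0 is just an example):

    ceph osd metadata 0 | grep ceph_version   # one OSD
    for i in $(ceph osd ls); do
        ceph osd metadata $i | grep ceph_version
    done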

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
[ Re-adding the list. ] On Mon, Nov 3, 2014 at 10:49 AM, Chad Seys wrote: > >> > Next I executed >> > >> > 'ceph osd crush tunables optimal' >> > >> > to upgrade CRUSH mapping. >> >> Okay...you know that's a data movement command, right? > > Yes. > >> So you should expect it to impact operati

Re: [ceph-users] Swift + radosgw: How do I find accounts/containers/objects limitation?

2014-11-03 Thread Yehuda Sadeh
On Mon, Nov 3, 2014 at 9:37 AM, Narendra Trivedi (natrived) wrote: > Thanks. I think the limit is 100 by default and it can be disabled. As far > as I understand, there is no object limit on the radosgw side of things, only > from the Swift end (i.e. 5GB) ... right? In short, if someone tries to upload a >

Re: [ceph-users] giant release osd down

2014-11-03 Thread Shiv Raj Singh
Thanks for the comments, guys. I'm going to deploy it from scratch and this time I'll capture every piece of debug information. Hopefully this will give me the reasons why. Thanks. On Mon, Nov 3, 2014 at 7:01 PM, Ian Colle wrote: > Christian, > > Why are you not fond of ceph-deploy? > > Ian

Re: [ceph-users] 0.87 rados df fault

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 4:40 AM, Thomas Lemarchand wrote: > Update : > > /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746084] > [21787] 0 21780 492110 185044 920 240143 0 > ceph-mon > /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746115] > [131

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 7:46 AM, Chad Seys wrote: > Hi All, >I upgraded from emperor to firefly. Initial upgrade went smoothly and all > placement groups were active+clean . > Next I executed > 'ceph osd crush tunables optimal' > to upgrade CRUSH mapping. Okay...you know that's a data mov

Re: [ceph-users] Swift + radosgw: How do I find accounts/containers/objects limitation?

2014-11-03 Thread Narendra Trivedi (natrived)
Thanks. I think the limit is 100 by default and it can be disabled. As far as I understand, there is no object limit on the radosgw side of things, only from the Swift end (i.e. 5GB), right? In short, if someone tries to upload a 1TB object onto Swift + RadosGW, it has to be truncated at the Swif
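
A sketch of the usual way past the per-object limit, a segmented upload with python-swiftclient (the container and file names, and the 1 GiB segment size, are only examples):

    swift upload mycontainer huge-image.raw --segment-size 1073741824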

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
P.S. The OSDs interacted with some 3.14 krbd clients before I realized that kernel version was too old for the firefly CRUSH map. Chad.

[ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-03 Thread Chad Seys
Hi All, I upgraded from emperor to firefly. The initial upgrade went smoothly and all placement groups were active+clean. Next I executed 'ceph osd crush tunables optimal' to upgrade the CRUSH mapping. Now I keep having OSDs go down or requests blocked for long periods of time. I start
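
A sketch for inspecting the tunables and, if the cluster cannot tolerate the resulting rebalancing, rolling them back (profile names follow the firefly documentation):

    ceph osd crush show-tunables       # current tunables
    ceph osd crush tunables legacy     # revert to the old defaults
    ceph osd crush tunables firefly    # or pin to the firefly profile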

Re: [ceph-users] giant release osd down

2014-11-03 Thread Christian Balzer
On Mon, 3 Nov 2014 06:02:08 -0800 (PST) Sage Weil wrote: > On Mon, 3 Nov 2014, Mark Kirkwood wrote: > > On 03/11/14 14:56, Christian Balzer wrote: > > > On Sun, 2 Nov 2014 14:07:23 -0800 (PST) Sage Weil wrote: > > > > > > > On Mon, 3 Nov 2014, Christian Balzer wrote: > > > > > c) But wait, you sp

[ceph-users] cephfs survey results

2014-11-03 Thread Sage Weil
In the Ceph session at the OpenStack summit someone asked what the CephFS survey results looked like. Here's the link: https://www.surveymonkey.com/results/SM-L5JV7WXL/ In short, people want: fsck, multimds, snapshots, quotas. sage

Re: [ceph-users] giant release osd down

2014-11-03 Thread Sage Weil
On Mon, 3 Nov 2014, Mark Kirkwood wrote: > On 03/11/14 14:56, Christian Balzer wrote: > > On Sun, 2 Nov 2014 14:07:23 -0800 (PST) Sage Weil wrote: > > > > > On Mon, 3 Nov 2014, Christian Balzer wrote: > > > > c) But wait, you specified a pool size of 2 in your OSD section! Tough > > > > luck, beca
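
A sketch of where the pool-size defaults are commonly placed so that new pools actually pick them up; per the behaviour described in this thread, keys in an [osd]-only section do not take effect (the min size value is only an example):

    cat >> /etc/ceph/ceph.conf <<'EOF'
    [global]
    osd pool default size = 2
    osd pool default min size = 1
    EOF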

Re: [ceph-users] where to download 0.87 RPMS?

2014-11-03 Thread Tim Serong
On 11/01/2014 05:10 AM, Patrick McGarry wrote: > As I understand it SUSE does their own builds of things. Just on a > cursory examination it looks like the following repo uses Firefly: > > https://susestudio.com/a/HVbCUu/master-ceph This is Jan Kalcic's ceph appliance, using packages from: http:

Re: [ceph-users] 0.87 rados df fault

2014-11-03 Thread Thomas Lemarchand
Update: /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746084] [21787] 0 21780 492110 185044 920 240143 0 ceph-mon /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746115] [13136] 0 13136 52172 1753 590 0 ceph
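
A sketch for confirming the mon was killed by the OOM killer and for watching its current footprint (log path as in the messages above):

    grep -iE 'out of memory|killed process' /var/log/kern.log.1
    ps -o pid,rss,vsz,cmd -C ceph-mon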

Re: [ceph-users] question about activate OSD

2014-11-03 Thread German Anders
Hi Udo, I tried that also but it failed. Here are the steps that I took; the strange thing is that when I run the "prepare" command, it finishes ok, but if I take a look into the log files, I find this also: ceph@cephbkdeploy01:~/desp-bkp-cluster$ ceph-deploy --overwrite-conf osd prepare ceph

Re: [ceph-users] 0.87 rados df fault

2014-11-03 Thread Thomas Lemarchand
Update: this error is linked to a crashed mon. It crashed during the weekend; I am trying to understand why. I never had a mon crash before Giant. -- Thomas Lemarchand Cloud Solutions SAS - Information Systems Manager On Mon., 2014-11-03 at 11:08 +0100, Thomas Lemarchand wrote: > Hello a

[ceph-users] Fwd: Error creating monitors

2014-11-03 Thread Sakhi Hadebe
Can someone please help out? I am stuck. Regards, Sakhi Hadebe Engineer: South African National Research Network (SANReN) Competency Area, Meraka, CSIR Tel: +27 12 841 2308 Fax: +27 12 841 4223 Cell: +27 71 331 9622 Email: shad...@csir.co.za >>> Sakhi Hadebe 10/31/2014 1:28 PM >>> Hi

[ceph-users] 0.87 rados df fault

2014-11-03 Thread Thomas Lemarchand
Hello all, I upgraded my cluster to Giant. Everything is working well, but on one mon I get a strange error when I do "rados df": root@a-mon:~# rados df 2014-11-03 10:57:15.313618 7ff2434f0700 0 -- :/1009400 >> 10.94.67.202:6789/0 pipe(0xe37890 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0xe37b20).fault pool
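
A sketch for narrowing the fault down to one monitor (10.94.67.202 is the address from the error above):

    ceph quorum_status --format json-pretty    # which mons are in quorum
    rados -m 10.94.67.202:6789 df              # talk to the suspect mon directly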

Re: [ceph-users] rhel7 krbd backported module repo ?

2014-11-03 Thread Alexandre DERUMIER
>>http://gitbuilder.ceph.com/kmod-rpm-rhel7beta-x86_64-basic/ref/rhel7/x86_64/ >> >> >>But that hasn't been updated since July. Great! Thanks! (I think it's built from https://github.com/ceph/ceph-client/tree/rhel7 ?) - Original Message - From: "Dan van der Ster" To: "Alexandre DERUMI

Re: [ceph-users] rhel7 krbd backported module repo ?

2014-11-03 Thread Alexandre DERUMIER
>>Not that I know of. krbd *fixes* are getting backported to stable >>kernels regularly though. Thanks. (I was thinking more about support for new features, like the discard support coming in 3.18, for example.) - Original Message - From: "Ilya Dryomov" To: "Alexandre DERUMIER" Cc: "ceph-users"

Re: [ceph-users] rhel7 krbd backported module repo ?

2014-11-03 Thread Dan van der Ster
There's this one: http://gitbuilder.ceph.com/kmod-rpm-rhel7beta-x86_64-basic/ref/rhel7/x86_64/ But that hasn't been updated since July. Cheers, Dan On Mon Nov 03 2014 at 5:35:23 AM Alexandre DERUMIER wrote: > Hi, > > I would like to known if a repository is available for rhel7/centos7 with >

Re: [ceph-users] giant release osd down

2014-11-03 Thread Christian Balzer
Hello, On Mon, 3 Nov 2014 01:01:32 -0500 (EST) Ian Colle wrote: > Christian, > > Why are you not fond of ceph-deploy? > In short, this very thread. Ceph-deploy hides a number of things from the user that are pretty vital for a working ceph cluster and that are insufficiently documented, or not documented at all

Re: [ceph-users] SSD MTBF

2014-11-03 Thread Emmanuel Lacour
On Mon, Sep 29, 2014 at 10:31:03AM +0200, Emmanuel Lacour wrote: > > Dear ceph users, > > > we have been managing ceph clusters for a year now. Our setup is typically > made of Supermicro servers with OSD sata drives and journal on SSD. > > Those SSDs are all failing one after the other after one ye
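
A sketch for tracking journal-SSD wear with smartmontools; attribute names differ per vendor (Media_Wearout_Indicator is an Intel name, Wear_Leveling_Count a Samsung one, and /dev/sdX is a placeholder):

    smartctl -A /dev/sdX | egrep -i 'wear|written'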

Re: [ceph-users] rhel7 krbd backported module repo ?

2014-11-03 Thread Ilya Dryomov
On Mon, Nov 3, 2014 at 7:35 AM, Alexandre DERUMIER wrote: > Hi, > > I would like to know if a repository is available for rhel7/centos7 with > the latest krbd module backported? > > > I know that such a module is available in the ceph enterprise repos, but is it > available for non-subscribers? Not tha

Re: [ceph-users] ceph version 0.79, rbd flatten report Segmentation fault (core dumped)

2014-11-03 Thread Ilya Dryomov
On Mon, Nov 3, 2014 at 9:31 AM, wrote: > > root@CONTROLLER-4F:~# rbd -p volumes flatten > f3e81ea3-1d5b-487a-a55e-53efff604d54_disk > *** Caught signal (Segmentation fault) ** > in thread 7fe99984f700 > ceph version 0.79 (4c2d73a5095f527c3a2168deb5fa54b3c8991a6e) > 1: (()+0x22a4f) [0x7fe9a1745
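
A sketch for gathering context before retrying the flatten (the image name is the one from the trace above):

    rbd info volumes/f3e81ea3-1d5b-487a-a55e-53efff604d54_disk    # the "parent:" line shows the clone source
    rbd -p volumes flatten f3e81ea3-1d5b-487a-a55e-53efff604d54_disk 2>&1 | tee flatten.log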

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-11-03 Thread Jasper Siero
Hello Greg, I saw that the site hosting the previous link to the logs uses a very short expiry time, so I uploaded it to another one: http://www.mediafire.com/download/gikiy7cqs42cllt/ceph-mds.th1-mon001.log.tar.gz Thanks, Jasper From: gregory.far...@inktan