Re: [ceph-users] Backport rbd.ko to 2.6.32 Linux Kernel

2014-03-31 Thread Ирек Фасихов
Backporting would require changes at the kernel API/ABI level. Might it be better to use KVM virtualization with Rados Block Device (RBD) support? 2014-04-01 10:10 GMT+04:00 Vilobh Meshram : > But if needed to be done how easy/hard it is ? Or we need to take into > consideration anything before back portin

Re: [ceph-users] Backport rbd.ko to 2.6.32 Linux Kernel

2014-03-31 Thread Ирек Фасихов
There is no backport for 2.6.32, and none is planned. 2014-04-01 9:19 GMT+04:00 Vilobh Meshram : > What is the procedure to back port rbd.ko to 2.6.32 Linux Kernel ? > > Thanks, > Vilobh > > ___ > ceph-users mailing list > ceph-users@lists.ceph.c

[ceph-users] Backport rbd.ko to 2.6.32 Linux Kernel

2014-03-31 Thread Vilobh Meshram
What is the procedure to back port rbd.ko to 2.6.32 Linux Kernel ? Thanks, Vilobh ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] Removal of object from erasure coded pool does not free up space

2014-03-31 Thread Mark Kirkwood
I'm taking a look at erasure coded pools in (ceph 0.78-336-gb9e29ca). I'm doing a simple test where I use 'rados put' to load a 1G file into an erasure coded pool, and then 'rados rm' to remove it later. Checking with 'rados df' shows no objects in the pool and no KB, but the object space is s

Re: [ceph-users] EC pool errors with some k/m combinations

2014-03-31 Thread Michael Nelson
On Mon, 31 Mar 2014, Michael Nelson wrote: Hi Loic, On Sun, 30 Mar 2014, Loic Dachary wrote: Hi Michael, I'm trying to reproduce the problem from sources (today's instead of yesterday's but there is no difference that could explain the behaviour you have): cd src rm -fr /tmp/dev /tmp/o

Re: [ceph-users] RBD snapshots aware of CRUSH map?

2014-03-31 Thread Josh Durgin
On 03/31/2014 03:03 PM, Brendan Moloney wrote: Hi, I was wondering if RBD snapshots use the CRUSH map to distribute snapshot data and live data on different failure domains? If not, would it be feasible in the future? Currently rbd snapshots and live objects are stored in the same place, since

[ceph-users] RBD snapshots aware of CRUSH map?

2014-03-31 Thread Brendan Moloney
Hi, I was wondering if RBD snapshots use the CRUSH map to distribute snapshot data and live data on different failure domains? If not, would it be feasible in the future? Thanks, Brendan

Re: [ceph-users] OSD Restarts cause excessively high load average and "requests are blocked > 32 sec"

2014-03-31 Thread Quenten Grasso
Thanks Greg, Looking forward to the new release! Regards, Quenten Grasso -Original Message- From: Gregory Farnum [mailto:g...@inktank.com] Sent: Tuesday, 1 April 2014 3:08 AM To: Quenten Grasso Cc: Kyle Bader; ceph-users@lists.ceph.com Subject: Re: [ceph-users] OSD Restarts cause excess

[ceph-users] RBD does not load at boot

2014-03-31 Thread Dan Koren
Even though it is included in /etc/rc.modules and initramfs has been updated. Suggestions much appreciated. MTIA, dk

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-31 Thread Gregory Farnum
Yes, Zheng's fix for the MDS crash is in current mainline and will be in the next Firefly RC release. Sage, is there something else we can/should be doing when a client goes to sleep that we aren't already? (ie, flushing out all dirty data or something and disconnecting?) -Greg Software Engineer #

Re: [ceph-users] OSDs crashing frequently

2014-03-31 Thread Aaron Ten Clay
Well that was quick! osd.0 crashed already, here's the log (~20 MiB): http://www.aarontc.com/logs/ceph-osd.0.log.bz2 I updated the bug report as well. Thanks, -Aaron On Mon, Mar 31, 2014 at 2:16 PM, Aaron Ten Clay wrote: > Greg, > > I'm in the process of doing so now. joshd asked for "debug

Re: [ceph-users] Mon hangs when started after Emperor upgrade

2014-03-31 Thread Sage Weil
On Mon, 31 Mar 2014, Jens Kristian Søgaard wrote: > Hi, > > > Perhaps as a workaround you should just wipe this mon's data dir and > > remake it? > > That's a possibility of course! > > I really would like to know if there's something to gain from using > leveldb from ceph-extras instead of the d

Re: [ceph-users] OSDs crashing frequently

2014-03-31 Thread Aaron Ten Clay
Greg, I'm in the process of doing so now. joshd asked for "debug filestore = 20" as well, and I just restarted an OSD with those changes. As soon as it crashes again, I'll post the log file. joshd also had me open a bug: http://tracker.ceph.com/issues/7922 Thanks, -Aaron On Mon, Mar 31, 2014 a

Re: [ceph-users] OSDs crashing frequently

2014-03-31 Thread Gregory Farnum
Can you reproduce this with "debug osd = 20" and "debug ms = 1" set on the OSD? I think we'll need that data to track down what exactly has gone wrong here. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Mar 31, 2014 at 1:22 PM, Aaron Ten Clay wrote: > Hello fellow Ce
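The debug levels requested in this thread ("debug osd = 20" and "debug ms = 1", plus the "debug filestore = 20" that joshd asked for) are normally set in ceph.conf before restarting the OSD. A minimal sketch of producing such a fragment follows; placing the options under an [osd] section is an assumption here, and the helper name debug_osd_conf is hypothetical.

```python
import configparser
import io


def debug_osd_conf(settings):
    """Render debug settings as an [osd] section in ceph.conf style.

    The section name "osd" is an assumption; the option names and
    levels passed in below are the ones quoted in the thread.
    """
    parser = configparser.ConfigParser()
    parser["osd"] = {name: str(level) for name, level in settings.items()}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


# Levels requested by Greg and joshd in this thread.
requested = {"debug osd": 20, "debug ms": 1, "debug filestore": 20}
fragment = debug_osd_conf(requested)
print(fragment)
```

The printed fragment can be merged into the OSD host's existing ceph.conf by hand; the sketch does not touch any file on disk.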

Re: [ceph-users] OSD mystery

2014-03-31 Thread Dan Koren
Thanks for the prompt reply. The OSDs are set up on dedicated devices, and the mappings are in /etc/fstab. mount shows: /dev/rssda on /var/lib/ceph/osd/ceph-0 type xfs (rw) and similar on all other nodes. Thx, dk On Mon, Mar 31, 2014 at 1:12 PM, Gregory Farnum wrote: > Well, you killed them a

[ceph-users] OSDs crashing frequently

2014-03-31 Thread Aaron Ten Clay
Hello fellow Cephers! Recently, before and after the update from 0.77 to 0.78, about half the OSDs in my cluster crash quite frequently with 'osd/PG.cc: 5255: FAILED assert(0 == "we got a bad state machine event")' I'm not sure if this is a bug (there are some similar-sounding reports in Redmine

Re: [ceph-users] OSD mystery

2014-03-31 Thread Gregory Farnum
Well, you killed them as part of the reboot...they should have restarted automatically when the system turned on, but that will depend on your configuration and how they were set up. (Eg, if they are each getting a dedicated hard drive, make sure the system knows the drive is present.) What version

Re: [ceph-users] OSD mystery

2014-03-31 Thread Dan Koren
Hi Greg, Thanks for the prompt response. Sure enough, I do see all the OSDs are now down. However, I do not understand the meaning of the sentence about killing the OSDs. This was an OS level reboot of the entire cluster, not issuing any ceph commands either before or after the restart. Doesn't Cep

Re: [ceph-users] OSD mystery

2014-03-31 Thread Gregory Farnum
If you wait longer, you should see the remaining OSDs get marked down. We detect down OSDs in two ways: 1) OSDs heartbeat each other frequently and issue reports when the heartbeat responses take too long. (This is the main way.) 2) OSDs periodically send statistics to the monitors, and if these st

[ceph-users] OSD mystery

2014-03-31 Thread Dan Koren
On a 4 node cluster (admin + 3 mon/osd nodes) I see the following shortly after rebooting the cluster and waiting for a couple of minutes: root@rts23:~# ps -ef | grep ceph && ceph osd tree root 4183 1 0 12:09 ? 00:00:00 /usr/bin/ceph-mon --cluster=ceph -i rts23 -f root 577

Re: [ceph-users] RDO - CEPH

2014-03-31 Thread Vilobh Meshram
Which logs should I look at to find out more about what might be going wrong here? Thanks, Vilobh On 3/30/14, 3:00 PM, "Vilobh Meshram" wrote: >Hi Loic, > >Thanks for your reply. >Not really I have setup 3 nodes as storage nodes "ceph osd tree" output >also confirms that. > >I was more concerned abo

Re: [ceph-users] Ceph: Error librbd to create a clone

2014-03-31 Thread Jean-Charles Lopez
Hi Patrick When you call the clone method, add an extra argument specifying the features to be used for the clone: 1 for layering only 3 for layering and striping Adapt the value to your requirements Then it should work. JC On Monday, March 31, 2014, COPINE Patrick wrote: > Hi, > > Now, I ha
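The feature values JC mentions (1 for layering only, 3 for layering and striping) are a bitmask. A minimal sketch of building that value and passing it to the Python bindings' clone call is shown below. The constants mirror rbd.RBD_FEATURE_LAYERING and rbd.RBD_FEATURE_STRIPINGV2 but are defined locally so the snippet runs without a cluster; the helper names and the 'parent', 'snap' and 'child' image names are hypothetical, not taken from Patrick's program.

```python
# Feature bits as used by librbd; the Python bindings expose the same
# values as rbd.RBD_FEATURE_LAYERING and rbd.RBD_FEATURE_STRIPINGV2.
RBD_FEATURE_LAYERING = 1
RBD_FEATURE_STRIPINGV2 = 2


def clone_features(layering=True, striping=False):
    """Combine the requested feature bits into the value passed to clone()."""
    features = 0
    if layering:
        features |= RBD_FEATURE_LAYERING
    if striping:
        features |= RBD_FEATURE_STRIPINGV2
    return features


def clone_with_features(rbd_inst, p_ioctx, c_ioctx, features):
    """Hypothetical wrapper: clone snapshot 'snap' of image 'parent' to 'child'.

    rbd_inst is expected to be an rbd.RBD() instance and the ioctx
    arguments open rados I/O contexts; the image names are placeholders.
    """
    rbd_inst.clone(p_ioctx, 'parent', 'snap', c_ioctx, 'child',
                   features=features)


print(clone_features())               # layering only
print(clone_features(striping=True))  # layering and striping
```

Only clone_features runs standalone; clone_with_features needs a live cluster and a protected parent snapshot.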

Re: [ceph-users] Mon hangs when started after Emperor upgrade

2014-03-31 Thread Jens Kristian Søgaard
Hi, Perhaps as a workaround you should just wipe this mon's data dir and remake it? That's a possibility of course! I really would like to know if there's something to gain from using leveldb from ceph-extras instead of the distribution? -- Jens Kristian Søgaard, Mermaid Consulting ApS, j...@

Re: [ceph-users] Mon hangs when started after Emperor upgrade

2014-03-31 Thread Dan Van Der Ster
Perhaps as a workaround you should just wipe this mon's data dir and remake it? In the past when I upgraded our mons from spinning disks to SSDs, I went through a procedure to remake each mon from scratch (wiping and resyncing each mon's leveldb one at a time). I did something like this: servic

Re: [ceph-users] Mon hangs when started after Emperor upgrade

2014-03-31 Thread Jens Kristian Søgaard
Hi Gregory, Is the mon process doing anything (that is, does it have any CPU usage)? This looks to be an internal leveldb issue, but not one that we've run into before, so I think there must be something unique about the leveldb store involved. No, it is not doing anything at all. I'm not sur

Re: [ceph-users] Security Hole?

2014-03-31 Thread Dan Van Der Ster
Hi, I can't reproduce that with a dumpling cluster: # cat ceph.client.dpm.keyring [client.dpm] key = xxx caps mon = "allow r" caps osd = "allow x, allow rwx pool=dpm" # ceph health --id dpm HEALTH_OK # ceph auth list --id dpm Error EACCES: access denied Cheers, Dan _

Re: [ceph-users] How do I know which object takes storage space?

2014-03-31 Thread Gregory Farnum
Not directly. However, that "used" total is compiled by summing up the output of "df" from each individual OSD disk. There's going to be some space used up by the local filesystem metadata, by RADOS metadata like OSD maps, and (depending on configuration) your journal files. 2350MB / 48 OSDs = ~49M
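Greg's per-OSD estimate is the reported "used" total divided evenly across the OSDs. As a quick check of that arithmetic, using only the two figures from the thread (2350 MB and 48 OSDs); the variable names are illustrative:

```python
# Figures quoted in the thread.
used_mb = 2350   # cluster-wide "used" total, in MB
num_osds = 48    # number of OSDs in the cluster

# Average overhead per OSD disk, assuming it is spread evenly.
per_osd_mb = used_mb / num_osds
print(f"{per_osd_mb:.1f} MB per OSD")
```

An even split is a simplification; the actual overhead per disk depends on filesystem metadata, OSD maps and journal placement, as the message notes.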

Re: [ceph-users] Mon hangs when started after Emperor upgrade

2014-03-31 Thread Gregory Farnum
Is the mon process doing anything (that is, does it have any CPU usage)? This looks to be an internal leveldb issue, but not one that we've run into before, so I think there must be something unique about the leveldb store involved. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com

Re: [ceph-users] Security Hole?

2014-03-31 Thread Gregory Farnum
Hmm, this might be considered a bit of a design oversight. Looking at the auth keys is a read operation, and the client has read permissions... You might want to explore the more fine-grained command-based monitor permissions as a workaround, but I've created a ticket to try and close that read per

Re: [ceph-users] cephx key for CephFS access only

2014-03-31 Thread Gregory Farnum
At present, the only security permission on the MDS is "allowed to do stuff", so "rwx" and "*" are synonymous. In general "*" means "is an admin", though, so you'll be happier in the future if you use "rwx". You may also want a more restrictive set of monitor capabilities as somebody else recently

Re: [ceph-users] Problem with object size in rados bench

2014-03-31 Thread Gregory Farnum
That's not an expected error when running this test; have you validated that your cluster is working in any other ways? Eg, what's the output of "ceph -s". -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Mar 28, 2014 at 5:53 AM, wrote: > Hi, > > When I try "rados -p

Re: [ceph-users] Backward compatibility of librados in Firefly

2014-03-31 Thread Gregory Farnum
I believe it's just that there was an issue for a while where the return codes were incorrectly not being filled in, and now they are. So the prval params you pass in when constructing the compound ops are where the values will be set. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.

Re: [ceph-users] OSD Restarts cause excessively high load average and "requests are blocked > 32 sec"

2014-03-31 Thread Gregory Farnum
Yep, that looks like http://tracker.ceph.com/issues/7093, which is fixed in dumpling and most of the dev releases since emperor. ;) I also cherry-picked the fix to the emperor branch and it will be included whenever we do another point release of that. -Greg Software Engineer #42 @ http://inktank.c

Re: [ceph-users] MDS debugging

2014-03-31 Thread Kenneth Waegeman
- Message from Martin B Nielsen - Date: Mon, 31 Mar 2014 15:55:24 +0200 From: Martin B Nielsen Subject: Re: [ceph-users] MDS debugging To: Kenneth Waegeman Cc: ceph-users Hi, I can see you're running mon, mds and osd on the same server. That's true, we have 3

Re: [ceph-users] MDS debugging

2014-03-31 Thread Kenneth Waegeman
- Message from "Yan, Zheng" - Date: Mon, 31 Mar 2014 21:09:20 +0800 From: "Yan, Zheng" Subject: Re: [ceph-users] MDS debugging To: Kenneth Waegeman Cc: ceph-users On Mon, Mar 31, 2014 at 7:49 PM, Kenneth Waegeman wrote: Hi all, Before the weekend we started s

Re: [ceph-users] MDS debugging

2014-03-31 Thread Martin B Nielsen
Hi, I can see you're running mon, mds and osd on the same server. Also, from a quick glance you're using around 13GB resident memory. If you only have 16GB in your system I'm guessing you'll be swapping about now (or close). How much mem does the system hold? Also, how busy are the disks? Or is

Re: [ceph-users] MDS debugging

2014-03-31 Thread Yan, Zheng
On Mon, Mar 31, 2014 at 7:49 PM, Kenneth Waegeman wrote: > Hi all, > > Before the weekend we started some copying tests over ceph-fuse. Initially, > this went ok. But then the performance started dropping gradually. Things > are going very slow now: What does the copying test look like? Regards Yan,

Re: [ceph-users] How do I know which object takes storage space?

2014-03-31 Thread Hell
> On 31 March 2014, at 12:01, Jianing Yang wrote: > > > Hi, all > > I've deleted everything image in my only pool but still have 2350 MB > data remains. Is there a command that help get which files/objects are > still in use? > "rados -p ls | xargs -n 1 rados -p stat" will give you th

[ceph-users] MDS debugging

2014-03-31 Thread Kenneth Waegeman
Hi all, Before the weekend we started some copying tests over ceph-fuse. Initially, this went ok. But then the performance started dropping gradually. Things are going very slow now: 2014-03-31 13:36:37.047423 mon.0 [INF] pgmap v265871: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB

[ceph-users] How do I know which object takes storage space?

2014-03-31 Thread Jianing Yang
Hi, all I've deleted every image in my only pool, but 2350 MB of data still remains. Is there a command that helps find which files/objects are still in use? , | | ceph -w | cluster 33064485-f73e-4db2-b9d6-8f4463334619 | health HEALTH_OK | monmap e1: 3 mons at {a=10.86.32.

Re: [ceph-users] Ceph: Error librbd to create a clone

2014-03-31 Thread COPINE Patrick
Hi, Now, I have an exception. Before starting the Python program, I made the following actions. The exception is : > root@ceph-clt:~/src# python v4.py > > Traceback (most recent call last): > > File "v4.py", line 22, in > > rbd_inst.clone(p_ioctx, 'foo3', 's1-foo3', c_ioctx, 'c1-foo3