[ceph-users] locking rbd device

2013-08-26 Thread Wolfgang Hennerbichler
Hi list, I realize there's a command called "rbd lock" to lock an image. Can libvirt use this to prevent virtual machines from being started simultaneously on different virtualisation containers? wogri -- http://www.wogri.at
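
For reference, rbd's advisory locking looks roughly like this (pool, image, and lock names here are illustrative, not from the thread):

    rbd lock add rbd/vm-disk-1 my-lock                   # take an advisory lock
    rbd lock list rbd/vm-disk-1                          # show current lockers
    rbd lock remove rbd/vm-disk-1 my-lock client.4123    # release; locker id comes from 'lock list'

Note these locks are advisory: nothing is enforced unless every client checks the lock before mapping the image or starting the VM.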

[ceph-users] The whole cluster hangs when changing MTU to 9216

2013-08-26 Thread Da Chun Ng
CentOS 6.4, Ceph Cuttlefish 0.61.7 or 0.61.8. I changed the MTU to 9216 (or 9000), then restarted all the cluster nodes. The whole cluster hung, with messages in the mon log as below: 2013-08-26 15:52:43.028554 7fd83f131700 1 mon.ceph0@0(electing).elector(15) init, last seen epoch 15 ...
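
A quick way to verify jumbo frames actually work end to end before restarting anything (a hypothetical check, not from the thread; interface and peer address are examples):

    ip link set dev eth0 mtu 9000
    # 9000-byte MTU minus 20 (IP) and 8 (ICMP) header bytes = 8972 payload
    ping -M do -s 8972 10.0.0.2

If a switch in the path doesn't pass the larger frames, these unfragmentable pings fail while ordinary pings still succeed, which matches a cluster that hangs only after the MTU change.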

Re: [ceph-users] The whole cluster hangs when changing MTU to 9216

2013-08-26 Thread James Harper
> CentOS 6.4, Ceph Cuttlefish 0.61.7, or 0.61.8. > I changed the MTU to 9216 (or 9000), then restarted all the cluster nodes. > The whole cluster hung, with messages in the mon log as below: Does tcpdump report any tcp or ip checksum errors? (tcpdump -v -s0 -i ...)
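
A fuller form of the capture James suggests might be (the interface name and the grep are assumptions):

    # -v validates checksums, -s0 captures whole packets
    tcpdump -v -s0 -i eth0 tcp 2>&1 | grep -i 'cksum.*incorrect'

Be aware that checksum offloading on the capturing NIC can make outgoing packets show up as 'incorrect' even when they are fine on the wire.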

[ceph-users] Problems with keyrings during deployment

2013-08-26 Thread Francesc Alted
Hi, I am a newcomer to Ceph. After having a look at the docs (BTW, it is nice to see its concepts being implemented), I am trying to do some tests, mainly to check the Python APIs to access the RADOS and RBD components. I am following this quick guide: http://ceph.com/docs/next/start/quick-ceph-dep
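
For context, the quick guide being followed reduces to roughly these ceph-deploy steps (hostnames are examples, and the cuttlefish-era guide may differ in detail):

    ceph-deploy new mon1
    ceph-deploy install mon1 osd1 osd2
    ceph-deploy mon create mon1
    ceph-deploy gatherkeys mon1          # keyring problems usually surface here
    ceph-deploy osd create osd1:/dev/sdb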

Re: [ceph-users] How to migrate from a "missing auth" monitor files to a regular one?

2013-08-26 Thread Yu Changyuan
Thank you, after applying 'ceph auth add mon.a' 25 times, the unpatched version works. Here are the detailed steps: 1. stop the cluster (mon, osd and mds), back up the current /var/lib/ceph/mon/ceph-a dir 2. start the patched ceph-mon and ceph-osd (I am not sure whether ceph-osd is necessary) 3. run 'ceph auth add mon.a' 25
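
Condensed into commands (paths as given in the mail; the loop form is my own shorthand):

    # 1. stop mon, osd and mds; back up the monitor data directory
    cp -a /var/lib/ceph/mon/ceph-a /var/lib/ceph/mon/ceph-a.bak
    # 2. start the patched ceph-mon (and ceph-osd), then
    # 3. re-add the mon key 25 times to advance the auth version
    for i in $(seq 25); do ceph auth add mon.a; done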

Re: [ceph-users] How to migrate from a "missing auth" monitor files to a regular one?

2013-08-26 Thread Yu Changyuan
On Sun, Aug 25, 2013 at 10:27 PM, Joao Eduardo Luis wrote: > On 08/25/2013 12:36 PM, Yu Changyuan wrote: >> Today, when I restarted the ceph service, the problem I asked about on the mailing list >> before happened >> again (http://article.gmane.org/gmane.comp.file-systems.ceph.user/2995

[ceph-users] Hardware recommendations

2013-08-26 Thread Shain Miley
Good morning, I am in the process of deciding what hardware we are going to purchase for our new Ceph-based storage cluster. I have been informed that I must submit my purchase needs by the end of this week in order to meet our FY13 budget requirements (which does not leave me much time). We

Re: [ceph-users] ceph-mon listens on wrong interface

2013-08-26 Thread Fuchs, Andreas (SwissTXT)
Hi Sage, thanks for your answer. I had already adjusted ceph.conf: mon_hosts has the list of public IPs of the mon servers, but ceph-mon is listening on eth0 instead of the IP listed in mon_hosts. Also, entering [mon.ceph-ceph01] sections with host= and mon_addr= entries did not change this. Do I
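
For reference, a per-daemon section of the sort being attempted might look like this (hostname and address are illustrative):

    [mon.ceph-ceph01]
        host = ceph-ceph01
        mon addr = 192.168.10.11:6789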

[ceph-users] Some help needed with ceph deployment

2013-08-26 Thread Johannes Klarenbeek
our help, since I'm stuck for the moment. Regards, Johannes

Re: [ceph-users] Some help needed with ceph deployment

2013-08-26 Thread Johannes Klarenbeek

Re: [ceph-users] Some help needed with ceph deployment

2013-08-26 Thread Alfredo Deza
e journaling for my OSD’s but it doesn’t show in my conf > file. Also the journaling partitions are 2GB big and not 1024MB (if that is > what it means then). > > I can really use your help, since I’m stuck for the moment. > > Regards, > > Johannes

[ceph-users] librados pthread_create failure

2013-08-26 Thread Greg Poirier
So, in doing some testing last week, I believe I managed to exhaust the number of threads available to nova-compute. After some investigation, I found the pthread_create failure and increased nproc for our Nova user to what I considered a ridiculous 120,000 threads after reading that li
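
The kind of limit bump being described would look something like this (file name and values are illustrative):

    # current per-user process/thread cap for the nova user
    sudo -u nova bash -c 'ulimit -u'
    # raise it persistently via pam_limits
    echo 'nova soft nproc 120000' >> /etc/security/limits.d/91-nova.conf
    echo 'nova hard nproc 120000' >> /etc/security/limits.d/91-nova.conf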

Re: [ceph-users] ceph-mon listens on wrong interface

2013-08-26 Thread Sage Weil
On Mon, 26 Aug 2013, Fuchs, Andreas (SwissTXT) wrote: > Hi Sage > > Thanks for your answer > > I had ceph.conf already adjusted > mon_hosts has the list of public ip's of the mon servers > > but ceph-mon is listening on eth0 instead of the ip listed in mon_hosts > > also entering [mon.ceph-ce

Re: [ceph-users] librados pthread_create failure

2013-08-26 Thread Gregory Farnum
On Mon, Aug 26, 2013 at 9:24 AM, Greg Poirier wrote: > So, in doing some testing last week, I believe I managed to exhaust the > number of threads available to nova-compute. After some > investigation, I found the pthread_create failure and increased nproc for > our Nova user to what I

Re: [ceph-users] librados pthread_create failure

2013-08-26 Thread Greg Poirier
Gregs are awesome, apparently. Thanks for the confirmation. I know that threads are light-weight, it's just the first time I've ever run into something that uses them... so liberally. ^_^ On Mon, Aug 26, 2013 at 10:07 AM, Gregory Farnum wrote: > On Mon, Aug 26, 2013 at 9:24 AM, Greg Poirier >

Re: [ceph-users] Hardware recommendations

2013-08-26 Thread Martin B Nielsen
Hi Shain, Those R515s seem to mimic our servers (2U Supermicro w. 12x 3.5" bays and 2x 2.5" in the rear for OS). Since we need a mix of SSD & platter, we have 8x 4TB drives and 4x 500GB SSDs + 2x 250GB SSDs for the OS in each node (2x 8-port LSI 2308 in IT-mode). We've partitioned 10GB from each 4x 500GB

Re: [ceph-users] rbd in centos6.4

2013-08-26 Thread raj kumar
Thank you so much. This is helpful not only for me, but for all beginners. Raj On Fri, Aug 23, 2013 at 5:31 PM, Kasper Dieter wrote: > Once the cluster is created on Ceph server nodes with MONs and OSDs on it > you have to copy the config + auth info to the clients: > > #--- on server node, e.
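
The copy step in Dieter's quoted reply presumably amounts to something like this (the client hostname is an example):

    # on the server/admin node: push config and admin key to the client
    scp /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring \
        root@client1:/etc/ceph/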

Re: [ceph-users] 1 particular ceph-mon never jobs on 0.67.2

2013-08-26 Thread Travis Rhoden
Hi Sage, Thanks for the response. I noticed that as well, and suspected hostname/DHCP/DNS shenanigans. What's weird is that all nodes are identically configured. I also have monitors running on n0 and n12, and they come up fine, every time. Here's the mon_host line from ceph.conf: mon_initial

Re: [ceph-users] 1 particular ceph-mon never jobs on 0.67.2

2013-08-26 Thread Sage Weil
On Mon, 26 Aug 2013, Travis Rhoden wrote: > Hi Sage, > > Thanks for the response.  I noticed that as well, and suspected > hostname/DHCP/DNS shenanigans.  What's weird is that all nodes are > identically configured.  I also have monitors running on n0 and n12, and > they come up fine, every time.

Re: [ceph-users] 1 particular ceph-mon never jobs on 0.67.2

2013-08-26 Thread Travis Rhoden
Cool. So far I have tried: start on (local-filesystems and net-device-up IFACE=eth0) start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1) About to try: start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1 and started network-serv
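
Assembled into a full stanza, the variant being tried would read as below (the file path is the usual upstart location; this is a sketch, since the exact events a given box needs vary):

    # /etc/init/ceph-mon.conf (excerpt)
    start on (local-filesystems
              and net-device-up IFACE=eth0
              and net-device-up IFACE=eth1
              and started network-services)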

Re: [ceph-users] ceph-mon listens on wrong interface

2013-08-26 Thread Fuchs, Andreas (SwissTXT)
Hi Sage, many thanks for your answer. The cluster is now up and running and "talking" on the right interfaces. Regards Andi -Original Message- From: Sage Weil [mailto:s...@inktank.com] Sent: Monday, 26 August 2013 18:20 To: Fuchs, Andreas (SwissTXT) Cc: ceph-us...@ceph.com Subject: RE:

Re: [ceph-users] Hardware recommendations

2013-08-26 Thread Shain Miley
Martin, Thank you very much for sharing your insight on hardware options. This will be very useful for us going forward. Shain Shain Miley | Manager of Systems and Infrastructure, Digital Media | smi...@npr.org | 202.513.3649 From: Martin B Nielsen [mar...@uni

Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling

2013-08-26 Thread Samuel Just
Can you attach a log from the startup of one of the dumpling osds on your production machine (no need for logging, just need some of the information dumped on every boot)? libleveldb is leveldb. We've used leveldb for a few things since bobtail. If anything, the load on leveldb should be lighter

Re: [ceph-users] Optimal configuration to validate Ceph

2013-08-26 Thread Samuel Just
If you create a pool with size 1 (no replication), (2) should be somewhere around 3x the speed of (1) assuming the client workload has enough parallelism and is well distributed over objects (so a random rbd workload with a large queue depth rather than a small sequential workload with a small queu
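
A size-1 pool for such a comparison can be set up like this (pool name and PG count are illustrative):

    ceph osd pool create bench 128 128    # 128 placement groups
    ceph osd pool set bench size 1        # single copy, for testing only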

Re: [ceph-users] locking rbd device

2013-08-26 Thread Josh Durgin
On 08/26/2013 12:03 AM, Wolfgang Hennerbichler wrote: hi list, I realize there's a command called "rbd lock" to lock an image. Can libvirt use this to prevent virtual machines from being started simultaneously on different virtualisation containers? wogri Yes - that's the reason for lock co

Re: [ceph-users] bucket count limit

2013-08-26 Thread Samuel Just
As I understand it, that should actually help avoid bucket contention and thereby increase performance. Yehuda, anything to add? -Sam On Thu, Aug 22, 2013 at 7:08 AM, Mostowiec Dominik wrote: > Hi, > > I think about sharding s3 buckets in CEPH cluster, create bucket-per-XX (256 > buckets) or eve
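
A sketch of the bucket-per-XX scheme under discussion, routing each key to one of 256 buckets via a two-hex-digit hash prefix (all names are illustrative):

    key="logs/2013-08-26/app.log"
    suffix=$(printf '%s' "$key" | md5sum | cut -c1-2)   # 00..ff = 256 buckets
    bucket="mydata-$suffix"
    echo "PUT $key -> bucket $bucket"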

Re: [ceph-users] locking rbd device

2013-08-26 Thread Josh Durgin
On 08/26/2013 01:49 PM, Josh Durgin wrote: On 08/26/2013 12:03 AM, Wolfgang Hennerbichler wrote: hi list, I realize there's a command called "rbd lock" to lock an image. Can libvirt use this to prevent virtual machines from being started simultaneously on different virtualisation containers? w

Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling

2013-08-26 Thread Oliver Daudey
Hey Samuel, I have been trying to get it reproduced on my test-cluster and seem to have found a way. Try: `rbd bench-write test --io-threads 80 --io-pattern=rand'. On my test-cluster, this closely replicates what I see during profiling on my production-cluster, including the extra CPU-usage by l
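
Spelled out, the reproducer is (the image creation step is added here as an assumption):

    rbd create test --size 10240        # 10 GB scratch image
    rbd bench-write test --io-threads 80 --io-pattern rand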

Re: [ceph-users] Storage, File Systems and Data Scrubbing

2013-08-26 Thread Samuel Just
ceph-osd builds a transactional interface on top of the usual posix operations so that we can do things like atomically perform an object write and update the osd metadata. The current implementation requires our own journal and some metadata ordering (which is provided by the backing filesystem's

Re: [ceph-users] lvm for a quick ceph lab cluster test

2013-08-26 Thread Samuel Just
Seems reasonable to me. I'm not sure I've heard anything about using LVM under ceph. Let us know how it goes! -Sam On Wed, Aug 21, 2013 at 5:18 PM, Liu, Larry wrote: > Hi guys, > > I'm a newbie in ceph. Wonder if I can use 2~3 LVM disks on each server, > total 2 servers to run a quick ceph clus
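
A minimal LVM layout for such a test could be (device and sizes are examples):

    pvcreate /dev/sdb
    vgcreate cephvg /dev/sdb
    lvcreate -L 100G -n osd0 cephvg     # one LV per test OSD
    mkfs.xfs /dev/cephvg/osd0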

Re: [ceph-users] ceph-mon listens on wrong interface

2013-08-26 Thread Alfredo Deza
On Mon, Aug 26, 2013 at 3:41 PM, Fuchs, Andreas (SwissTXT) < andreas.fu...@swisstxt.ch> wrote: > Hi Sage > > Many thanks for your answer. The Cluster is now up and running and > "talking" on the right interfaces. > > Regards > Andi > > -Original Message- > From: Sage Weil [mailto:s...@inkt

Re: [ceph-users] osd/OSD.cc: 4844: FAILED assert(_get_map_bl(epoch, bl)) (ceph 0.61.7)

2013-08-26 Thread Samuel Just
This is the same osd, and hasn't been working in the mean time? Can your cluster operate without that osd? -Sam On Mon, Aug 19, 2013 at 2:05 PM, Olivier Bonvalet wrote: > On Monday, 19 August 2013 at 12:27 +0200, Olivier Bonvalet wrote: >> Hi, >> >> I have an OSD which crashes every time I try to st

Re: [ceph-users] Sequential placement

2013-08-26 Thread Samuel Just
I think rados bench is actually creating new objects with each IO. Can you paste in the command you used? -Sam On Tue, Aug 20, 2013 at 7:28 AM, daniel pol wrote: > Hi ! > > Ceph newbie here with a placement question. I'm trying to get a simple Ceph > setup to run well with sequential reads big pa
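
For comparison, an explicit invocation showing the defaults Sam mentions (pool name and duration are examples):

    # 16 concurrent 4MB writes, each to a newly created object
    rados bench -p testpool 60 write -t 16 --no-cleanup
    # sequential read-back of the objects written above
    rados bench -p testpool 60 seq -t 16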

Re: [ceph-users] radosgw subusers permission problem

2013-08-26 Thread Yehuda Sadeh
On Fri, Aug 23, 2013 at 5:31 AM, Mihály Árva-Tóth wrote: > Hello, > > I have an user with 3 subuser: > > { "user_id": "johndoe", > "display_name": "John Doe", > "email": "", > "suspended": 0, > "max_buckets": 1000, > "auid": 0, > "subusers": [ > { "id": "johndoe:readonly", >

Re: [ceph-users] osd/OSD.cc: 4844: FAILED assert(_get_map_bl(epoch, bl)) (ceph 0.61.7)

2013-08-26 Thread Stefan Priebe
I had the same problem and backported 6951d2345a5d837c3b14103bd4d8f5ee4407c937 to ceph. It fixes it for me. Greets, Stefan On 26.08.2013 23:10, Samuel Just wrote: This is the same osd, and hasn't been working in the mean time? Can your cluster operate without that osd? -Sam On Mon, Aug 19

Re: [ceph-users] osd/OSD.cc: 4844: FAILED assert(_get_map_bl(epoch, bl)) (ceph 0.61.7)

2013-08-26 Thread Samuel Just
I just backported that one and 2 related patches to cuttlefish head. -Sam On Mon, Aug 26, 2013 at 2:14 PM, Stefan Priebe wrote: > I had the same problem and backported > 6951d2345a5d837c3b14103bd4d8f5ee4407c937 to ceph. > > It fixes it for me. > > Greets, > Stefan > > On 26.08.2013 23:10, Samuel Just wrote:

Re: [ceph-users] Sequential placement

2013-08-26 Thread Gregory Farnum
In addition to that, Ceph uses full data journaling — if you have two journals on the OS drive then you'll be limited to what that OS drive can provide, divided by two (if you have two-copy happening). -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Aug 26, 2013 at 2:09
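
A back-of-envelope version of the same point (all numbers are assumed, purely for illustration):

    # OS drive sustains ~200 MB/s of writes and hosts 2 OSD journals:
    #   per-journal ceiling = 200 / 2 = 100 MB/s per OSD
    # with 2x replication, every client byte hits 2 journals, so
    #   client-visible throughput <= total journal bandwidth / 2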

Re: [ceph-users] cuttlefish operatiing a cluster(start ceph all) failed

2013-08-26 Thread Samuel Just
Usually you need to run the initctl stuff on the node the process is on to control the process. -Sam On Fri, Aug 16, 2013 at 12:28 AM, maoqi1982 wrote: > Hi list: > > After I deployed a cuttlefish (0.61.7) cluster on three nodes (OS Ubuntu > 12.04), one ceph-deploy node, one monitor node and a O
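
That is, something along these lines, run on the node hosting the daemon (the daemon id is an example):

    sudo initctl start ceph-osd id=0    # Ubuntu/upstart
    sudo service ceph start osd.0       # sysvinit equivalent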

Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling

2013-08-26 Thread Samuel Just
Saw your logs. I thought it might be enabling filestore_xattr_use_omap, but it isn't. PGLog::undirty() doesn't seem to be using very much cpu. -Sam On Mon, Aug 26, 2013 at 2:04 PM, Oliver Daudey wrote: > Hey Samuel, > > I have been trying to get it reproduced on my test-cluster and seem to > ha

Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling

2013-08-26 Thread Oliver Daudey
Hey Samuel, Nope, "PGLog::undirty()" doesn't use as much CPU as before, but I found it curious that it still showed up, as I thought you disabled it. As long as you can reproduce the abnormal leveldb CPU-usage. Let me know if I can help with anything. Regards, Oliver On ma, 2013-08-2

Re: [ceph-users] Optimal configuration to validate Ceph

2013-08-26 Thread Samuel Just
Up to the point that you saturate the network, sure. Note that rados bench defaults to 16 writes at a time, so I would not expect a single rados bench client with 16 concurrent writes to show linear scaling past 16 osds (perhaps 32 if you have replication enabled). For larger numbers of osds, you
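
Pushing past that default means more client-side concurrency, e.g. (values illustrative):

    rados bench -p testpool 60 write -t 64   # 64 concurrent ops from one client
    # or run several rados bench instances from different hosts in parallel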

Re: [ceph-users] paxos is_readable spam on monitors?

2013-08-26 Thread Joao Eduardo Luis
On 08/16/2013 10:40 PM, Jeppesen, Nelson wrote: Hello Ceph-users, Running Dumpling (upgraded yesterday) and several hours after the upgrade the following type of message repeated over and over in the logs. Started about 8 hours ago. 1 mon.1@0(leader).paxos(paxos active c 6005920..6006535) is_reada

Re: [ceph-users] Sequential placement

2013-08-26 Thread Chen, Xiaoxi
The "random" may come from ceph trunks. For RBD, Ceph trunk the image to 4M(default) objects, for Rados bench , it already 4M objects if you didn't set the parameters. So from XFS's view, there are lots of 4M files, in default, with ag!=1 (allocation group, specified during mkfs, default seems t

Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling

2013-08-26 Thread Matthew Anderson
Hi Guys, I'm having the same problem as Oliver with 0.67.2. CPU usage is around double that of the 0.61.8 OSDs in the same cluster, which appears to be causing the performance decrease. I did a perf comparison (not sure if I did it right but it seems ok). Both hosts are the same spec running Ubun

Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling

2013-08-26 Thread Samuel Just
I just pushed a patch to wip-dumpling-log-assert (based on current dumpling head). I had disabled most of the code in PGLog::check() but left an (I thought) innocuous assert. It seems that with (at least) g++ 4.6.3, STL list::size() is linear in the size of the list, so that assert actually traverses