Re: [ceph-users] Weird problem with mkcephfs

2013-03-25 Thread Steve Carter
Although it doesn't attempt to log in to my other machines as I thought it was designed to do, and as I know it did the last time I built a cluster. Not sure what I'm doing wrong. -Steve On 03/23/2013 10:35 PM, Steve Carter wrote: I changed: for k in $dir/key.* to: for k in $dir/key* and it a
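(For context, a minimal shell sketch of the glob difference Steve describes; the file names below are made up for illustration and are not the actual mkcephfs temporaries:)

    # "key.*" only matches names that start with "key." (a literal dot),
    # while "key*" also matches names like "keyring.admin".
    mkdir -p /tmp/globdemo && cd /tmp/globdemo
    touch key.osd.0 keyring.admin
    for k in key.*; do echo "key.* matched: $k"; done   # -> key.osd.0 only
    for k in key*;  do echo "key*  matched: $k"; done   # -> key.osd.0 and keyring.admin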

[ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Chen, Xiaoxi
Hi list, We have hit and reproduced this issue several times: ceph will suicide because "FileStore: sync_entry timed out" after very heavy random IO on top of RBD. My test environment is: a 4-node ceph cluster with 20 HDDs for OSDs and 4 Intel
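(For readers hitting the same assert: the suicide is triggered by a FileStore sync timeout. A hedged sketch, assuming the bobtail-era option name "filestore commit timeout" with its 600 s default; raising it only buys time, it does not remove the write backlog:)

    # Assumption: "filestore commit timeout" is the knob behind
    # "FileStore: sync_entry timed out"; verify against your release's docs.
    cat >> /etc/ceph/ceph.conf <<'EOF'
    [osd]
        filestore commit timeout = 1200
    EOF
    # Restart the local OSDs so they pick up the new value
    # (init-script syntax varies by distro/release):
    /etc/init.d/ceph restart osd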

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Wolfgang Hennerbichler
Hi, this could be related to this issue here and has been reported multiple times: http://tracker.ceph.com/issues/3737 In short: They're working on it, they know about it. Wolfgang On 03/25/2013 10:01 AM, Chen, Xiaoxi wrote: > Hi list, > > We have hit and reproduce this issue for sev

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Chen, Xiaoxi
Hi Wolfgang, Thanks for the reply, but why is my problem related to issue #3737? I cannot find any direct link between them. I didn't turn on qemu cache and my qemu/VM works fine. Xiaoxi On 2013-3-25, 17:07, "Wolfgang Hennerbichler" wrote: > Hi, > > this could be related

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Wolfgang Hennerbichler
Hi Xiaoxi, sorry, I thought you were testing within VMs with caching turned on (I assumed; you didn't tell us if you really did run your benchmark within VMs and, if not, how you tested rbd outside of VMs). It just triggered an alarm in me because we had also experienced issues with benchmarking wit

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Chen, Xiaoxi
Hi, On 2013-3-25, 17:30, "Wolfgang Hennerbichler" wrote: > Hi Xiaoxi, > > sorry, I thought you were testing within VMs and caching turned on (I > assumed, you didn't tell us if you really did use your benchmark within > vms and if not, how you tested rbd outside of VMs). Yes, I was really testing within

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Wolfgang Hennerbichler
On 03/25/2013 10:35 AM, Chen, Xiaoxi wrote: > OK, but my VM didn't crash; it's the ceph-osd daemon that crashed. So is it safe for me > to say the issue I hit is a different issue (not #3737)? Yes, then it surely is a different issue. Actually you just said ceph crashed, no mention of an OSD, so it was ha

[ceph-users] kernel BUG when mapping unexisting rbd device

2013-03-25 Thread Dan van der Ster
Hi, Apologies if this is already a known bug (though I didn't find it). If we try to map a device that doesn't exist, we get an immediate and reproducible kernel BUG (see the P.S.). We hit this by accident because we forgot to add the --pool . This works: [root@afs245 /]# rbd map afs254-vicepa
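(A hedged illustration of the trigger; the pool name below is a placeholder since the original preview omits it, and the image name is taken from the truncated example:)

    # Works: the image is looked up in the pool it actually lives in
    rbd map --pool afs afs254-vicepa
    # Forgetting --pool makes the kernel client look for the image in the
    # default "rbd" pool, where it does not exist; on the affected kernel
    # that nonexistent mapping triggered the reported BUG
    rbd map afs254-vicepa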

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Chen, Xiaoxi
Rephrased to make it clearer. From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Chen, Xiaoxi Sent: March 25, 2013 17:02 To: 'ceph-users@lists.ceph.com' (ceph-users@lists.ceph.com) Cc: ceph-de...@vger.kernel.org Subject: [ceph-users] Ceph Crash at sync

Re: [ceph-users] Weird problem with mkcephfs

2013-03-25 Thread Sage Weil
The keyring.* vs key.* distinction in mkcephfs appears correct. Can you attach your ceph.conf? It looks a bit like no daemons are defined. sage On Mon, 25 Mar 2013, Steve Carter wrote: > Although it doesn't attempt to log in to my other machines as I thought it was > designed to do, as I kno

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Sage Weil
Hi Xiaoxi, On Mon, 25 Mar 2013, Chen, Xiaoxi wrote: > From ceph -w, ceph reports a very high ops rate (1+ /s), but > technically, 80 spindles can provide up to 150*80/2=6000 IOPS for 4K random > write. > > When digging into the code, I found that the OSD writes data to > Pagecac
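(The spindle arithmetic quoted above, spelled out; the 150 IOPS per HDD figure and the divide-by-two for 2x replication are the thread's own assumptions:)

    # ~150 4K random-write IOPS per HDD, 80 spindles, 2x replication:
    echo $((150 * 80 / 2))   # -> 6000 IOPS, far below the ops/s seen in ceph -w,
                             # which is consistent with writes landing in page
                             # cache before being synced to disk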

[ceph-users] SSD Capacity and Partitions for OSD Journals

2013-03-25 Thread Peter_Jung
Hi, I have a couple of HW provisioning questions in regard to SSDs for OSD journals. I'd like to provision 12 OSDs per node, and there are enough CPU cycles and memory. Each OSD is allocated one 3TB HDD for OSD data - these 12 * 3TB HDDs are non-RAID. For increasing access and (sequential)
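(A hedged sizing sketch for the journal partitions, using the rule of thumb from the Ceph docs of that era; the per-disk throughput figure is an assumption, not a measurement:)

    # osd journal size >= 2 * expected_throughput_MBps * filestore_max_sync_interval
    # Assume ~120 MB/s per 3TB SATA HDD and the default 5 s sync interval:
    echo $((2 * 120 * 5))   # -> 1200 MB per journal
    # Twelve such journals need only ~15 GB of SSD, so capacity is rarely the
    # constraint; the SSD's sequential write bandwidth and endurance (shared by
    # all 12 journals) usually matter more than its size.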

Re: [ceph-users] RadosGW fault tolerance

2013-03-25 Thread Rustam Aliyev
Hi Yehuda, Thanks for the reply; my comments are inline below. On 25/03/2013 04:32, Yehuda Sadeh wrote: On Sun, Mar 24, 2013 at 7:14 PM, Rustam Aliyev wrote: Hi, I was testing a RadosGW setup and observed strange behavior - RGW becomes unresponsive or won't start whenever cluster health is degraded (e

[ceph-users] v0.56.4 released

2013-03-25 Thread Sage Weil
There have been several important fixes that we've backported to bobtail that users are hitting in the wild. Most notably, there was a problem with pool names with - and _ that OpenStack users were hitting, and memory usage by ceph-osd and other daemons due to the trimming of in-memory logs. Th

Re: [ceph-users] v0.56.4 released

2013-03-25 Thread Sage Weil
On Mon, 25 Mar 2013, Sage Weil wrote: > There is one minor change (fix) in the output to the 'ceph osd tree > --format=json' command. Please see the full release notes. Greg just reminded me about one additional note about upgrades (that should hopefully affect noone): * The MDS disk format has

Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

2013-03-25 Thread Chen, Xiaoxi
Hi Sage, Thanks for your mail. After turning on filestore sync flush, it seems to work and the OSD process doesn't suicide any more. I had already disabled the flusher long ago, since both Mark's report and mine show that disabling the flusher seems to improve performance (so my original configuration is filestore_
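(A hedged ceph.conf fragment matching what Xiaoxi describes; the option names assume the bobtail-era FileStore settings and should be checked against your release:)

    cat >> /etc/ceph/ceph.conf <<'EOF'
    [osd]
        ; flusher already disabled in the original setup, for performance
        filestore flusher = false
        ; enabling sync flush is what stopped the sync_entry suicides here
        filestore sync flush = true
    EOF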

Re: [ceph-users] Weird problem with mkcephfs

2013-03-25 Thread Steve Carter
Sage, Sure, here you go:

[global]
    auth cluster required = cephx
    auth service required = cephx
    auth client required = cephx
    max open files = 4096
[mon]
    mon data = /data/${name}
    keyring = /data/${name}/keyring
[osd]
    osd data = /data/${name}
    keyring = /data/${name}
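(Sage's remark that "no daemons are defined" points at the missing per-daemon sections: mkcephfs -a only knows which hosts to ssh to from entries like the following. Host names and IDs here are placeholders:)

    cat >> /etc/ceph/ceph.conf <<'EOF'
    [mon.a]
        host = mon-host-1
    [osd.0]
        host = osd-host-1
    [osd.1]
        host = osd-host-2
    EOF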

Re: [ceph-users] SSD Capacity and Partitions for OSD Journals

2013-03-25 Thread Matthieu Patou
On 03/25/2013 04:07 PM, peter_j...@dell.com wrote: Hi, I have a couple of HW provisioning questions in regard to SSDs for OSD journals. I'd like to provision 12 OSDs per node and there are enough CPU cycles and memory. Each OSD is allocated one 3TB HDD for OSD data – these 12 * 3TB HDDs