[ceph-users] rbd recover tool for stopped ceph cluster

2015-02-04 Thread minchen
rbd recover tool is an offline tool to recover rbd images when the ceph cluster is stopped. It is useful when you urgently need to recover an rbd image from a broken ceph cluster. I have used a similar prototype tool to successfully recover a large rbd image in a ceph cluster whose scale is 900+ o

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Colombo Marco
Hi Christian, On 04/02/15 02:39, "Christian Balzer" wrote: >On Tue, 3 Feb 2015 15:16:57 + Colombo Marco wrote: > >> Hi all, >> I have to build a new Ceph storage cluster; after I've read the >> hardware recommendations and some mails from this mailing list I would >> like to buy these serv

[ceph-users] Question about output message and object update for ceph class

2015-02-04 Thread Dennis Chen
Hello, I wrote a ceph client using the rados lib to execute a function upon an object. CLIENT SIDE CODE === int main() { ... strcpy(in, "from client"); err = rados_exec(io, objname, "devctl", "devctl_op", in, strlen(in), out, 128); if (err < 0) { fprintf(stderr, "r
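
A minimal, self-contained sketch of such a client is below, for illustration only: it assumes a pool named "data" and an object named "testobj" (placeholders), and that an object class "devctl" with a method "devctl_op" is already loaded on the OSDs; only the class and method names come from the snippet above, and error handling is abbreviated.

    /* Hypothetical completion of the client above; build with: gcc client.c -lrados */
    #include <stdio.h>
    #include <string.h>
    #include <rados/librados.h>

    int main(void)
    {
        rados_t cluster;
        rados_ioctx_t io;
        char in[128] = "from client";
        char out[128];
        int err;

        if (rados_create(&cluster, NULL) < 0) return 1;   /* connect as client.admin */
        rados_conf_read_file(cluster, NULL);              /* default ceph.conf search path */
        if (rados_connect(cluster) < 0) return 1;
        if (rados_ioctx_create(cluster, "data", &io) < 0) /* pool name is a placeholder */
            goto out_shutdown;

        /* invoke class "devctl", method "devctl_op" on object "testobj" */
        err = rados_exec(io, "testobj", "devctl", "devctl_op",
                         in, strlen(in), out, sizeof(out));
        if (err < 0)
            fprintf(stderr, "rados_exec failed: %d\n", err);
        else
            printf("rados_exec returned %d bytes of output\n", err);

        rados_ioctx_destroy(io);
    out_shutdown:
        rados_shutdown(cluster);
        return 0;
    }

On success rados_exec returns the number of bytes placed in the out buffer; as the follow-up message below notes, write-type class methods do not return data to the client, so for them the out buffer stays empty.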

Re: [ceph-users] Question about output message and object update for ceph class

2015-02-04 Thread Dennis Chen
I take back the question, because I just found that for a successful write operation in the class there is *no* data in the out buffer... On Wed, Feb 4, 2015 at 5:44 PM, Dennis Chen wrote: > Hello, > > I wrote a ceph client using the rados lib to execute a function upon an object. > > CLIENT SIDE CODE > ===

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Nick Fisk
If it's of any interest, we are building our cluster with these:- http://www.supermicro.nl/products/system/4U/F617/SYS-F617H6-FTPT_.cfm It seemed to us that with 2U servers quite a fair chunk of the cost goes on the metal case and redundant power supplies. The Storage optimised Fat Twin seemed

Re: [ceph-users] Monitor Restart triggers half of our OSDs marked down

2015-02-04 Thread Christian Eichelmann
Hi Greg, the behaviour is indeed strange. Today I was trying to reproduce the problem, but no matter which monitor I restarted, no matter how many times, the behaviour was as expected: a new monitor election was called and everything continued normally. Then I continued my failover tests and

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Udo Lembke
Hi Marco, On 04.02.2015 10:20, Colombo Marco wrote: ... > We chose the 6TB disks because we need a lot of storage in a small > number of servers and we prefer servers without too many disks. > However we plan to use max 80% of a 6TB disk > 80% is too much! You will run into trouble. Ceph

[ceph-users] snapshoting on btrfs vs xfs

2015-02-04 Thread Cristian Falcas
Hi, We have an openstack installation that uses ceph as the storage backend. We mainly use snapshot and boot-from-snapshot from an original instance with a 200 GB disk. Something like this: 1. import original image 2. make volume from image (those 2 steps were done only once, when we installed ope
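
For context, the RBD layering operations that typically sit underneath such a boot-from-snapshot workflow look roughly like the following; pool and image names here are placeholders, not taken from the message:

    # rough rbd-level equivalent of the workflow (names are placeholders)
    rbd import fedora.img volumes/original           # 1. import the original image
    rbd snap create volumes/original@base            # 2. snapshot the volume made from it
    rbd snap protect volumes/original@base           #    clones require a protected snapshot
    rbd clone volumes/original@base volumes/test-vm  # 3. instant copy-on-write clone to boot from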

[ceph-users] PG to pool mapping?

2015-02-04 Thread Chad William Seys
Hi all, How do I determine which pool a PG belongs to? (Also, is it the case that all objects in a PG belong to one pool?) Thanks! C.

Re: [ceph-users] snapshoting on btrfs vs xfs

2015-02-04 Thread Sage Weil
On Wed, 4 Feb 2015, Cristian Falcas wrote: > Hi, > > We have an openstack installation that uses ceph as the storage backend. > > We use mainly snapshot and boot from snapshot from an original > instance with a 200gb disk. Something like this: > 1. import original image > 2. make volume from imag

Re: [ceph-users] PG to pool mapping?

2015-02-04 Thread Gregory Farnum
On Wed, Feb 4, 2015 at 1:20 PM, Chad William Seys wrote: > Hi all, >How do I determine which pool a PG belongs to? >(Also, is it the case that all objects in a PG belong to one pool?) PGs are of the form "1.a2b3c4". The part prior to the period is the pool ID; the part following distingui
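
To turn that numeric prefix into a pool name you can list the pools with their IDs; a rough example follows, where the pool names and IDs are only illustrative:

    # list pool IDs and names; the ID matches the part of the PG ID before the period
    ceph osd lspools
    # e.g. output "0 data,1 metadata,2 rbd," means PG 2.1a2b belongs to pool "rbd"

    # or dump a brief PG listing and filter on the pool prefix
    ceph pg dump pgs_brief | grep '^2\.'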

Re: [ceph-users] PG to pool mapping?

2015-02-04 Thread Lincoln Bryant
On Feb 4, 2015, at 3:27 PM, Gregory Farnum wrote: > On Wed, Feb 4, 2015 at 1:20 PM, Chad William Seys > wrote: >> Hi all, >> How do I determine which pool a PG belongs to? >> (Also, is it the case that all objects in a PG belong to one pool?) > > PGs are of the form "1.a2b3c4". The part prio

Re: [ceph-users] snapshoting on btrfs vs xfs

2015-02-04 Thread Cristian Falcas
Thank you for the clarifications. We will try to report back, but I'm not sure our use case is relevant. We are trying to use every dirty trick to speed up the VMs. We have only 1 replica, and 2 pools. One pool with journal on disk, where the original instance exists (we want to keep this one sa

Re: [ceph-users] snapshoting on btrfs vs xfs

2015-02-04 Thread Daniel Schwager
Hi Cristian, > We will try to report back, but I'm not sure our use case is relevant. > We are trying to use every dirty trick to speed up the VMs. We have the same use case. > The second pool is for the test machines and has the journal in ram, > so this part is very volatile. We don't really

Re: [ceph-users] snapshoting on btrfs vs xfs

2015-02-04 Thread Lindsay Mathieson
On 5 February 2015 at 07:22, Sage Weil wrote: > > Is the snapshoting performed by ceph or by the fs? Can we switch to > > xfs and have the same capabilities: instant snapshot + instant boot > > from snapshot? > > The feature set and capabilities are identical. The difference is that on > btrfs w

[ceph-users] RGW put file question

2015-02-04 Thread baijia...@126.com
When a put file fails and the function "RGWRados::cls_obj_complete_cancel" runs, why do we use CLS_RGW_OP_ADD rather than CLS_RGW_OP_CANCEL? And why do we set poolid to -1 and epoch to 0? baijia...@126.com

Re: [ceph-users] ceph Performance random write is more then sequential

2015-02-04 Thread Alexandre DERUMIER
Hi, >>What I saw after enabling RBD cache it is working as expected, meaning >>sequential write has better MBps than random write. Can somebody explain this >>behaviour ? This is because rbd_cache merges and coalesces IOs into bigger IOs, so it only helps with sequential workloads. You'll do less i
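
For reference, the cache being discussed is configured on the client side in ceph.conf; a typical [client] section looks something like the sketch below. The values are examples rather than recommendations, and only the option names are meant to be taken literally.

    [client]
        rbd cache = true
        rbd cache size = 33554432                   # 32 MB cache per image
        rbd cache max dirty = 25165824              # dirty-data limit that triggers write-back
        rbd cache writethrough until flush = true   # stay write-through until the guest flushes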

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Christian Balzer
Hello, On Wed, 4 Feb 2015 09:20:24 + Colombo Marco wrote: > Hi Christian, > > > > On 04/02/15 02:39, "Christian Balzer" wrote: > > >On Tue, 3 Feb 2015 15:16:57 + Colombo Marco wrote: > > > >> Hi all, > >> I have to build a new Ceph storage cluster; after I've read the > >> hardware

Re: [ceph-users] RGW put file question

2015-02-04 Thread baijia...@126.com
When I put the same file with multiple threads, sometimes putting the file head oid ("ref.ioctx.operate(ref.oid, &op);") returns -ECANCELED. I think this is normal, but the function jumps to done_cancel and runs complete_update_index_cancel (or index_op.cancel()), while the osd executes rgw_bucket_complete_op with

Re: [ceph-users] snapshoting on btrfs vs xfs

2015-02-04 Thread Cristian Falcas
We want to use this script as a service for start/stop (but it wasn't tested yet): #!/bin/bash # chkconfig: - 50 90 # description: make a journal for osd.0 in ram start () { -f /dev/shm/osd.0.journal || ceph-osd -i 0 --mkjournal } stop () { service ceph stop osd.0 && ceph-osd -i osd.0 --flush-j
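
For reference, a fuller sketch of that init script is below, still assuming osd.0 and a journal path under /dev/shm as in the message, with a test in front of the -f check and the OSD id passed to -i numerically; it is untested in the same spirit as the original.

    #!/bin/bash
    # chkconfig: - 50 90
    # description: recreate/flush the RAM journal for osd.0

    JOURNAL=/dev/shm/osd.0.journal

    start() {
        # the journal lives in RAM, so recreate it after every reboot
        test -f "$JOURNAL" || ceph-osd -i 0 --mkjournal
    }

    stop() {
        # flush the in-RAM journal to the object store before it disappears
        service ceph stop osd.0 && ceph-osd -i 0 --flush-journal
    }

    case "$1" in
        start) start ;;
        stop)  stop ;;
        *)     echo "Usage: $0 {start|stop}"; exit 1 ;;
    esac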

[ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi all, is there any command to flush the rbd cache like the "echo 3 > /proc/sys/vm/drop_caches" for the os cache? Udo

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Dan Mick
On 02/04/2015 10:44 PM, Udo Lembke wrote: > Hi all, > is there any command to flush the rbd cache like the > "echo 3 > /proc/sys/vm/drop_caches" for the os cache? > > Udo Do you mean the kernel rbd or librbd? The latter responds to flush requests from the hypervisor. The former...I'm not sure i

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Josh Durgin
On 02/05/2015 07:44 AM, Udo Lembke wrote: Hi all, is there any command to flush the rbd cache like the "echo 3 > /proc/sys/vm/drop_caches" for the os cache? librbd exposes it as rbd_invalidate_cache(), and qemu uses it internally, but I don't think you can trigger that via any user-facing qemu
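
For completeness, calling it from your own code looks roughly like the sketch below; pool and image names are placeholders, and this only helps when your tool opens the image itself, since it cannot reach into a running qemu process's cache.

    /* Sketch: flush and invalidate the librbd cache for an image we open ourselves.
     * Build with: gcc flush.c -lrados -lrbd. "rbd"/"vm-disk" are placeholder names. */
    #include <stdio.h>
    #include <rados/librados.h>
    #include <rbd/librbd.h>

    int main(void)
    {
        rados_t cluster;
        rados_ioctx_t io;
        rbd_image_t image;

        if (rados_create(&cluster, NULL) < 0) return 1;
        rados_conf_read_file(cluster, NULL);
        if (rados_connect(cluster) < 0) return 1;
        if (rados_ioctx_create(cluster, "rbd", &io) < 0) goto out;

        if (rbd_open(io, "vm-disk", &image, NULL) == 0) {
            rbd_flush(image);                       /* write back any dirty cache data first */
            int err = rbd_invalidate_cache(image);  /* then drop the now-clean cache */
            printf("rbd_invalidate_cache returned %d\n", err);
            rbd_close(image);
        }
        rados_ioctx_destroy(io);
    out:
        rados_shutdown(cluster);
        return 0;
    }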

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Dan, I mean qemu-kvm, i.e. librbd. But how can kvm be told to flush the buffer? Udo On 05.02.2015 07:59, Dan Mick wrote: > On 02/04/2015 10:44 PM, Udo Lembke wrote: >> Hi all, >> is there any command to flush the rbd cache like the >> "echo 3 > /proc/sys/vm/drop_caches" for the os cache? >> >>

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Josh, thanks for the info. detach/reattach should be fine for me, because it's only for performance testing. #2468 would be fine of course. Udo On 05.02.2015 08:02, Josh Durgin wrote: > On 02/05/2015 07:44 AM, Udo Lembke wrote: >> Hi all, >> is there any command to flush the rbd cache like

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Dan Mick
I don't know the details well; I know the device itself supports the block-device-level cache-flush commands (I know there's a SCSI-specific one but I don't know offhand if there's a device generic one) so the guest OS can, and does, request flushing. I can't remember if there's also a qemu comman
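
In other words, the flush has to be requested from inside the guest. Assuming the virtual disk is attached with cache=writeback and flushes enabled, something as simple as the following (the device name is only an example) makes the guest kernel write back dirty data and issue cache-flush requests that qemu hands down to librbd:

    # run inside the guest VM
    sync                            # write back dirty pages; filesystems follow up with flush requests
    blockdev --flushbufs /dev/vda   # additionally flush the guest's buffers for this block device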

Re: [ceph-users] ceph Performance random write is more then sequential

2015-02-04 Thread Sumit Gaur
Yes, so far I have tried both options, and in both cases I am able to get better sequential performance than random (as explained by Somnath). *But* the performance numbers (iops, MBps) are way lower than with the default option. I can understand that, as ceph is dealing with 1000 times more objects than defau