Re: [ceph-users] Ceph + VMWare

2016-10-05 Thread Daniel Schwager
Hi all, we are using Ceph (jewel 10.2.2, 10GBit Ceph frontend/backend, 3 nodes, each with 8 OSDs and 2 journal SSDs) in our VMware environment, especially for test environments and templates - but currently not for production machines (because of missing FC-redundancy & performance). On our Linux

Re: [ceph-users] CephFS: No space left on device

2016-10-05 Thread Yan, Zheng
On Wed, Oct 5, 2016 at 2:27 PM, Mykola Dvornik wrote: > Hi Zheng, > > Many thanks for your reply. > > This indicates the MDS metadata is corrupted. Did you do any unusual > operation on the cephfs? (e.g. reset journal, create new fs using > existing metadata pool) > > No, nothing has been explicitly

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Christian Balzer
It would really help to have a better understanding of your application's needs, IOPS versus bandwidth, etc. If for example your DB transactions are small but plentiful (something like 2000 transactions per second) against a well-defined and not too large working set, and all your other I/O nee

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Christian Balzer
Hello, On Wed, 05 Oct 2016 13:43:27 +0200 Denny Fuchs wrote: > hi, > > I got a call from Mellanox and we now have an offer for the following > network: > > * 2 x SN2100 100Gb/s Switch 16 ports Which incidentally is a half-sized (identical HW really) Arctica 3200C. > * 10 x ConnectX 4LX-EN 25Gb

[ceph-users] The principle of config Federated Gateways

2016-10-05 Thread Brian Chang-Chien
Hi all, I have a question about configuring the federated gateway. Why is data and metadata only synced between zones in the same region, while only metadata is synced between zones in different regions? Why can't zone data be synced across regions - can anyone tell me the concern behind this? Thx

Re: [ceph-users] Ceph consultants?

2016-10-05 Thread Alan Johnson
I did have some similar issues and resolved them by installing parted 3.2 (I can't say if this was definitive), but it worked for me. I also only used create (after disk zap) rather than prepare/activate. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Steve Taylor Sent: W

Re: [ceph-users] Ceph consultants?

2016-10-05 Thread Steve Taylor
Try using 'ceph-deploy osd create' instead of 'ceph-deploy osd prepare' and 'ceph-deploy osd activate' when using an entire disk for an OSD. That will create a journal partition and co-locate your journal on the same disk with the OSD, but that's fine for an initial dev setup.
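A minimal sketch of that workflow, assuming a hypothetical host osd-node1 and a blank disk /dev/sdb:

    # wipe the disk, then let ceph-deploy partition it and co-locate the journal on it
    ceph-deploy disk zap osd-node1:/dev/sdb
    ceph-deploy osd create osd-node1:/dev/sdb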

[ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-05 Thread Andrei Mikhailovsky
Hello everyone, I've just updated my ceph to version 10.2.3 from 10.2.2 and I am no longer able to start the radosgw service. When executing I get the following error: 2016-10-05 22:14:10.735883 7f1852d26a00 0 ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b), process radosgw, pi
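One way to surface the underlying error is to run the gateway in the foreground with verbose logging; a hedged sketch, assuming an instance named client.rgw.gateway (substitute your own rgw section name):

    # run radosgw in the foreground, logging to stderr, with verbose rgw and messenger debugging
    radosgw -d --name client.rgw.gateway --debug-rgw=20 --debug-ms=1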

Re: [ceph-users] Ceph consultants?

2016-10-05 Thread Tracy Reed
On Wed, Oct 05, 2016 at 01:17:52PM PDT, Peter Maloney spake thusly: > What do you need help with specifically? Setting up ceph isn't very > complicated... just fixing it when things go wrong should be. What type > of scale are you working with, and do you already have hardware? Or is > the problem

Re: [ceph-users] Ceph consultants?

2016-10-05 Thread Peter Maloney
What do you need help with specifically? Setting up ceph isn't very complicated... just fixing it when things go wrong should be. What type of scale are you working with, and do you already have hardware? Or is the problem more to do with integrating it with clients? On 10/05/16 20:16, Erick Perez

Re: [ceph-users] Recovery/Backfill Speedup

2016-10-05 Thread Dan Jakubiec
Thanks Ronny, I am working with Reed on this problem. Yes, something is very strange. Docs say osd_max_backfills defaults to 10, but when we examined the run-time configuration using "ceph --show-config" it was showing osd_max_backfills set to 1 (we are running the latest Jewel release). We have e
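A sketch of how to inspect and raise that throttle at runtime (osd.0 stands in for any OSD, and the value 4 is only an illustration):

    # query what a running OSD actually uses (run on the node hosting osd.0)
    ceph daemon osd.0 config get osd_max_backfills
    # raise backfill/recovery throttles on all OSDs without restarting them
    ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 4'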

Re: [ceph-users] Ceph + VMWare

2016-10-05 Thread Oliver Dzombic
Hi Patrick, we are currently trying to get ceph running with it for a customer. ( Means our stuff = cephfs, customer stuff = vmware on ONE ceph cluster ). Unluckily iscsi sucks ( one OSD fails = iscsi lock -> need to restart the iscsi daemon on the ceph servers ). NFS sucks ( no natural HA ). So if you ca

[ceph-users] Ceph + VMWare

2016-10-05 Thread Patrick McGarry
Hey guys, Starting to buckle down a bit in looking at how we can better set up Ceph for VMWare integration, but I need a little info/help from you folks. If you currently are using Ceph+VMWare, or are exploring the option, I'd like some simple info from you: 1) Company 2) Current deployment size

Re: [ceph-users] Ceph consultants?

2016-10-05 Thread Erick Perez - Quadrian Enterprises
I'm interested too! Remote ceph services. Erick Perez On 10/05/2016 1:07 p.m., "Tracy Reed" wrote: > > Hello all, > > Any independent Ceph consultants out there? We have been trying to get Ceph > going and it's been very slow going. We don't have anything working yet > a

[ceph-users] Ceph consultants?

2016-10-05 Thread Tracy Reed
Hello all, Any independent Ceph consultants out there? We have been trying to get Ceph going and it's been very slow going. We don't have anything working yet after a month! We really can't waste much more time on this by ourselves. At this point we're looking to pay someone for a few hours to

Re: [ceph-users] Adding OSD Nodes and Changing Crushmap

2016-10-05 Thread David Turner
One more note.. In case it wasn't obvious, make sure to modify the cluster's current crush map after you add the storage and not upload a previous version of the map that you showed a segment of in your inquiry.

Re: [ceph-users] Adding OSD Nodes and Changing Crushmap

2016-10-05 Thread David Turner
That is the correct modification to change the failure domain from osd to host. You can make the change to host from osd in your crush map any time after you add the 2 new storage nodes (It is important to have enough hosts to at least match your cluster's replica size before changing the crush
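For reference, a sketch of the usual decompile/edit/recompile cycle for that change (file names are illustrative):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # in crush.txt, change the replicated rule's
    #   step chooseleaf firstn 0 type osd
    # to
    #   step chooseleaf firstn 0 type host
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new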

[ceph-users] Adding OSD Nodes and Changing Crushmap

2016-10-05 Thread Mike Jacobacci
Hi, I just wanted to get a sanity check if possible, I apologize if my questions are stupid, I am still new to Ceph and I am feeling uneasy adding new nodes. Right now we have one OSD node with 10 OSD disks (plus 2 disks for caching) and this week we are going to add two more nodes with the same

Re: [ceph-users] Recovery/Backfill Speedup

2016-10-05 Thread Ronny Aasen
On 04.10.2016 16:31, Reed Dier wrote: Attempting to expand our small ceph cluster currently. Have 8 nodes, 3 mons, and went from a single 8TB disk per node to 2x 8TB disks per node, and the rebalancing process is excruciatingly slow. Originally at 576 PGs before expansion, and wanted to allow
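For reference, raising placement groups on an existing pool is a two-step change (pool name and target count are illustrative, and increases should be done gradually):

    ceph osd pool set rbd pg_num 1024
    # once the new PGs have been created, allow data to actually move
    ceph osd pool set rbd pgp_num 1024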

Re: [ceph-users] What's the current status of rbd_recover_tool ?

2016-10-05 Thread Jason Dillaman
It definitely isn't actively maintained. It was contributed nearly two years ago for a worst-case recovery when the Ceph cluster itself cannot be started but all the individual data objects are still readily available. For RBD disaster recovery, I would suggest taking a look at RBD mirroring suppo
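A minimal sketch of enabling RBD mirroring on Jewel (pool and image names are illustrative; a peer cluster and a running rbd-mirror daemon are assumed to be configured separately):

    # journaling is required for mirroring and depends on exclusive-lock
    rbd feature enable rbd/myimage exclusive-lock journaling
    # mirror everything in the pool, or use 'image' mode and enable per image
    rbd mirror pool enable rbd pool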

Re: [ceph-users] [EXTERNAL] Benchmarks using fio tool gets stuck

2016-10-05 Thread Mario Rodríguez Molins
Doing some tests using iperf, our network has a bandwidth among nodes of 940 Mbits/sec. According to our metrics of network usage in this cluster, hosts with OSDs have a peak traffic of about 200 Mbits/sec each and the client which runs FIO about 300 Mbits/sec. The network doesn't seem to be saturated
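For reference, the kind of point-to-point check that produces such a number (hostnames illustrative):

    # on the receiving node
    iperf -s
    # on the sending node
    iperf -c osd-node1 -t 30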

Re: [ceph-users] [EXTERNAL] Benchmarks using fio tool gets stuck

2016-10-05 Thread Will . Boege
Because you do not have segregated networks, the cluster traffic is most likely drowning out the FIO user traffic. This is especially exacerbated by the fact that it is only a 1gb link between the cluster nodes. If you are planning on using this cluster for anything other than testing, you’ll
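For reference, splitting the traffic is a ceph.conf change that has to be rolled out to the OSD nodes and followed by OSD restarts; a sketch with illustrative subnets:

    [global]
    public network  = 192.168.1.0/24
    # replication and backfill traffic moves to the second network
    cluster network = 192.168.2.0/24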

Re: [ceph-users] [EXTERNAL] Benchmarks using fio tool gets stuck

2016-10-05 Thread Mario Rodríguez Molins
Hi, Currently, we do not have a separate cluster network and our setup is: - 3 nodes for OSD with 1Gbps links. Each node is running a unique OSD daemon. Although we plan to increase the number of OSDs per host. - 3 virtual machines also with 1Gbps links, where each vm is running one monitor dae

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-05 Thread Stephen Horton
Hello Zheng This is my initial email containing ceph -s and session ls info. I will send cache dump shortly. Note that per John's suggestion, I have upgraded the offending clients to 4.8 kernel, so my cache dump will be current with these new clients. Thanks, Stephen Begin forwarded message:
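For reference, the information being exchanged here comes from the MDS admin socket on the MDS host (the daemon name mds.a is illustrative):

    ceph -s
    ceph daemon mds.a session ls
    # write the MDS cache contents to a file for inspection
    ceph daemon mds.a dump cache /tmp/mds-cache.dump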

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-05 Thread Stephen Horton
Clients are almost all idle, very little load on the cluster. I can see no errors or warnings in the client logs when the file share is unmounted. Thx! > On Oct 4, 2016, at 10:31 PM, Yan, Zheng wrote: > >> On Tue, Oct 4, 2016 at 11:30 PM, John Spray wrote: >>> On Tue, Oct 4, 2016 at 5:09 PM, S

Re: [ceph-users] [EXTERNAL] Benchmarks using fio tool gets stuck

2016-10-05 Thread Will . Boege
What does your network setup look like? Do you have a separate cluster network? Can you explain how you are performing the FIO test? Are you mounting a volume through krbd and testing that from a different server? On Oct 5, 2016, at 3:11 AM, Mario Rodríguez Molins mailto:mariorodrig...@tuenti.
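A sketch of the kind of test being asked about, assuming a scratch image rbd/bench mapped via krbd on a separate client (names and sizes are illustrative; this writes to the device, so do not point it at an image holding data):

    rbd create rbd/bench --size 10240
    rbd map rbd/bench        # typically shows up as /dev/rbd0
    fio --name=randwrite --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
        --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based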

Re: [ceph-users] Merging CephFS data pools

2016-10-05 Thread Burkhard Linke
Hi, On 10/05/2016 02:18 PM, Yan, Zheng wrote: On Wed, Oct 5, 2016 at 5:06 PM, Burkhard Linke wrote: Hi, I've managed to move the data from the old pool to the new one using some shell scripts and cp/rsync. Recursive getfattr on the mount point does not reveal any file with a layout referring to

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Denny Fuchs
hi, Even better than 10G, 25GbE is clocked faster than 10GbE, so you should see slightly lower latency vs 10G. Just make sure the kernel you will be using supports those NICs. ah, nice to know :-) Thanks for that hint ! cu denny ___ ceph-users ma

Re: [ceph-users] Merging CephFS data pools

2016-10-05 Thread Yan, Zheng
On Wed, Oct 5, 2016 at 5:06 PM, Burkhard Linke wrote: > Hi, > > I've managed to move the data from the old pool to the new one using some > shell scripts and cp/rsync. Recursive getfattr on the mount point does not > reveal any file with a layout referring to the old pool. > > Nonetheless 486 objects

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Denny Fuchs > Sent: 05 October 2016 12:43 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] 6 Node cluster with 24 SSD per node: > Hardwareplanning/ agreement > > hi, > > I get a

[ceph-users] Ceph Developer Monthly

2016-10-05 Thread Patrick McGarry
Hey cephers, Just another reminder that today at 12:30 EDT we will be holding this month's CDM. Http://wiki.ceph.com/Planning Please join us if you are doing active work on Ceph. Thanks! ___ ceph-users mailing list ceph-users@lists.ceph.com http://list

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Denny Fuchs
hi, I got a call from Mellanox and we now have an offer for the following network: * 2 x SN2100 100Gb/s Switch 16 ports * 10 x ConnectX 4LX-EN 25Gb card for hypervisor and OSD nodes * 4 x Adapter from Mellanox QSA to SFP+ port for interconnecting to our HP 2920 switches * 3 x Copper split cabl

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Denny Fuchs
Hi, On 05.10.2016 10:48, Christian Balzer wrote: The switch has nothing to do with IPoIB; as the name implies it's entirely native Infiniband with IP encoded onto it. Thus it benefits from fast CPUs. ahh, I suggested it ... :-) but from some documents from Mellanox I thought it has to be suppor

Re: [ceph-users] Merging CephFS data pools

2016-10-05 Thread Burkhard Linke
Hi, I've managed to move the data from the old pool to the new one using some shell scripts and cp/rsync. Recursive getfattr on the mount point does not reveal any file with a layout referring to the old pool. Nonetheless 486 objects are left in the pool: ... POOLS: NAME I
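For reference, the checks described above look roughly like this (mount point and pool names are illustrative):

    # list any file whose layout still points at the old pool
    getfattr -R -n ceph.file.layout.pool /mnt/cephfs 2>/dev/null | grep -B1 oldpool
    # inspect what is actually left behind in the old pool
    rados -p oldpool ls | head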

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-05 Thread Denny Fuchs
hi, With Xeon E3 1245's (3.6GHz with all 4 cores Turbo'd) and P3700 Journal with 10Gb networking I have managed to get it down to around 600-700us. Make sure you force P-States and C-states, as without that I was only getting about 2ms. I've written it in our buy/change list :-) Ah ok, fair do's.
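A sketch of one common way to pin the CPUs for this kind of latency testing, assuming the cpupower utility is installed (kernel boot parameters such as intel_idle.max_cstate are the more permanent alternative):

    # keep cores at maximum frequency and forbid deep C-states
    cpupower frequency-set -g performance
    cpupower idle-set -D 0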

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Christian Balzer
Hello, On Wed, 05 Oct 2016 10:18:19 +0200 Denny Fuchs wrote: > Hi and good morning, > > On 04.10.2016 17:19, Burkhard Linke wrote: > > >> * Storage NIC: 1 x Infiniband MCX314A-BCCT > >> ** I read that the ConnectX-3 Pro is better supported than the X-4 and a > >> bit cheaper > >> ** Switch: 2

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Denny Fuchs
Hi and good morning, On 04.10.2016 17:19, Burkhard Linke wrote: * Storage NIC: 1 x Infiniband MCX314A-BCCT ** I read that the ConnectX-3 Pro is better supported than the X-4 and a bit cheaper ** Switch: 2 x Mellanox SX6012 (56Gb/s) ** Active FC cables ** Maybe VPI is nice to have, but unsure.

[ceph-users] Benchmarks using fio tool gets stuck

2016-10-05 Thread Mario Rodríguez Molins
Hello, We are setting up a new Ceph cluster and doing some benchmarks on it. At this moment, our cluster consists of: - 3 nodes for OSD. In our current configuration, one daemon per node. - 3 nodes for monitors (MON). In two of these nodes, there is a metadata server (MDS). Benchmarks are perfor

[ceph-users] What's the current status of rbd_recover_tool ?

2016-10-05 Thread Bartłomiej Święcki
Hi, I'm currently checking the possibility of rbd image export in a disaster recovery scenario. I've found rbd-recovery-tool (https://github.com/ceph/ceph/tree/master/src/tools/rbd_recover_tool) which looks promising. However I couldn't make it work even with 0.80.9. Do you know what's the st