[ceph-users] Kernel Module

2013-05-07 Thread Gandalf Corvotempesta
Do I need the kernel module? I'm planning an infrastructure for CephFS, QEmu and RGW. Will these need the kernel module, or is everything done in userspace? If I understood the docs properly, the only case where the kernel module is needed is for using an RBD block device directly from Linux, like a mount poin

Re: [ceph-users] EPEL packages for QEMU-KVM with rbd support?

2013-05-07 Thread Dan van der Ster
Hi Barry, On Mon, May 6, 2013 at 7:06 PM, Barry O'Rourke wrote: > Hi, > > I built a modified version of the fc17 package that I picked up from > koji [1]. That might not be ideal for you as fc17 uses systemd rather > than init, we use an in-house configuration management system which > handles se

Re: [ceph-users] HEALTH WARN: clock skew detected

2013-05-07 Thread Joao Eduardo Luis
On 05/06/2013 01:07 PM, Michael Lowe wrote: Um, start it? You must have synchronized clocks in a fault tolerant system (google Byzantine generals clock) and the way to do that is ntp, therefore ntp is required. On May 6, 2013, at 1:34 AM, Varun Chandramouli wrote: Hi Michael, Thanks for y

[ceph-users] Format of option string for rados_conf_set()

2013-05-07 Thread Guido Winkelmann
Hi, The API documentation for librados says that, instead of providing command line options or a configuration file, the rados object can also be configured by manually setting options with rados_conf_set() (or Rados::conf_set() for the C++ interface). This takes both the option and value as C-

Re: [ceph-users] Format of option string for rados_conf_set()

2013-05-07 Thread Wido den Hollander
On 05/07/2013 12:08 PM, Guido Winkelmann wrote: Hi, The API documentation for librados says that, instead of providing command line options or a configuration file, the rados object can also be configured by manually setting options with rados_conf_set() (or Rados::conf_set() for the C++ interfa
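
For reference, a minimal sketch of setting options through the librados C API, assuming the usual convention that option names match ceph.conf (underscores and spaces are interchangeable) and that values are always passed as plain strings, even for numeric or boolean options. The monitor address and keyring path below are placeholders:

    #include <rados/librados.h>
    #include <stdio.h>

    int main(void)
    {
        rados_t cluster;

        /* NULL id connects as client.admin */
        if (rados_create(&cluster, NULL) < 0) {
            fprintf(stderr, "rados_create failed\n");
            return 1;
        }

        /* Option names as in ceph.conf; values are always C strings. */
        rados_conf_set(cluster, "mon_host", "192.168.0.1:6789");
        rados_conf_set(cluster, "keyring", "/etc/ceph/ceph.client.admin.keyring");

        if (rados_connect(cluster) < 0) {
            fprintf(stderr, "rados_connect failed\n");
            return 1;
        }

        rados_shutdown(cluster);
        return 0;
    }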

Re: [ceph-users] RadosGW High Availability

2013-05-07 Thread Igor Laskovy
I tried to do that and put it behind round-robin DNS, but unfortunately only one host can serve requests from clients - the second host does not respond at all. I am not very familiar with apache, and the standard log files contain nothing helpful. Maybe this whole HA design is wrong? Has anybody solved HA for the Rados Gate
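
For comparison, a common layout for two gateways behind round-robin DNS gives each host its own client section in ceph.conf, with Apache on each host pointing at its local radosgw FastCGI socket. This is only a sketch; the section names, hostnames and paths are placeholders:

    [client.radosgw.gw1]
        host = gw1
        keyring = /etc/ceph/keyring.radosgw.gw1
        rgw socket path = /var/run/ceph/radosgw.gw1.sock
        log file = /var/log/ceph/radosgw.gw1.log

    [client.radosgw.gw2]
        host = gw2
        keyring = /etc/ceph/keyring.radosgw.gw2
        rgw socket path = /var/run/ceph/radosgw.gw2.sock
        log file = /var/log/ceph/radosgw.gw2.log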

Re: [ceph-users] 0.61 Cuttlefish released

2013-05-07 Thread Igor Laskovy
Hi, where can I read more about ceph-disk? On Tue, May 7, 2013 at 5:51 AM, Sage Weil wrote: > Spring has arrived (at least for some of us), and a new stable release of > Ceph is ready! Thank you to everyone who has contributed to this release! > > Bigger ticket items since v0.56.x "Bobtail":

Re: [ceph-users] HEALTH WARN: clock skew detected

2013-05-07 Thread Varun Chandramouli
Hi All, Thanks for the replies. I started the ntp daemon and the warnings as well as the crashes seem to have gone. This is the first time I set up a cluster (of physical machines), and was unaware of the need to synchronize the clocks. Probably should have googled it more :). Pardon my ignorance.
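
For reference, the fix really is just keeping ntpd running on every monitor host; the monitors' skew tolerance can also be relaxed slightly in ceph.conf if a small residual warning persists. The commands below are RHEL/CentOS style and the drift value is illustrative (the default allowance is on the order of 0.05 s):

    # one-off sync, then keep ntpd running across reboots
    ntpdate pool.ntp.org
    service ntpd start
    chkconfig ntpd on

    # ceph.conf -- only if small residual skew warnings persist
    [mon]
        mon clock drift allowed = 0.1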

Re: [ceph-users] EPEL packages for QEMU-KVM with rbd support?

2013-05-07 Thread Barry O'Rourke
Hi, I'm not using OpenStack, I've only really been playing around with Ceph on test machines. I'm currently speccing up my production cluster and will probably end up running it along with OpenNebula. Barry On 07/05/13 10:01, Dan van der Ster wrote: Hi Barry, On Mon, May 6, 2013 at 7:06 PM

[ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi, I'm looking to purchase a production cluster of 3 Dell Poweredge R515's which I intend to run in 3 x replication. I've opted for the following configuration; 2 x 6 core processors 32Gb RAM H700 controller (1Gb cache) 2 x SAS OS disks (in RAID1) 2 x 1Gb ethernet (bonded for cluster network

[ceph-users] scrub error: found clone without head

2013-05-07 Thread Dzianis Kahanovich
I have 4 scrub errors (3 PGs - "found clone without head") on one OSD, and they are not repairing. How can I repair this without re-creating the OSD? Right now it is "easy" to remove and re-create the OSD, but in theory - with multiple affected OSDs - it could cause data loss. -- WBR, Dzianis Kahanovich AKA Denis Kaganovich, http:/
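
For reference, the usual first attempt on a scrub inconsistency is a per-PG repair, although the poster reports that this particular "found clone without head" case is not being repaired; the pgid below is a placeholder:

    # show which PGs are inconsistent and on which OSDs they live
    ceph health detail
    ceph pg dump | grep inconsistent

    # ask the primary OSD to repair one PG (pgid is a placeholder)
    ceph pg repair 2.1f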

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Jens Kristian Søgaard
Hi, I'd be interested to hear from anyone running a similar configuration. I'm running a somewhat similar configuration here. I'm wondering why you have left out SSDs for the journals? I gather they would be quite important to achieve a level of performance for hosting 100 virtual machines

[ceph-users] Best practice for osd_min_down_reporters

2013-05-07 Thread Wido den Hollander
Hi, I was just upgrading a 9-node, 36-OSD cluster running the next branch from some days ago to the Cuttlefish release. While rebooting the nodes one by one and waiting for an active+clean state for all PGs I noticed that some weird things happened. I reboot a node and see: "osdmap e580: 36 osds

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Mike Lowe
FWIW, here is what I have for my ceph cluster: 4 x HP DL 180 G6 12Gb RAM P411 with 512MB Battery Backed Cache 10GigE 4 HP MSA 60's with 12 x 1TB 7.2k SAS and SATA drives (bought at different times so there is a mix) 2 HP D2600 with 12 x 3TB 7.2k SAS Drives I'm currently running 79 qemu/kvm vm's

Re: [ceph-users] Best practice for osd_min_down_reporters

2013-05-07 Thread Andrey Korolyov
Hi Wido, I experienced the same problem almost half a year ago and finally set this value to 3 - no more wrong marks were given, except under extremely high disk load, when an OSD really did go down for a couple of seconds. On Tue, May 7, 2013 at 4:59 PM, Wido den Hollander wrote: > Hi, > > I was just upgrad
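
A sketch of where that value lives: the setting is read by the monitors, and the usual ceph.conf spelling is "mon osd min down reporters", which appears to be the same option the subject refers to in underscore form:

    [mon]
        ; require reports from 3 different OSDs before marking a peer OSD down
        mon osd min down reporters = 3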

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi, I'm running a somewhat similar configuration here. I'm wondering why you have left out SSDs for the journals? I can't go into exact prices due to our NDA, but I can say that getting a couple of decent SSD disks from Dell will increase the cost per server by a four figure sum, and we're o

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Mark Nelson
On 05/07/2013 06:50 AM, Barry O'Rourke wrote: Hi, I'm looking to purchase a production cluster of 3 Dell Poweredge R515's which I intend to run in 3 x replication. I've opted for the following configuration; 2 x 6 core processors 32Gb RAM H700 controller (1Gb cache) 2 x SAS OS disks (in RAID1)

Re: [ceph-users] HEALTH WARN: clock skew detected

2013-05-07 Thread Mike Lowe
You've learned one of the three computer science facts you need to know about distributed systems, and I'm glad I could pass something on: 1. Consistent, Available, Distributed - pick any two 2. To completely guard against k failures where you don't know which one failed just by looking you need

Re: [ceph-users] HEALTH WARN: clock skew detected

2013-05-07 Thread Joao Eduardo Luis
On 05/07/2013 03:20 PM, Mike Lowe wrote: You've learned one of the three computer science facts you need to know about distributed systems, and I'm glad I could pass something on: 1. Consistent, Available, Distributed - pick any two To some degree of Consistent, Available and Distributed. :-P

Re: [ceph-users] Kernel Module

2013-05-07 Thread Gregory Farnum
On Tuesday, May 7, 2013, Gandalf Corvotempesta wrote: > Do I need the kernel module? I'm planning an infrastructure for > CephFS, QEmu and RGW. Will these need the kernel module, or is everything done > in userspace? > > If I understood the docs properly, the only case where the kernel module is > needed is fo

Re: [ceph-users] Best practice for osd_min_down_reporters

2013-05-07 Thread Gregory Farnum
On Tuesday, May 7, 2013, Wido den Hollander wrote: > Hi, > > I was just upgrading a 9-node, 36-OSD cluster running the next branch > from some days ago to the Cuttlefish release. > > While rebooting the nodes one by one and waiting for an active+clean state for > all PGs I noticed that some weird things

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Dave Spano
Barry, I have a similar setup and found that the 600GB 15K SAS drives work well. The 2TB 7200 disks did not work as well due to my not using SSD. Running the journal and the data on big slow drives will result in slow writes. All the big boys I've encountered are running SSDs. Currently, I'm u
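
For context, the journal location is just a per-OSD setting in ceph.conf, so moving journals onto an SSD is a matter of pointing each OSD at its own SSD partition. The device names below are placeholders, and the size line only applies to file-based journals:

    [osd]
        ; only used for file-based journals, size in MB
        osd journal size = 10240

    [osd.0]
        ; point the journal at a dedicated SSD partition (placeholder device)
        osd journal = /dev/sdd1

    [osd.1]
        osd journal = /dev/sdd2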

Re: [ceph-users] 0.61 Cuttlefish released

2013-05-07 Thread John Wilkins
Igor, I haven't closed out 3674, because I haven't covered that part yet. Chef docs are now in the wiki, but I'll be adding ceph-disk docs shortly. On Tue, May 7, 2013 at 3:25 AM, Igor Laskovy wrote: > Hi, > > where can I read more about ceph-disk? > > > On Tue, May 7, 2013 at 5:51 AM, Sage We
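
Until those docs land, a rough sketch of typical usage; the device names are placeholders and the exact subcommand names may differ slightly between releases. ceph-disk prepares a data disk, optionally placing the journal on a separate partition, and activation is normally triggered by udev:

    # partition and format a data disk, with the journal on an SSD partition
    ceph-disk prepare /dev/sdb /dev/sdd1

    # bring the prepared OSD up (udev usually does this automatically)
    ceph-disk activate /dev/sdb1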

[ceph-users] OSD crash during script, 0.56.4

2013-05-07 Thread Travis Rhoden
Hey folks, Saw this crash the other day: ceph version 0.56.4 (63b0f854d1cef490624de5d6cf9039735c7de5ca) 1: /usr/bin/ceph-osd() [0x788fba] 2: (()+0xfcb0) [0x7f19d1889cb0] 3: (gsignal()+0x35) [0x7f19d0248425] 4: (abort()+0x17b) [0x7f19d024bb8b] 5: (__gnu_cxx::__verbose_terminate_handler()+0x1

Re: [ceph-users] Mounting CephFS - mount error 5 = Input/output error

2013-05-07 Thread Wyatt Gorman
Here's the result of running ceph-mds -i a -d ceph-mds -i a -d 2013-05-07 13:33:11.816963 b732a710 0 starting mds.a at :/0 ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c), process ceph-mds, pid 9900 2013-05-07 13:33:11.824077 b4a1bb70 0 mds.-1.0 ms_handle_connect on 10.81.2.100:67

Re: [ceph-users] Kernel Module

2013-05-07 Thread Gandalf Corvotempesta
2013/5/7 Gregory Farnum : > As long as you're planning to use ceph-fuse for your filesystem access, you > don't need anything in the kernel. I will not use ceph-fuse but plain ceph-fs when it is production ready. Ceph-fs should not need the kernel module, right? _

Re: [ceph-users] Kernel Module

2013-05-07 Thread Gregory Farnum
To access CephFS you need to either use the kernel client or a userspace client. The userspace CephFS client is called ceph-fuse; if you want to use the kernel's built-in access then obviously you need it on your machine... -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue
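
Concretely, the two access paths look roughly like this; the hostname, mount point and secret file are placeholders:

    # userspace client: no kernel support needed beyond FUSE
    ceph-fuse -m mon1.example.com:6789 /mnt/cephfs

    # in-kernel client: requires the ceph module in the running kernel
    mount -t ceph mon1.example.com:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret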

Re: [ceph-users] scrub error: found clone without head

2013-05-07 Thread Dzianis Kahanovich
Dzianis Kahanovich writes: > I have 4 scrub errors (3 PGs - "found clone without head") on one OSD, and they are not > repairing. How can I repair this without re-creating the OSD? > > Right now it is "easy" to remove and re-create the OSD, but in theory - with multiple affected > OSDs - it could cause data loss. OOPS! After re-creat

Re: [ceph-users] Kernel Module

2013-05-07 Thread Gandalf Corvotempesta
Any performance penalty for both solutions? On 07 May 2013 19:40, "Gregory Farnum" wrote: > To access CephFS you need to either use the kernel client or a > userspace client. The userspace CephFS client is called ceph-fuse; if > you want to use the kernel's built-in access then obvi

Re: [ceph-users] Kernel Module

2013-05-07 Thread Gregory Farnum
It actually depends on what your accesses look like; they have different strengths and weaknesses. In general they perform about the same, though. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, May 7, 2013 at 10:53 AM, Gandalf Corvotempesta wrote: > Any performance pe

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Igor Laskovy
If I understand the idea correctly, when this one SSD fails, the whole node with that SSD will fail. Correct? What is the recovery scenario for the node in this case? Playing with "ceph-osd --flush-journal" and "ceph-osd --mkjournal" for each osd? On Tue, May 7, 2013 at 4:17 PM, Mark Nelson wrote: > On 05/07/201

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi, > With so few disks and the inability to do 10GbE, you may want to > consider doing something like 5-6 R410s or R415s and just using the > on-board controller with a couple of SATA disks and 1 SSD for the > journal. That should give you better aggregate performance since in > your case yo

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi, On Tue, 2013-05-07 at 21:07 +0300, Igor Laskovy wrote: > If I understand the idea correctly, when this one SSD fails, the whole node > with that SSD will fail. Correct? Only OSDs that use that SSD for the journal will fail, as they will lose any writes still in the journal. If I only have 2 OSDs sha
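
For the planned-replacement case, the commands Igor mentions are roughly used like this (osd id 3 is a placeholder); if the SSD dies uncleanly, the unflushed writes are gone and the affected OSDs generally have to be recreated and backfilled instead:

    # stop the OSD and flush its journal to the data disk before swapping the SSD
    service ceph stop osd.3
    ceph-osd -i 3 --flush-journal

    # after the new SSD partition is in place at the configured journal path
    ceph-osd -i 3 --mkjournal
    service ceph start osd.3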

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi, > Here's a quick performance display with various block sizes on a host > with 1 public 1Gbe link and 1 1Gbe link on the same vlan as the ceph > cluster. Thanks for taking the time to look into this for me, I'll compare it with my existing set-up in the morning. Thanks, Barry -- The Univ

Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Mark Nelson
On 05/07/2013 03:36 PM, Barry O'Rourke wrote: Hi, With so few disks and the inability to do 10GbE, you may want to consider doing something like 5-6 R410s or R415s and just using the on-board controller with a couple of SATA disks and 1 SSD for the journal. That should give you better aggregat

Re: [ceph-users] Rados Gateway Pools

2013-05-07 Thread Jeppesen, Nelson
Now that 0.61 is out I have tried getting a second radosgw farm working, but ran into an issue using a custom root/zone pool. The 'radosgw-admin zone set' and 'radosgw-admin zone info' commands are working fine except it keeps defaulting to using .rgw.root. I've tried the two settings, the one you g

Re: [ceph-users] Rados Gateway Pools

2013-05-07 Thread Yehuda Sadeh
On Tue, May 7, 2013 at 2:26 PM, Jeppesen, Nelson wrote: > Now that 0.61 is out I have tried getting a second radosgw farm working, but ran > into an issue using a custom root/zone pool. > > > > The ‘radosgw-admin zone set’ and ‘radosgw-admin zone info’ commands are > working fine except it keeps defaul

Re: [ceph-users] Rados Gateway Pools

2013-05-07 Thread Jeppesen, Nelson
The settings are under the rgw client settings [client.radosgw.internal.01] rgw root zone pool = .rgw.zone2 rgw cluster root pool = .rgw.zone2 I tried 'radosgw-admin zone set --rgw-root-zone-pool=.rgw.zone2 < zone2' and 'radosgw-admin zone info --rgw-root-zone-pool=.rgw.

Re: [ceph-users] Rados Gateway Pools

2013-05-07 Thread Yehuda Sadeh
On Tue, May 7, 2013 at 2:54 PM, Jeppesen, Nelson wrote: > The settings are under the rgw client settings > > [client.radosgw.internal.01] > rgw root zone pool = .rgw.zone2 > rgw cluster root pool = .rgw.zone2 > > I tried 'radosgw-admin zone set --rgw-root-zone

[ceph-users] CentOs kernel for ceph server

2013-05-07 Thread MinhTien MinhTien
Dear all, I deploy ceph on CentOS 6.3. When I upgraded to kernel 3.9.0, I had a few problems with the RAID card. I want to deploy ceph with the default kernel 2.6.32. *This is fine, isn't it*? The Ceph clients will use the latest kernel (3.9.0). Thanks and Regard

[ceph-users] ceph-deploy documentation fixes

2013-05-07 Thread Bryan Stillwell
With the release of cuttlefish, I decided to try out ceph-deploy and ran into some documentation errors along the way: http://ceph.com/docs/master/rados/deployment/preflight-checklist/ Under 'CREATE A USER' it has the following line: To provide full privileges to the user, add the following to

Re: [ceph-users] Rados Gateway Pools

2013-05-07 Thread Jeppesen, Nelson
Figured it out; in your post last month you were using 'rgw-zone-root-pool' but today you're using 'rgw-root-zone-pool'. I didn't notice that root and zone had switched and was using your old syntax. It's working now though. Thank you for your help again! Nelson Jeppesen -Original Messag

Re: [ceph-users] CentOs kernel for ceph server

2013-05-07 Thread Gregory Farnum
On Tue, May 7, 2013 at 4:45 PM, MinhTien MinhTien wrote: > Dear all, > > I deploy ceph on CentOS 6.3. When I upgraded to kernel 3.9.0, I had a few > problems with the RAID card. > > I want to deploy ceph with the default kernel 2.6.32. This is fine, isn't it? > > The Ceph clients will use the latest kernel (3.9

Re: [ceph-users] Cluster unable to finish balancing

2013-05-07 Thread Berant Lemmenes
So just a little update... after replacing the original failed drive, things seem to be progressing a little better; however, I noticed something else odd. Looking at 'rados df', it looks like the system thinks that the data pool has 32 TB of data, but this is only an 18TB raw system. pool name cat

[ceph-users] HELP: raid 6 - osd slow request

2013-05-07 Thread Lenon Join
Dear all, I deploy ceph on RAID 6. I have 1 SSD partition (RAID 0), which I use as the journal for the OSDs. The RAID 6 array contains 60TB, divided into 4 OSDs. When I deploy, the OSDs frequently report slow requests: 001080 [write 0~4194304 [5@0]] 0.72bf90bf snapc 1=[]) v4 currently commit sent 2013-05-08 10: