Re: [ceph-users] Can't start OSD

2014-08-08 Thread Karan Singh
Try to mark these OSDs IN: ceph osd in osd.12 osd.13 osd.14 osd.15. Then restart the OSD services. - Karan Singh - On 08 Aug 2014, at 00:55, O'Reilly, Dan wrote:
> # id  weight  type name           up/down  reweight
> -1    7.2     root default
> -2    1.8     host tm1cldosdl01
> 0     0
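Spelled out, this is (restart syntax assumes the sysvinit layout discussed elsewhere in this digest):

    ceph osd in osd.12 osd.13 osd.14 osd.15
    # then restart each daemon, e.g.:
    service ceph restart osd.12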

[ceph-users] nf_conntrack overflow crashes OSDs

2014-08-08 Thread Christian Kauhaus
Hi, today I'd like to share a severe problem we've found (and fixed) on our Ceph cluster. We're running 48 OSDs (8 per host). While restarting all OSDs on a host, the kernel's nf_conntrack table overflowed. This rendered all OSDs on that machine unusable. The symptoms were as follows. In the k
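A quick way to check whether the conntrack table is near its limit (standard Linux proc paths, not quoted from the post):

    cat /proc/sys/net/netfilter/nf_conntrack_count   # current entries
    cat /proc/sys/net/netfilter/nf_conntrack_max     # table limit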

Re: [ceph-users] nf_conntrack overflow crashes OSDs

2014-08-08 Thread Dan Van Der Ster
Hi Christian, This is good advice. Presumably we saw this issue before, since we have the following in our cluster’s puppet manifest:

    sysctl { "net.netfilter.nf_conntrack_max": val => "1024000", }
    sysctl { "net.nf_conntrack_max": val => "1024000", }

But I don’t remember when or how we discov
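The equivalent one-off tuning without Puppet (a sketch; the 1024000 value is the one from the manifest, the sysctl.d filename is illustrative):

    sysctl -w net.netfilter.nf_conntrack_max=1024000
    # persist across reboots
    echo 'net.netfilter.nf_conntrack_max = 1024000' > /etc/sysctl.d/90-conntrack.conf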

[ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Hi, we just had some new clients, and have suffered a very big degradation in Ceph performance for some reason (we are using CloudStack). I'm wondering if there is a way to monitor op/s or similar usage per connected client, so we can isolate the heavy client? Also, what is the general best practic
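For the cluster-wide aggregate (not per client), the op/s figure quoted later in this thread comes straight from the status output (standard ceph CLI):

    ceph -s   # one-shot; the client io line shows B/s rd, B/s wr, op/s
    ceph -w   # the same, streaming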

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Wido den Hollander
On 08/08/2014 01:51 PM, Andrija Panic wrote:
> Hi, we just had some new clients, and have suffered very big degradation in CEPH performance for some reasons (we are using CloudStack). I'm wondering if there is way to monitor OP/s or similar usage by client connected, so we can isolate the heavy c

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Thanks Wido, yes I'm aware of CloudStack in that sense, but would prefer some precise op/s per Ceph image at least... Will check CloudStack then... Thx
On 8 August 2014 13:53, Wido den Hollander wrote:
> On 08/08/2014 01:51 PM, Andrija Panic wrote:
>> Hi,
>> we just had some new clients,

Re: [ceph-users] nf_conntrack overflow crashes OSDs

2014-08-08 Thread Robert van Leeuwen
> today I'd like to share a severe problem we've found (and fixed) on our Ceph
> cluster. We're running 48 OSDs (8 per host). While restarting all OSDs on a
> host, the kernel's nf_conntrack table overflowed. This rendered all OSDs on
> that machine unusable.
It is also possible to specifically

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Wido den Hollander
On 08/08/2014 02:02 PM, Andrija Panic wrote:
> Thanks Wido, yes I'm aware of CloudStack in that sense, but would prefer some precise OP/s per ceph Image at least... Will check CloudStack then...
Ceph doesn't really know that since RBD is just a layer on top of RADOS. In the end the CloudStack h

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Hm, true... One final question, I might be a noob... "13923 B/s rd, 4744 kB/s wr, 1172 op/s" - what does this op/s represent? Is it classic IOPS (4k reads/writes) or something else? How much is too much :) - I'm familiar with SATA/SSD IO/s specs/tests, etc, but not sure what Ceph means by op/s - cou

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Dan Van Der Ster
Hi, Here’s what we do to identify our top RBD users. First, enable log level 10 for the filestore so you can see all the IOs coming from the VMs. Then use a script like this (used on a dumpling cluster): https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl to summarize t
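A sketch of those two steps (the injectargs form is standard ceph CLI; the log glob assumes default Ceph log paths):

    # 1. raise filestore logging on all OSDs
    ceph tell osd.\* injectargs '--debug-filestore 10'
    # 2. later, feed the (plain or gzipped) OSD logs to the script
    ./rbd-io-stats.pl /var/log/ceph/ceph-osd.*.log*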

Re: [ceph-users] Can't start OSD

2014-08-08 Thread O'Reilly, Dan
Nope. Nothing works. This is VERY frustrating. What happened:
- I rebooted the box, simulating a system failure.
- When the system came back up, ceph wasn't started, and the OSD volumes weren't mounted.
- I did a "service ceph start osd" and the ceph processes don
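A hypothetical manual workaround until boot-time activation is fixed (device node and OSD id are illustrative, not from the post):

    mount /dev/sdb1 /var/lib/ceph/osd/ceph-10   # mount the OSD data partition
    service ceph start osd.10                   # then start just that daemon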

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Hi Dan, thank you very much for the script, will check it out... no throttling so far, but I guess it will have to be done... This seems to read only gzipped logs? And since it's read-only, I guess it is safe to run on a production cluster now...? The script will also check for multiple OSDs as far as I

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Dan Van Der Ster
Hi, On 08 Aug 2014, at 15:55, Andrija Panic wrote:
> Hi Dan, thank you very much for the script, will check it out... no throttling so far, but I guess it will have to be done... This seems to read only gzipped logs?
Well it’s pretty simple, and it zcat’s each inp

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Thanks again, and btw, besides being Friday I'm also on vacation - so double the joy of troubleshooting performance problems :))) Thx :)
On 8 August 2014 16:01, Dan Van Der Ster wrote:
> Hi,
> On 08 Aug 2014, at 15:55, Andrija Panic wrote:
>> Hi Dan,
>> thank you very much for the scri

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Wido den Hollander
On 08/08/2014 03:44 PM, Dan Van Der Ster wrote:
> Hi, Here’s what we do to identify our top RBD users.
> First, enable log level 10 for the filestore so you can see all the IOs coming from the VMs. Then use a script like this (used on a dumpling cluster): https://github.com/cernceph/ceph-scripts/bl

Re: [ceph-users] nf_conntrack overflow crashes OSDs

2014-08-08 Thread Christian Kauhaus
On 08.08.2014 at 14:05, Robert van Leeuwen wrote:
> It is also possible to specifically not conntrack certain connections.
> e.g.
> iptables -t raw -A PREROUTING -p tcp --dport 6789 -j CT --notrack
Thanks Robert. This is really an interesting approach. We will test it. Regards Christian -- Di
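An extended (hypothetical) variant that also exempts the default OSD port range and the locally-originated return path; the 6800:7300 range is the Ceph default, not something given in the thread:

    iptables -t raw -A PREROUTING -p tcp --dport 6789 -j CT --notrack
    iptables -t raw -A PREROUTING -p tcp --dport 6800:7300 -j CT --notrack
    iptables -t raw -A OUTPUT -p tcp --sport 6789 -j CT --notrack
    iptables -t raw -A OUTPUT -p tcp --sport 6800:7300 -j CT --notrack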

Re: [ceph-users] Can't start OSD

2014-08-08 Thread German Anders
How about the logs? Is something there? ls /var/log/ceph/ German Anders
--- Original message ---
Subject: Re: [ceph-users] Can't start OSD
From: "O'Reilly, Dan"
To: Karan Singh
Cc: ceph-users@lists.ceph.com
Date: Friday, 08/08/2014 10:53
Nope. Nothing works. This is V
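A quick way to inspect the relevant daemon's log (paths are Ceph defaults; osd.10 is the OSD discussed later in the thread):

    ls -l /var/log/ceph/
    tail -n 100 /var/log/ceph/ceph-osd.10.log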

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Will do so definitively, thanks Wido and Dan... Cheers guys
On 8 August 2014 16:13, Wido den Hollander wrote:
> On 08/08/2014 03:44 PM, Dan Van Der Ster wrote:
>> Hi,
>> Here’s what we do to identify our top RBD users.
>> First, enable log level 10 for the filestore so you can see all the

Re: [ceph-users] Can't start OSD

2014-08-08 Thread O'Reilly, Dan
I’m afraid I don’t know exactly how to interpret this, but after a reboot:

2014-08-08 08:48:44.616005 7f0c3b1447a0  0 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74), process ceph-osd, pid 2978
2014-08-08 08:48:44.635680 7f0c3b1447a0  0 filestore(/var/lib/ceph/osd/ceph-10) mount d

Re: [ceph-users] Ceph runs great then falters

2014-08-08 Thread Chris Kitzmiller
On Aug 4, 2014, at 10:53 PM, Christian Balzer wrote:
> On Mon, 4 Aug 2014 15:11:39 -0400 Chris Kitzmiller wrote:
>> On Aug 2, 2014, at 12:03 AM, Christian Balzer wrote:
>>> On Fri, 1 Aug 2014 14:23:28 -0400 Chris Kitzmiller wrote:
>>>> I have 3 nodes each running a MON and 30 OSDs. ... W

[ceph-users] Apache on Trusty

2014-08-08 Thread Craig Lewis
Is anybody running Ubuntu Trusty, but using Ceph's apache 2.2 and fastcgi packages? I'm a bit of a Ubuntu noob. I can't figure out the correct /etc/apt/preferences.d/ configs to prioritize Ceph's version of the packages. I keep getting Ubuntu's apache 2.4 packages. Can somebody that has this w
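One (untested) pinning sketch for /etc/apt/preferences.d/; the origin hostname and package globs are assumptions that would need matching to the actual Ceph repo in sources.list:

    Package: apache2* libapache2-mod-fastcgi*
    Pin: origin gitbuilder.ceph.com
    Pin-Priority: 1001

A priority above 1000 lets apt install the pinned version even when it is older than what Ubuntu ships.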

[ceph-users] PGs stuck creating

2014-08-08 Thread Brian Rak
ceph version 0.80.4 (7c241cfaa6c8c068bc9da8578ca00b9f4fc7567f) I recently managed to cause some problems for one of our clusters: we had 1/3 of the OSDs fail and lose all the data. I removed all the failed OSDs from the crush map, and did 'ceph osd rm'. Once it finished recovering, I was lef

Re: [ceph-users] PGs stuck creating

2014-08-08 Thread Brian Rak
Ahh, figured it out. I hadn't removed the dead OSDs from the crush map, which was apparently confusing Ceph. I just did 'ceph osd crush rm XXX' for all of them, restarted all the online OSDs, and the PG got created!
On 8/8/2014 4:51 PM, Brian Rak wrote:
> ceph version 0.80.4 (7c241cfaa6c8c068b
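Spelled out with a hypothetical OSD id (the post's 'XXX' is not given), this is roughly:

    ceph osd crush rm osd.21    # repeat for each dead OSD still in the crush map
    service ceph restart osd    # then restart the surviving OSDs on each node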

Re: [ceph-users] [Ceph-community] working ceph.conf file?

2014-08-08 Thread Andrew Woodward
Dan, It is not necessary to specify the OSD data in ceph.conf anymore. Ceph has two auto-start functions besides this method.
udev rules: ceph uses a udev rule to scan and attempt to mount (and activate) partitions with a specific GUID set for the partition typecode:
> sgdisk --typecode=<>:<> /dev/
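A concrete (hedged) instance of that sgdisk call; the GUIDs are the partition types ceph-disk sets, while the partition numbers and device are illustrative:

    sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sdb   # OSD data
    sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb   # journal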

Re: [ceph-users] [Ceph-community] OSD won't restart after system boot

2014-08-08 Thread Andrew Woodward
Dan, I replied to your other thread. Please note that ceph-users is a better list for this type of subject matter. Also, please pop onto IRC #ceph on oftc.net, as you are more likely to be able to hash it out with folks there if you are desperate for help. On Fri, Aug 8, 2014 at 8:55 AM, O'

[ceph-users] Introductions

2014-08-08 Thread Zach Hill
Hi all, I'm Zach Hill, the storage lead at Eucalyptus. We're working on adding Ceph RBD support for our scale-out block storage (EBS API). Things are going well, and we've been happy with Ceph thus far. We are mostly a RHEL/CentOS shop, so any other tips there would be

Re: [ceph-users] Can't start OSD

2014-08-08 Thread Matt Harlum
Hi, Can you run: ls -lah /var/lib/ceph/osd/ceph-10/journal
It’s saying it can’t find the journal.
Regards, Matt
On 9 Aug 2014, at 12:51 am, O'Reilly, Dan wrote:
> I’m afraid I don’t know exactly how to interpret this, but after a reboot:
> 2014-08-08 08:48:44.616005 7f0c3b1447a0 0 ceph ve
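If that listing shows a dangling symlink (e.g. the journal device was renamed across the reboot), one hypothetical fix is to repoint it at a stable name for the journal partition; the target path below is illustrative:

    ls -lah /var/lib/ceph/osd/ceph-10/journal
    ln -sf /dev/disk/by-partuuid/<journal-partuuid> /var/lib/ceph/osd/ceph-10/journal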

Re: [ceph-users] [Ceph-community] working ceph.conf file?

2014-08-08 Thread Matt Harlum
One thing to add: I had a similar issue with manually created OSDs not coming back up after a reboot; they were being mounted but not started. To resolve this I had to create a file on each OSD called sysvinit.
Regards, Matt
On 9 Aug 2014, at 7:57 am, Andrew Woodward wrote:
> Dan,
> It is no
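For each OSD that should be managed by the sysvinit script, the marker is just an empty file in the OSD data directory (path assumes the default layout, osd.10 as an example):

    touch /var/lib/ceph/osd/ceph-10/sysvinit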

Re: [ceph-users] Introductions

2014-08-08 Thread debian Only
As far as I know, it is not recommended to run Ceph OSDs (RBD servers) on the same host as a VM hypervisor like KVM. On the other hand, the more services on one host, the harder maintenance becomes and the worse each service performs.
2014-08-09 7:33 GMT+07:00 Zach Hill:
> Hi all,
> I'm Zach Hill, the storage lead at Euc

[ceph-users] CRUSH map advice

2014-08-08 Thread John Morris
Our experimental Ceph cluster is performing terribly (with the operator to blame!), and while it's down to address some issues, I'm curious to hear advice about the following ideas. The cluster:
- two disk nodes (6 * CPU, 16GB RAM each)
- 8 OSDs (4 each)
- 3 monitors
- 10Gb front + back network