Re: [ceph-users] Fwd: Small fix for ceph.spec

2013-07-30 Thread Danny Al-Gaaf
Hi,

I think this is a bug in the packaging of the leveldb package in this case,
since the ceph spec file already sets a dependency on leveldb-devel.

leveldb depends on snappy; therefore the leveldb package should set a
dependency on snappy-devel for leveldb-devel (check the SUSE spec file
for leveldb:
https://build.opensuse.org/package/view_file/home:dalgaaf:ceph:extra/leveldb/leveldb.spec?expand=1).
This way the RPM build process will pick up the correct packages needed
to build ceph.
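
For illustration, the devel subpackage of such a leveldb spec would carry the
dependency roughly like this (a sketch only, not a verbatim copy of the SUSE
or Fedora packaging):

    %package devel
    Summary:        Development files for leveldb
    Requires:       %{name} = %{version}-%{release}
    Requires:       snappy-devel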

Which distro do you use?

Danny

Am 30.07.2013 01:33, schrieb Patrick McGarry:
> -- Forwarded message --
> From: Erik Logtenberg 
> Date: Mon, Jul 29, 2013 at 7:07 PM
> Subject: [ceph-users] Small fix for ceph.spec
> To: ceph-users@lists.ceph.com
> 
> 
> Hi,
> 
> The spec file used for building RPMs is missing a build-time dependency on
> snappy-devel. Please see the attached patch to fix it.
> 
> Kind regards,
> 
> Erik.
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Small fix for ceph.spec

2013-07-30 Thread Erik Logtenberg
Hi,

Fedora, in this case Fedora 19, x86_64.

Kind regards,

Erik.


On 07/30/2013 09:29 AM, Danny Al-Gaaf wrote:
> Hi,
> 
> I think this is a bug in the packaging of the leveldb package in this case,
> since the ceph spec file already sets a dependency on leveldb-devel.
> 
> leveldb depends on snappy; therefore the leveldb package should set a
> dependency on snappy-devel for leveldb-devel (check the SUSE spec file
> for leveldb:
> https://build.opensuse.org/package/view_file/home:dalgaaf:ceph:extra/leveldb/leveldb.spec?expand=1).
> This way the RPM build process will pick up the correct packages needed
> to build ceph.
> 
> Which distro do you use?
> 
> Danny
> 
> Am 30.07.2013 01:33, schrieb Patrick McGarry:
>> -- Forwarded message --
>> From: Erik Logtenberg 
>> Date: Mon, Jul 29, 2013 at 7:07 PM
>> Subject: [ceph-users] Small fix for ceph.spec
>> To: ceph-users@lists.ceph.com
>>
>>
>> Hi,
>>
>> The spec file used for building RPMs is missing a build-time dependency on
>> snappy-devel. Please see the attached patch to fix it.
>>
>> Kind regards,
>>
>> Erik.
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [PATCH] Add missing buildrequires for Fedora

2013-07-30 Thread Erik Logtenberg
Hi,

This patch adds two BuildRequires entries to the ceph.spec file that are
needed to build the RPMs under Fedora. Danny Al-Gaaf commented that the
snappy-devel dependency should actually be added to the leveldb-devel
package. I will try to get that fixed too; in the meantime, this patch
does make sure Ceph builds on Fedora.

Signed-off-by: Erik Logtenberg 
---
--- ceph.spec-orig	2013-07-30 00:24:54.70500 +0200
+++ ceph.spec	2013-07-30 01:20:23.59300 +0200
@@ -42,6 +42,8 @@
 BuildRequires:  libxml2-devel
 BuildRequires:  libuuid-devel
 BuildRequires:  leveldb-devel > 1.2
+BuildRequires:  snappy-devel
+BuildRequires:  junit
 
 #
 # specific
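
For anyone reproducing the build on Fedora, a rough outline (assuming the
usual rpmbuild setup; this is just a sketch, not part of the patch):

    yum-builddep ceph.spec    # install the BuildRequires listed in the spec
    rpmbuild -ba ceph.spec    # rebuild to confirm the dependencies resolve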
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Small fix for ceph.spec

2013-07-30 Thread Danny Al-Gaaf
Hi,

then the Fedora package is broken. If you check the spec file of:

http://dl.fedoraproject.org/pub/fedora/linux/updates/19/SRPMS/leveldb-1.12.0-3.fc19.src.rpm


You can see the spec-file sets a:

BuildRequires:  snappy-devel

But not the corresponding "Requires: snappy-devel" for the devel package.
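
A quick way to confirm this on an installed system (just an illustration;
once the package is fixed, the query should list snappy-devel):

    rpm -q --requires leveldb-devel | grep snappy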

You should report this issue to your distribution; it needs to be fixed
there instead of adding a workaround to the ceph spec.

Regards,

Danny

Am 30.07.2013 09:42, schrieb Erik Logtenberg:
> Hi,
> 
> Fedora, in this case Fedora 19, x86_64.
> 
> Kind regards,
> 
> Erik.
> 
> 
> On 07/30/2013 09:29 AM, Danny Al-Gaaf wrote:
>> Hi,
>>
>> I think this is a bug in the packaging of the leveldb package in this case,
>> since the ceph spec file already sets a dependency on leveldb-devel.
>>
>> leveldb depends on snappy; therefore the leveldb package should set a
>> dependency on snappy-devel for leveldb-devel (check the SUSE spec file
>> for leveldb:
>> https://build.opensuse.org/package/view_file/home:dalgaaf:ceph:extra/leveldb/leveldb.spec?expand=1).
>> This way the RPM build process will pick up the correct packages needed
>> to build ceph.
>>
>> Which distro do you use?
>>
>> Danny
>>
>> Am 30.07.2013 01:33, schrieb Patrick McGarry:
>>> -- Forwarded message --
>>> From: Erik Logtenberg 
>>> Date: Mon, Jul 29, 2013 at 7:07 PM
>>> Subject: [ceph-users] Small fix for ceph.spec
>>> To: ceph-users@lists.ceph.com
>>>
>>>
>>> Hi,
>>>
>>> The spec file used for building RPMs is missing a build-time dependency on
>>> snappy-devel. Please see the attached patch to fix it.
>>>
>>> Kind regards,
>>>
>>> Erik.
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Jeff Moskow
This is the same issue as yesterday, but I'm still searching for a
solution.  We have a lot of data on the cluster that we need and can't
reasonably get to (it took over 12 hours to export a 2 GB image).


The only thing that status reports as wrong is:

   health HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck unclean


FYI - this happened after we added a fifth node and two mons (total now 
5) to our cluster.


Thanks for any help!

--

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Jens Kristian Søgaard

Hi,

> This is the same issue as yesterday, but I'm still searching for a
> solution.  We have a lot of data on the cluster that we need and can't
>    health HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs


I'm not claiming to have an answer, but I have a suggestion you can try.

Try running "ceph pg dump" to list all the pgs. Grep for ones that are 
inactive / incomplete. Note which osds they are on - it is listed in the 
square brackets with the primary being the first in the list.


Now try restarting the primary osd for the stuck pg and see if that 
could possibly shift things into place.
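
As a rough sketch of those steps (the osd id and the restart command are only
examples; adjust them to your own hosts and init system):

    ceph pg dump | grep -E 'incomplete|inactive'   # stuck pgs; acting set in brackets, primary first
    service ceph restart osd.11                    # run on the host that holds the primary osd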


--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://www.mermaidconsulting.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Jeff Moskow
Thanks!  I tried restarting osd.11 (the primary osd for the incomplete pg) and
that helped a LOT.   We went from 0/1 op/s to 10-800+ op/s!

We still have "HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck 
unclean", but at least we can
use our cluster :-)

ceph pg dump_stuck inactive
ok
pg_stat objects mip degr unf bytes log disklog state state_stamp v reported up acting last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
2.1f6 118 0 0 0 403118080 0 0 incomplete 2013-07-30 06:08:18.883179 11127'11658123 12914'1506 [11,9] [11,9] 10321'11641837 2013-07-28 00:59:09.552640 10321'11641837

Thanks again!
Jeff


On Tue, Jul 30, 2013 at 11:44:58AM +0200, Jens Kristian Søgaard wrote:
> Hi,
>
>> This is the same issue as yesterday, but I'm still searching for a  
>> solution.  We have a lot of data on the cluster that we need and can't  
>>health HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs 
>
> I'm not claiming to have an answer, but I have a suggestion you can try.
>
> Try running "ceph pg dump" to list all the pgs. Grep for ones that are  
> inactive / incomplete. Note which osds they are on - it is listed in the  
> square brackets with the primary being the first in the list.
>
> Now try restarting the primary osd for the stuck pg and see if that  
> could possibly shift things into place.
>
> -- 
> Jens Kristian Søgaard, Mermaid Consulting ApS,
> j...@mermaidconsulting.dk,
> http://www.mermaidconsulting.com/

-- 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Jeff Moskow
OK - so while things are definitely better, we still are not where we 
were and "rbd ls -l" still hangs.


Any suggestions?

--

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] FW: Issues with ceph-deploy

2013-07-30 Thread John Wilkins
Matthew,

I think one of the central differences is that mkcephfs read the
ceph.conf file and generated the OSDs from it. It also generated the
fsid and placed it into the cluster map, but didn't modify the
ceph.conf file itself.

By contrast, "ceph-deploy new" generates the fsid, records the initial
monitor host(s) and their address(es), turns authentication on, sets a
journal size, and assumes you'll be using omap for xattrs (i.e.,
typically used with ext4), placing these settings into an initial
ceph.conf file. "ceph-deploy mon create" uses this file when creating
the monitors (i.e., bootstrapping requires creating the monitor keys).

http://ceph.com/docs/master/rados/configuration/mon-config-ref/#bootstrapping-monitors
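
As an illustration, the initial ceph.conf that "ceph-deploy new" writes looks
roughly like this (values are placeholders and the exact option names vary a
bit between releases):

    [global]
    fsid = <generated uuid>
    mon initial members = node1
    mon host = 192.0.2.10
    auth supported = cephx
    osd journal size = 1024
    filestore xattr use omap = true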

You need at least one monitor for a ceph cluster. No monitor, no
cluster. You also need at least two OSDs for peering, heartbeats, etc.

With mkcephfs, the OSD map was generated from the ceph.conf file and
you had to specify the domain name of an OSD host in your ceph.conf
file. You'd simply mount drives under the default osd data path--one
disk for each OSD, and often one SSD disk or partition for each
journal for added performance. Just as mkcephfs did not put the fsid,
and mon initial members into ceph.conf, ceph-deploy doesn't put the
osd configuration into ceph.conf.  Personally, I'd rather it be there
for edification purposes. However, one reason to defer to maps is that
sometimes people don't keep config files updated across the cluster
(e.g., part of the rationale for ceph-deploy admin and ceph-deploy config
push | pull). For example, changing monitor IP addresses was an
issue that came up that wasn't particularly intuitive to end users,
because you can't just change the config file and have it update the
cluster map. Have a look here:
http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address

You might also want to look at
http://ceph.com/docs/master/architecture/#cluster-map to see each
component of a cluster map and its contents. That's really what's
authoritative for the daemons that get started. However,
if you add osd sections to your ceph.conf file and push them out to
the nodes, the new settings will get picked up. See
http://ceph.com/docs/master/rados/configuration/osd-config-ref/ for
OSD settings.  See
http://ceph.com/docs/master/rados/configuration/ceph-conf/ for a
general discussion of the ceph configuration file. You can also make
runtime changes as discussed in that document.
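
For example, a sketch of adding an [osd] section and pushing it out with
ceph-deploy (hostnames and the mount options are placeholders, not
recommendations):

    [osd]
    osd mount options xfs = rw,noatime,inode64

    # from the admin node, after editing ceph.conf:
    ceph-deploy config push node1 node2 node3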

As far as mount options are concerned, I believe they have to be
specified at create time. Can someone correct me if I'm wrong?





On Tue, Jul 30, 2013 at 8:46 AM, Alfredo Deza  wrote:
> There seems to be a bug in `ceph-deploy mon create {node}` where it doesn't
> create the keyrings at all.
>
> Another problem here is that it is *very* difficult to tell what is
> happening remotely as `ceph-deploy` doesn't really tell you even when you
> have verbose flags.
>
> I really need to have my pull request merged
> (https://github.com/ceph/ceph-deploy/pull/24) so I can start working on
> getting better output (and easier to debug).
>
> I wonder what version of ceph-deploy he is using, too
>
>
> On Mon, Jul 29, 2013 at 11:32 AM, Ian Colle  wrote:
>>
>> Any ideas?
>>
>> Ian R. Colle
>> Director of Engineering
>> Inktank
>> Cell: +1.303.601.7713 
>> Email: i...@inktank.com
>>
>>
>> Delivering the Future of Storage
>>
>> On 7/29/13 9:56 AM, "Matthew Richardson"  wrote:
>>
>> >I'm currently running test pools using mkcephfs, and am now
>> >investigating deploying using ceph-deploy.  I've hit a couple of
>> >conceptual changes which I can't find any documentation for, and was
>> >wondering if someone here could give me some answers as to how things
>> >now work.
>> >
>> >While ceph-deploy creates an initial ceph.conf, it doesn't update this
>> >when I do things like 'osd create' to add new osd sections.  However
>> >when I restart the ceph service, it picks up the new osds quite happily.
>> > How does it 'know' what osds it should be starting, and with what
>> >configuration?
>> >
>> >Since there are no sections corresponding to these new osds, how do I
>> >go about adding specific configuration for them - such as 'cluster addr'
>> >and then push this new config?  Or is there a way to pass in custom
>> >configuration to the 'osd create' subcommand at the point of osd
>> > creation?
>> >
>> >I have subsequently updated the [osd] section to set
>> >'osd_mount_options_xfs' and done a 'config push' - however the mount
>> >options don't seem to change when I restart the ceph service.  Any clues
>> >why this might be?
>> >
>> >Thanks,
>> >
>> >Matthew
>> >
>> >--
>> >The University of Edinburgh is a charitable body, registered in
>> >Scotland, with registration number SC005336.
>> >
>> >___

[ceph-users] inconsistent pg: no 'snapset' attr

2013-07-30 Thread John Nielsen
I am running a Ceph cluster with 24 OSDs across 3 nodes, Cuttlefish 0.61.3.
Recently an inconsistent PG cropped up:

# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 11.2d5 is active+clean+inconsistent, acting [5,22,9]
1 scrub errors

Pool 11 is .rgw.buckets, used by a RADOS gateway on a separate machine.

I tried to repair the pg but it was ineffective. In the OSD log, I see this:

2013-07-30 13:26:00.948312 7f83a9d32700  0 log [ERR] : scrub 11.2d5 
33c382d5/170509.178_fc845fcfbb504f0eb87c9061ebbaf477/head//11 no 'snapset' attr
2013-07-30 13:26:03.358112 7f83a9d32700  0 log [ERR] : 11.2d5 scrub 1 errors

What does it mean, how did it happen and how can I fix it?

Thanks,

JN

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Small fix for ceph.spec

2013-07-30 Thread Erik Logtenberg
Hi,

I will report the issue there as well. Please note that Ceph seems to
support Fedora 17, even though that release is considered end-of-life by
Fedora. This issue with the leveldb package cannot be fixed for Fedora
17, only for 18 and 19.
So if Ceph wants to continue supporting Fedora 17, adding this
workaround seems to be the only way to get this (rather minor) bug fixed.

Kind regards,

Erik.


On 07/30/2013 09:56 AM, Danny Al-Gaaf wrote:
> Hi,
> 
> then the Fedora package is broken. If you check the spec file of:
> 
> http://dl.fedoraproject.org/pub/fedora/linux/updates/19/SRPMS/leveldb-1.12.0-3.fc19.src.rpm
> 
> 
> You can see the spec-file sets a:
> 
> BuildRequires:  snappy-devel
> 
> But not the corresponding "Requires: snappy-devel" for the devel package.
> 
> You should report this issue to your distribution; it needs to be fixed
> there instead of adding a workaround to the ceph spec.
> 
> Regards,
> 
> Danny
> 
> Am 30.07.2013 09:42, schrieb Erik Logtenberg:
>> Hi,
>>
>> Fedora, in this case Fedora 19, x86_64.
>>
>> Kind regards,
>>
>> Erik.
>>
>>
>> On 07/30/2013 09:29 AM, Danny Al-Gaaf wrote:
>>> Hi,
>>>
>>> I think this is a bug in the packaging of the leveldb package in this case,
>>> since the ceph spec file already sets a dependency on leveldb-devel.
>>>
>>> leveldb depends on snappy; therefore the leveldb package should set a
>>> dependency on snappy-devel for leveldb-devel (check the SUSE spec file
>>> for leveldb:
>>> https://build.opensuse.org/package/view_file/home:dalgaaf:ceph:extra/leveldb/leveldb.spec?expand=1).
>>> This way the RPM build process will pick up the correct packages needed
>>> to build ceph.
>>>
>>> Which distro do you use?
>>>
>>> Danny
>>>
>>> Am 30.07.2013 01:33, schrieb Patrick McGarry:
>>>> -- Forwarded message --
>>>> From: Erik Logtenberg 
>>>> Date: Mon, Jul 29, 2013 at 7:07 PM
>>>> Subject: [ceph-users] Small fix for ceph.spec
>>>> To: ceph-users@lists.ceph.com
>>>>
>>>>
>>>> Hi,
>>>>
>>>> The spec file used for building RPMs is missing a build-time dependency on
>>>> snappy-devel. Please see the attached patch to fix it.
>>>>
>>>> Kind regards,
>>>>
>>>> Erik.
>>>>
>>>> ___
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd read write very slow for heavy I/O operations

2013-07-30 Thread johnu
Hi,
 I have an OpenStack cluster which runs on Ceph. I tried running
Hadoop inside VMs and noticed that map tasks take longer and longer to
complete and finally fail. RBD reads/writes are getting slower over
time. Is it because of too many objects in Ceph per volume?

I have an 8-node cluster with 24 * 1 TB disks per node.

master: mon
slave1: 1 osd per disk, i.e. 23
slave2: 1 osd per disk, i.e. 23
.
.
slave7: 1 osd per disk, i.e. 23

replication factor: 2
pg num in default pool: 128

In OpenStack, I have 14 instances. Fourteen 5 TB volumes are created and each
one is attached to an instance. I am using the default stripe settings.

rbd -p volumes info volume-1
  size 5000 GB in 128 objects
  order 22 (4096 kB objects)



1. I couldn't find documentation for the stripe settings that can be used for
volume creation in OpenStack (http://ceph.com/docs/master/rbd/rbd-openstack/).
Can they be exposed through any configuration files? Like the 64 MB default
block size in HDFS, how do we set the layout for the objects? Can we change it
after volume creation? Will this affect performance for heavy I/O applications
like MapReduce?

2. How can RBD caching improve the performance?

3. Like HDFS, which gives priority to localized writes, how can we implement
the same feature, given that RBD volumes are striped across the cluster? I am
not sure whether CRUSH rulesets can help in this situation.

Can someone give me debugging pointers and ideas related to this? I have not
used CephFS so far.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Gregory Farnum
You'll want to figure out why the cluster isn't healthy to begin with.
Is the incomplete/inactive PG staying constant? Track down which OSDs
it's on and make sure the acting set is the right size, or check whether
you've somehow lost data on it. I believe the docs have some content on doing
this but I don't have a link handy.
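
A rough way to do that tracking (the pg id 2.1f6 is taken from Jeff's earlier
dump output, purely as an example):

    ceph health detail     # lists the stuck pg ids
    ceph pg 2.1f6 query    # shows the acting set, peering state, and recovery info
    ceph osd tree          # confirms the osds in the acting set are up and in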

You might also try opening up "ceph -w" in one terminal, running "ceph
osd bench" in another, and then waiting for the results to come back
via the central log, to make sure your OSDs are comparable to each
other. It sort of sounds like you've added a bunch of bad disks to the
cluster which aren't performing and are dragging everything else down
with them.
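
A sketch of that comparison (the exact bench invocation has changed across
releases, so check "ceph --help" on your version first):

    ceph -w                   # terminal 1: watch the central log for bench results
    ceph tell osd.0 bench     # terminal 2: repeat per osd and compare throughput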
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Tue, Jul 30, 2013 at 9:44 AM, Jeff Moskow  wrote:
> OK - so while things are definitely better, we still are not where we were
> and "rbd ls -l" still hangs.
>
> Any suggestions?
>
> --
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com