[ceph-users] monitor can not rejoin the cluster

2014-01-30 Thread Daniel Schwager
Hi all, my monitor3 is not able to rejoin the cluster (containing mon1, mon2 and mon3, running stable emperor). I tried to recreate and inject a new monmap into all 3 mons, but only mon1 and mon2 are up and joined. Now, with debugging enabled on mon3, I get the following: 2014-01-30 08:51:03.823669 7f3

Re: [ceph-users] feature set mismatch

2014-01-30 Thread Markus Goldberg
Hello Ilya, thank you for your help. I first tried 'ceph osd crush tunables legacy' on each server, followed by a reboot. That didn't help. I thought that running this command would disable the feature, but the error remains. In the past I only did rolling ceph-upgrades with 'apt-get up

Re: [ceph-users] monitor can not rejoin the cluster

2014-01-30 Thread Daniel Schwager
OK - found the problem: > mon_status > { "name": "ceph-mon3", .. > "mons": [ > { "rank": 2, > "name": "mon.ceph-mon3", NAME is wrong > "addr": "192.168.135.33:6789\/0"}]}} > In the docu http://ceph.com/docs/master/man/8/monmaptool/ the creation of t

[ceph-users] poor data distribution

2014-01-30 Thread Dominik Mostowiec
Hi, I have a problem with data distribution. The smallest disk usage is 40%, the highest 82%. Total PGs: 6504. Almost all data is in the '.rgw.buckets' pool, with pg_num 4800. Is increasing pg_num in this pool the best way to improve data distribution? Is there another way? (e.g. crush tunables, or something like that ..

[ceph-users] Ceph Community Berlin founded

2014-01-30 Thread Robert Sander
Hi, The inaugural meeting of the Ceph Berlin community took place on Monday, January 27th. A total of 14 "Cephalopods" found their way to the Heinlein Offices. That was a great response to a group that was formed just under three weeks ago. And we even have around 30 members right now. In the org

[ceph-users] Synnefo + Ceph @ FOSDEM'14

2014-01-30 Thread Constantinos Venetsanopoulos
Hello everybody, in case you haven't noticed already, this weekend at FOSDEM'14, we will be presenting the Synnefo stack and its integration with Google Ganeti, Archipelago and RADOS to provide advanced, unified cloud storage with unique features. You can find the official announcement here: ht

[ceph-users] clock skew

2014-01-30 Thread Gandalf Corvotempesta
Hi. I'm using ntpd on each ceph server and it is syncing properly, but every time I reboot, ceph starts in degraded mode with a "clock skew" warning. The only way I have found to solve this is to manually restart ceph on each node (without resyncing the clock). Any suggestions?

[ceph-users] EINVAL: (22) Invalid argument when starting osds

2014-01-30 Thread Ingo Ebel
Hi, yesterday I tried to fix my "pgs stuck unclean" problem by using tunables and editing my crushmap. Today I can't start my osds anymore. /etc/init.d/ceph restart osd.0 === osd.0 === === osd.0 === Stopping Ceph osd.0 on rokix...done === osd.0 === Error EINVAL: (22) Invalid argument failed: 'tim

[ceph-users] udev names /dev/sd* - what happens if they change ?

2014-01-30 Thread Daniel Schwager
Hi, just a small question: When creating a new OSD I use e.g. ceph-deploy osd create ceph-node1:sdg:/dev/sdb5 Question: What happens if the mapping of my disks changes (e.g. because new disks are added to the server): sdg becomes sdh, sdb becomes sdc. Is this handled (how?) by ce

[ceph-users] During copy new rbd image is totally thick

2014-01-30 Thread Igor Laskovy
Hello list, Is it correct behavior for an rbd image to become fully thick during a copy? igor@hv03:~$ rbd create rbd/test -s 1024 igor@hv03:~$ rbd diff rbd/test | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }' 0 MB igor@hv03:~$ rbd copy rbd/test rbd/cloneoftest Image copy: 100% complete...done. igor@hv03:~$

[ceph-users] s3 downloaded file verification

2014-01-30 Thread Dominik Mostowiec
Hi, I'm looking for a way to verify a file downloaded from s3 when the ETag is a multipart one (with '-') and the part size is unknown. When the part size is known, it can be done e.g. with this script: https://github.com/Teachnova/s3md5/blob/master/s3md5 In the AWS docs I found that there is only lower
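
The check described above can be sketched in Python. This is a minimal sketch, not the linked s3md5 script: it assumes the commonly observed multipart ETag layout (MD5 of the concatenated binary MD5 digests of all parts, followed by '-' and the part count), which is not confirmed in this thread. The function names `multipart_etag` and `find_part_size` and the list of candidate part sizes are hypothetical.

```python
import hashlib
import math


def multipart_etag(data: bytes, part_size: int) -> str:
    """Compute a multipart-style ETag, assuming the common layout:
    md5(md5(part1) + md5(part2) + ...) in hex, then '-<number of parts>'."""
    part_digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


def find_part_size(data: bytes, etag: str, candidate_sizes):
    """Try each candidate part size (hypothetical list supplied by the caller)
    and return the first one that reproduces the given ETag, or None."""
    expected_parts = int(etag.rsplit("-", 1)[1])
    for size in candidate_sizes:
        # Cheap filter: skip sizes that cannot yield the advertised part count.
        if math.ceil(len(data) / size) != expected_parts:
            continue
        if multipart_etag(data, size) == etag:
            return size
    return None
```

When the part size is unknown, the part count after the '-' narrows the search, so only a handful of candidate sizes normally need a full hash pass. For real files the data would be read in chunks rather than held in memory.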

Re: [ceph-users] EINVAL: (22) Invalid argument when starting osds

2014-01-30 Thread Christian Kauhaus
On 30.01.2014 13:03, Ingo Ebel wrote: > /etc/init.d/ceph restart osd.0 > [...] > Error EINVAL: (22) Invalid argument > failed: 'timeout 10 /usr/bin/ceph --name=osd.0 > --keyring=/var/lib/ceph/osd/ceph-0/keyring osd crush > create-or-move

Re: [ceph-users] poor data distribution

2014-01-30 Thread Dominik Mostowiec
Hi, I found something else. 'ceph pg dump' shows PGs: - with zero or near zero objects count - with ~6,5k objects, size ~1,4G - with ~13k objects, size ~2,8G Could this be a reason for the uneven data distribution across the OSDs? --- Regards Dominik 2014-01-30 Dominik Mostowiec : > Hi, > I have problem with

Re: [ceph-users] clock skew

2014-01-30 Thread Emmanuel Lacour
On Thu, Jan 30, 2014 at 12:53:22PM +0100, Gandalf Corvotempesta wrote: > Hi. > I'm using ntpd on each ceph server and is syncing properly but every > time that I reboot, ceph starts in degraded mode with "clock skew" > warning. > > The only way that I have to solve this is manually restart ceph on

Re: [ceph-users] poor data distribution

2014-01-30 Thread Dominik Mostowiec
Hi, I found something else that I think can help. The PG distribution does not seem to be OK. Graph: http://dysk.onet.pl/link/AVzTe Total PGs range from 70 to 140 per OSD, primary PGs from 15 to 58 per OSD. Is there some way to fix it? -- Regards Dominik 2014-01-30 Dominik Mostowiec : > Hi, > I found something else.

Re: [ceph-users] clock skew

2014-01-30 Thread Markus Goldberg
You can run 'ntpdate -b '; read the ntpdate manual for the parameters. Markus On 30.01.2014 16:05, Emmanuel Lacour wrote: On Thu, Jan 30, 2014 at 12:53:22PM +0100, Gandalf Corvotempesta wrote: Hi. I'm using ntpd on each ceph server and is syncing properly but every time that I reboot, ceph starts

Re: [ceph-users] s3 downloaded file verification

2014-01-30 Thread Yehuda Sadeh
On Thu, Jan 30, 2014 at 5:58 AM, Dominik Mostowiec wrote: > Hi, > I'm looking for solution how to verify file downloaded from s3 where > ETag is multiparted ( with '-' ) and don't know how is part size. > > When part size is known, it is possible eg do it with scrip: > https://github.com/Teachnova

Re: [ceph-users] EINVAL: (22) Invalid argument when starting osds

2014-01-30 Thread Ingo Ebel
On 30.01.14 15:09, Christian Kauhaus wrote: > On 30.01.2014 13:03, Ingo Ebel wrote: >> /etc/init.d/ceph restart osd.0 >> [...] >> Error EINVAL: (22) Invalid argument >> failed: 'timeout 10 /usr/bin/ceph --name=osd.0 >> --keyring=/var/lib/ceph/osd/ceph-0/keyring

Re: [ceph-users] clock skew

2014-01-30 Thread Gandalf Corvotempesta
2014-01-30 Emmanuel Lacour : > here, I just wait until the skew is finished, without touching ceph. It > doesn't seems to do anything bad ... I've waited more than 1 hour with no success.

[ceph-users] Questions about monitors load balancing

2014-01-30 Thread Sławek Kapłoński
Hello, I've just started using ceph and I'm trying to understand how connections between monitors, clients and OSDs work. I have a test infrastructure with two monitors and 3 OSDs. I created a pool which is mapped to a host (client). Now I'm not quite sure how exactly the connections work when for examp

[ceph-users] Deleting an image after a crash

2014-01-30 Thread Jean-Tiare LE BIGOT
During one of my tests, my test client crashed badly with a couple of mapped images. When it came back online, I was not able to rm the formerly mapped images because watchers were still registered. Host A > rbd create test-image --size 10240 Host A > rbd map test-image # # after some time > 3

Re: [ceph-users] feature set mismatch

2014-01-30 Thread Sage Weil
Hi Markus, You need to remove the erasure rule from the CRUSH map that was erroneously put in the default map for this version. The command you need is ceph osd crush rule rm you can list rules with ceph osd crush rule ls and the one you want to remove is named 'erasure-default' or some

Re: [ceph-users] poor data distribution

2014-01-30 Thread Sage Weil
On Thu, 30 Jan 2014, Dominik Mostowiec wrote: > Hi, > I found something else. > 'ceph pg dump' shows PGs: > - with zero or near zero objects count These are probably for a different pool than the big ones, right? The PG id is basically $pool.$shard. > - with ~6,5k objects, size ~1,4G > - with

[ceph-users] Radosgw link/unlink problem

2014-01-30 Thread Mikhail Krotyuk
Hi, I have successfully upgraded the production cluster to: ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69bd1) 1. After this upgrade I see in radosgw.log: WARNING: couldn't find acl header for bucket, generating default I found this http://tracker.ceph.com/issues/5691 and checked the revi

Re: [ceph-users] poor data distribution

2014-01-30 Thread Dominik Mostowiec
Hi, Thanks for your response. > - with ~6,5k objects, size ~1,4G > - with ~13k objects, size ~2,8G These are in one of the biggest pools, 5 '.rgw.buckets' > This is because pg_num is not a power of 2 Does this apply to all PGs (sum over all pools) or only to pool 5 '.rgw.buckets', where I have almost all the data? > Did you
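
The "pg_num is not a power of 2" remark quoted above can be illustrated with a small Python sketch. It assumes that placement follows Ceph's `ceph_stable_mod` helper, which folds a hash value into `pg_num` slots using the next power-of-two mask. The counting function `hash_slots_per_pg` is hypothetical, and the sketch counts hash slots only, not real objects.

```python
from collections import Counter


def stable_mod(x: int, pg_num: int, mask: int) -> int:
    """Python rendering of Ceph's ceph_stable_mod (assumed placement rule):
    use the full mask if the result is a valid PG, else the half-size mask."""
    if (x & mask) < pg_num:
        return x & mask
    return x & (mask >> 1)


def hash_slots_per_pg(pg_num: int) -> Counter:
    """Count how many of the power-of-two hash slots fold into each PG."""
    slots = 1
    while slots < pg_num:
        slots *= 2  # next power of two >= pg_num
    mask = slots - 1
    return Counter(stable_mod(x, pg_num, mask) for x in range(slots))


if __name__ == "__main__":
    per_pg = hash_slots_per_pg(4800)  # pg_num of '.rgw.buckets' in this thread
    sizes = Counter(per_pg.values())
    print(sizes)  # number of PGs receiving 1 slot vs 2 slots
```

Under this assumption, a pool with pg_num 4800 ends up with two classes of PGs, one covering twice the hash range of the other, which fits the reported ~6,5k versus ~13k objects per PG.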

Re: [ceph-users] poor data distribution

2014-01-30 Thread Sage Weil
On Thu, 30 Jan 2014, Dominik Mostowiec wrote: > Hi, > Thaks for Your response. > > > - with ~6,5k objects, size ~1,4G > > - with ~13k objects, size ~2,8G > is on one the biggest pool 5 '.rgw.buckets' > > > This is because pg_num is not a power of 2 > This is for all PGs (sum of all pools) or for

[ceph-users] Starting radosgw - RHEL6.4

2014-01-30 Thread alistair.whittle
Hi, I am busy configuring the object gateway and have reached the point where I need to start the radosgw. The documentation calls for the following command: /etc/init.d/radosgw start On my deployment, radosgw seems to be located in /usr/bin and trying to start it from there results in the f

Re: [ceph-users] EINVAL: (22) Invalid argument when starting osds

2014-01-30 Thread Ingo Ebel
On 30.01.14 16:52, Ingo Ebel wrote: > On 30.01.14 15:09, Christian Kauhaus wrote: >> On 30.01.2014 13:03, Ingo Ebel wrote: >>> /etc/init.d/ceph restart osd.0 >>> [...] >>> Error EINVAL: (22) Invalid argument >>> failed: 'timeout 10 /usr/bin/ceph --name=osd.0 >>>

Re: [ceph-users] clock skew

2014-01-30 Thread Eric Eastman
I have this problem on some of my Ceph clusters, and I think it is because the older hardware that I am using does not have the best clocks. To fix the problem, I set up one server in my lab to be my local NTP time server, and then on each of my Ceph monitors, in the /etc/ntp.conf file, I put in

Re: [ceph-users] Synnefo + Ceph @ FOSDEM'14

2014-01-30 Thread Loic Dachary
Hi Constantinos, Count me in https://fosdem.org/2014/schedule/event/virtiaas02/ :-) If you're in Brussels tomorrow (Friday), you're welcome to join the Ceph meetup http://www.meetup.com/Ceph-Brussels/ ! Cheers On 30/01/2014 12:51, Constantinos Venetsanopoulos wrote: > Hello everybody, > > in

Re: [ceph-users] Radosgw link/unlink problem

2014-01-30 Thread mykr0t
Correction: it was not python boto, it was https://github.com/dyarnell/rgwadmin but it doesn't matter -- Regards, Mikhail On Thu, 30 Jan 2014 19:54:16 +0300 Mikhail Krotyuk wrote: > Hi, > I have successfully upgraded production cluster to: > ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69

Re: [ceph-users] Radosgw link/unlink problem

2014-01-30 Thread Derek Yarnell
Hi Mikhail, Sorry, did you find an issue with the library I wrote? Let me test the bucket linking as I didn't have a good unit test for it yet. Thanks, derek On 1/30/14, 12:45 PM, myk...@gmail.com wrote: > fix: it was not python boto, it was https://github.com/dyarnell/rgwadmin > but it doesn`t

Re: [ceph-users] Radosgw link/unlink problem

2014-01-30 Thread mykr0t
No, it's not in the library; the problem is with radosgw -- Regards, Mikhail On Thu, 30 Jan 2014 13:17:41 -0500 Derek Yarnell wrote: > Hi Mikhail, > > Sorry did you find an issue with the library I wrote? Let me test the > bucket linking as I didn't have a good unit test for it yet. > > Th

[ceph-users] stuck unclean/stuck inactive

2014-01-30 Thread Derek Yarnell
Hi, So I am trying to remove OSDs from one of our 6 ceph OSDs; this is a brand new cluster and there is no data on it yet. I was following the manual procedure[1] with the following script. I removed OSDs 0-3 but I am seeing ceph not fully recovering. #!/bin/bash ceph osd out ${1} /etc/init.d/ceph st

Re: [ceph-users] Starting radosgw - RHEL6.4

2014-01-30 Thread Derek Yarnell
On 1/30/14, 12:16 PM, alistair.whit...@barclays.com wrote: > radosgw: must specify 'rgw socket path' or 'rgw port' to run as a daemon > This is defined in your /etc/ceph/ceph.conf. For example, if the host on which you are running radosgw is named 'myhostname', then your ceph.conf should have a secti

Re: [ceph-users] Gentoo & ceph 0.67 & pg stuck After fresh Installation

2014-01-30 Thread Aaron Ten Clay
Philipp, I have had issues with clock sync on machines before that I could usually alleviate by tweaking the kernel config. Changing CONFIG_HZ to 300 instead of 1000 can help. If you ever reboot the machines, making sure your init system writes the current software clock to the hardware clock on s

Re: [ceph-users] poor data distribution

2014-01-30 Thread Dominik Mostowiec
Hi, For this cluster 198x100/3 = 6600. If I bump up pool .rgw.buckets to 8192 (now it is 4800), the total will be 9896. Is that not too much? Maybe a better way is to destroy e.g. the '.log' pool and recreate it with a lower PG count (is that safe?)? -- Regards Dominik 2014-01-30 Sage Weil : > On Thu, 30 Jan 2014, Domi
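
The arithmetic in this message can be reproduced with a short Python sketch. It assumes the usual rule of thumb of roughly 100 PGs per OSD divided by the replica count, and takes the cluster-wide total of 6504 PGs from the first message of this thread. The function names are hypothetical.

```python
def target_pg_count(num_osds: int, replicas: int, pgs_per_osd: int = 100) -> int:
    """Rule-of-thumb total PG count: OSDs * PGs-per-OSD / replicas."""
    return num_osds * pgs_per_osd // replicas


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    power = 1
    while power < n:
        power *= 2
    return power


def total_after_resize(total_pgs: int, old_pg_num: int, new_pg_num: int) -> int:
    """Cluster-wide PG total after changing one pool's pg_num."""
    return total_pgs - old_pg_num + new_pg_num


if __name__ == "__main__":
    target = target_pg_count(198, 3)                   # 198 OSDs, 3 replicas
    rounded = next_power_of_two(target)                # candidate pg_num
    new_total = total_after_resize(6504, 4800, rounded)
    print(target, rounded, new_total)
```

With these inputs the sketch yields the same figures as the message: a target of 6600, a power-of-two pg_num of 8192, and a new cluster-wide total of 9896 PGs.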