[ceph-users] How to fix an incomplete PG on a 2-copy ceph-cluster?

2014-02-16 Thread Udo Lembke
Hi, I switched some disks from manual format to ceph-deploy (because of slightly different xfs-parameters) - all disks are on a single node of a 4-node cluster. After rebuilding the osd-disk one PG is incomplete: ceph -s cluster 591db070-15c1-4c7a-b107-67717bdb87d9 health HEALTH_WARN 1 pgs in
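
For illustration, a minimal sketch of the commands typically used to locate and inspect a stuck PG after an OSD rebuild (the output is cluster-specific):

  ceph health detail            # lists the incomplete/stuck PGs with their ids
  ceph pg dump_stuck inactive   # shows which OSDs each stuck PG maps to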

Re: [ceph-users] slow requests from rados bench with small writes

2014-02-16 Thread Dan van der Ster
After some further digging I realized that updatedb was running over the pgs, indexing all the objects. (According to iostat, updatedb was keeping the indexed disk 100% busy!) Oops! Since the disks are using the deadline elevator (which by default prioritizes reads over writes, and gives writes a d
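
As a quick check, something like the following can confirm that updatedb is the culprit and which elevator the OSD disk is using (the device name sdb is a placeholder):

  pgrep -l updatedb                    # is updatedb/mlocate currently running?
  iostat -x -d sdb 1                   # per-device utilisation; watch %util
  cat /sys/block/sdb/queue/scheduler   # active elevator is shown in brackets, e.g. [deadline]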

Re: [ceph-users] change order of an rbd image ?

2014-02-16 Thread Wido den Hollander
On 02/16/2014 12:45 AM, Daniel Schwager wrote: Hi, I created a 1TB rbd-image formatted with vmfs (vmware) for an ESX server - but with a wrong order (25 instead of 22 ...). The rbd man page tells me for export/import/cp, rbd will use the order of the source image. Is there a way to change the
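
One commonly suggested workaround is to copy the data into a new image created with the desired order. A rough sketch, assuming hypothetical image names and that your rbd version accepts --order on import:

  rbd export rbd/esx-datastore /tmp/esx-datastore.raw
  rbd import --order 22 /tmp/esx-datastore.raw rbd/esx-datastore-new
  rbd info rbd/esx-datastore-new    # verify the new order before retiring the old image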

[ceph-users] Sudden RADOS Gateway issues caused by missing xattrs

2014-02-16 Thread Wido den Hollander
Hi, Yesterday I got a notification that a RGW setup was having issues with objects suddenly giving errors (403 and 404) when trying to access them. I started digging and after cranking up the logs with 'debug rados' and 'debug rgw' set to 20 I found what caused RGW to throw an error: librado
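
To see which xattrs an affected RGW object still carries at the RADOS level, something along these lines can be used (the pool and object names below are placeholders):

  rados -p .rgw.buckets stat default.4567.1_myobject
  rados -p .rgw.buckets listxattr default.4567.1_myobject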

Re: [ceph-users] slow requests from rados bench with small writes

2014-02-16 Thread Sage Weil
Good catch! It sounds like what is needed here is for the deb and rpm packages to add /var/lib/ceph to the PRUNEPATHS in /etc/updatedb.conf. Unfortunately there isn't a /etc/updatedb.conf.d type file, so that promises to be annoying. Has anyone done this before? sage On Sun, 16 Feb 2014, D
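
On most distributions this amounts to appending /var/lib/ceph to the PRUNEPATHS line in /etc/updatedb.conf, for example (the other paths shown are typical defaults, not ceph-specific):

  PRUNEPATHS="/tmp /var/spool /media /var/lib/ceph"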

Re: [ceph-users] slow requests from rados bench with small writes

2014-02-16 Thread Dietmar Maurer
Some projects manually modify PRUNEPATHS in the init script, for example: http://git.openvz.org/?p=vzctl;a=commitdiff;h=47334979b9b5340f84d84639b2d77a8a1f0bb7cf > It sounds like what is needed here is for the deb and rpm packages to add > /var/lib/ceph to the PRUNEPATHS in /etc/updatedb.conf. Un
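
A packaging script could follow the same pattern as the vzctl commit above; a hedged sketch of an idempotent snippet:

  # append /var/lib/ceph to PRUNEPATHS only if it is not already there
  grep -q '/var/lib/ceph' /etc/updatedb.conf || \
      sed -i 's|^\(PRUNEPATHS="[^"]*\)"|\1 /var/lib/ceph"|' /etc/updatedb.conf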

Re: [ceph-users] How to fix an incomplete PG on a 2-copy ceph-cluster?

2014-02-16 Thread Gregory Farnum
Check out http://ceph.com/docs/master/rados/operations/placement-groups/#get-statistics-for-stuck-pgs and http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/. What does the dump of the PG say is going on? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Sun
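
For example, with a hypothetical PG id of 2.1f taken from the dump_stuck output:

  ceph pg dump_stuck inactive
  ceph pg 2.1f query    # the recovery_state section explains why the PG is incomplete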

Re: [ceph-users] Sudden RADOS Gateway issues caused by missing xattrs

2014-02-16 Thread Gregory Farnum
Did you maybe upgrade that box to v0.67.6? This sounds like one of the bugs Sage mentioned in it. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Sun, Feb 16, 2014 at 4:23 AM, Wido den Hollander wrote: > Hi, > > Yesterday I got a notification that a RGW setup was having iss

[ceph-users] osd down

2014-02-16 Thread Pavel V. Kaygorodov
Hi, All! I am trying to set up ceph from scratch, without a dedicated drive, with one mon and one osd. After all that, I see the following output of ceph osd tree: # id weight type name up/down reweight -1 1 root default -2 1 host host1 0 1
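
When an OSD shows as down right after deployment, the usual first steps are to check whether the daemon is actually running and what its log says; a sketch using 2014-era sysvinit service names (adjust to your init system):

  ceph osd dump | grep osd.0
  service ceph status osd.0
  service ceph start osd.0
  tail -n 50 /var/log/ceph/ceph-osd.0.log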

[ceph-users] ReAsk: how to tell ceph-mon to listen on a specific address only

2014-02-16 Thread Ron Gage
Hi everyone: I am still trying unsuccessfully to implement a test array for a POC. It is still failing to set up - specifically, the admin keyring is not getting set up. Setup is 4 x OSD, 1 x Mon/Mgr. The Mon machine is the only one that is multi-homed - eth0 on a private subnet for intern
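
For reference, binding the monitor to a single interface is normally done in ceph.conf before the cluster is created; the subnet and address below are placeholders:

  [global]
  public network = 10.0.0.0/24

  [mon.a]
  host = monhost
  mon addr = 10.0.0.10:6789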

Re: [ceph-users] Sudden RADOS Gateway issues caused by missing xattrs

2014-02-16 Thread Wido den Hollander
On 02/16/2014 06:49 PM, Gregory Farnum wrote: Did you maybe upgrade that box to v0.67.6? This sounds like one of the bugs Sage mentioned in it. No, I checked it again. Version is: ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69bd1) All machines in the cluster are on that version.

Re: [ceph-users] How to fix an incomplete PG on a 2-copy ceph-cluster?

2014-02-16 Thread Udo Lembke
Hi Greg, This is the output of the two commands (at 8:10 I stopped and started osd.42): root@ceph-04:~# ceph pg dump_stuck inactive ok pg_stat objects mip degr unf bytes log disklog state state_stamp v reported up acting last_scrub scrub_stamp

[ceph-users] Important note for sender / Важное сообщение для отправителя (was: ceph-users Digest, Vol 13, Issue 16)

2014-02-16 Thread kudryavtsev_ia
Dear sender, If you wish me to read and respond to this e-mail for sure, please build the subject like KUDRYAVTSEV/Who wrote/Subject. For example, KUDRYAVTSEV/Bitworks/Some subject there... Best wishes, Ivan Kudryavtsev

Re: [ceph-users] Sudden RADOS Gateway issues caused by missing xattrs

2014-02-16 Thread Sage Weil
Hi Wido, On Sun, 16 Feb 2014, Wido den Hollander wrote: > On 02/16/2014 06:49 PM, Gregory Farnum wrote: > > Did you maybe upgrade that box to v0.67.6? This sounds like one of the > > bugs Sage mentioned in it. > > No, I checked it again. Version is: ceph version 0.67.5 > (a60ac9194718083a4b6a225f

Re: [ceph-users] osd down

2014-02-16 Thread Karan Singh
Hi Pavel, Try to add at least 1 more OSD (the bare minimum) and set the pool replication to 2 after that. For osd.0 try # ceph osd in osd.0; once the osd is IN, try to bring the osd.0 services up. Finally, both of your OSDs should be IN and UP so that your cluster can store data. Regard
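
A sketch of those steps, assuming the default 'rbd' pool and sysvinit service names:

  ceph osd pool set rbd size 2
  ceph osd in osd.0
  service ceph start osd.0
  ceph osd tree    # both OSDs should now show up/in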

Re: [ceph-users] osd down

2014-02-16 Thread Pavel V. Kaygorodov
Hi! I have tried, but the situation has not changed significantly: # ceph -w cluster e90dfd37-98d1-45bb-a847-8590a5ed8e71 health HEALTH_WARN 192 pgs stuck inactive; 192 pgs stuck unclean; 2/2 in osds are down monmap e1: 1 mons at {host1=172.17.0.4:6789/0}, election epoch 1, quorum 0 host1

Re: [ceph-users] osd down

2014-02-16 Thread Jean-Charles LOPEZ
Hi Pavel, It looks like you have deployed your 2 OSDs on the same host. By default, in the CRUSH map, each object is going to be assigned to 2 OSDs that are on different hosts. If you want this to work for testing, you’ll have to adapt your CRUSH map so that each copy is dispatched on a bucket of
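
One way to do that is to edit the CRUSH rule so that replicas are chosen across OSDs instead of hosts; a sketch of the usual round trip, assuming the default replicated rule:

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # in crushmap.txt, change the rule step
  #   step chooseleaf firstn 0 type host
  # to
  #   step chooseleaf firstn 0 type osd
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new

Alternatively, setting "osd crush chooseleaf type = 0" in ceph.conf before the cluster is created has the same effect for a single-host test setup.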