[ceph-users] Remove RBD Image

2015-07-29 Thread Christian Eichelmann
Hi all, I am trying to remove several rbd images from the cluster. Unfortunately, that doesn't work: $ rbd info foo rbd image 'foo': size 1024 GB in 262144 objects order 22 (4096 kB objects) block_name_prefix: rb.0.919443.238e1f29 format: 1 $ rbd rm foo 2015-07-2

[ceph-users] small cluster reboot fail

2015-07-29 Thread pixelfairy
I have a small test cluster (VMware Fusion, 3 mon+osd nodes), all running Ubuntu Trusty. I tried rebooting all 3 nodes and this happened. root@ubuntu:~# ceph --version ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3) root@ubuntu:~# ceph health 2015-07-29 02:08:31.360516 7f5bd711a700 -1 asok(0

Re: [ceph-users] Remove RBD Image

2015-07-29 Thread Ilya Dryomov
On Wed, Jul 29, 2015 at 11:30 AM, Christian Eichelmann wrote: > Hi all, > > I am trying to remove several rbd images from the cluster. > Unfortunately, that doesn't work: > > $ rbd info foo > rbd image 'foo': > size 1024 GB in 262144 objects > order 22 (4096 kB objects) > b
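
For readers finding this thread later: the usual way to see what is blocking an 'rbd rm' is to list the watchers on the image header object. A minimal sketch, assuming the image 'foo' from this thread lives in the default 'rbd' pool (for a format 1 image the header object is named '<image>.rbd'; these are not necessarily the exact commands from the truncated reply):

$ rados -p rbd listwatchers foo.rbd    # any watcher listed means a client still has the image open or mapped
$ rbd rm foo                           # retry once the watcher (an unmapped or unmounted client) has gone away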

Re: [ceph-users] Remove RBD Image

2015-07-29 Thread Christian Eichelmann
Hi Ilya, that worked for me and actually pointed out that one of my colleagues currently had the rbd pool locally mounted via rbd-fuse, which obviously locks all images in this pool. Problem solved! Thanks! Regards, Christian On 29.07.2015 at 11:48, Ilya Dryomov wrote: > On Wed, Jul 29, 2015 at

Re: [ceph-users] OSD RAM usage values

2015-07-29 Thread Kenneth Waegeman
On 07/28/2015 04:04 PM, Dan van der Ster wrote: On Tue, Jul 28, 2015 at 12:07 PM, Gregory Farnum wrote: On Tue, Jul 28, 2015 at 11:00 AM, Kenneth Waegeman wrote: On 07/17/2015 02:50 PM, Gregory Farnum wrote: On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman wrote: Hi all, I've read

Re: [ceph-users] OSD RAM usage values

2015-07-29 Thread Kenneth Waegeman
On 07/28/2015 04:21 PM, Mark Nelson wrote: On 07/17/2015 07:50 AM, Gregory Farnum wrote: On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman wrote: Hi all, I've read in the documentation that OSDs use around 512MB on a healthy cluster.(http://ceph.com/docs/master/start/hardware-recommendati

[ceph-users] Unable to mount Format 2 striped RBD image

2015-07-29 Thread Daleep Bais
Hi, I have created a format 2 striped image; however, I am not able to mount it on the client machine. # rbd -p foo info strpimg rbd image 'strpimg': size 2048 MB in 513 objects order 22 (4096 kB objects) block_name_prefix: rbd_data.20c942ae8944a format: 2 features: striping flags: stripe unit: 6553

Re: [ceph-users] Unable to mount Format 2 striped RBD image

2015-07-29 Thread Ilya Dryomov
On Wed, Jul 29, 2015 at 1:45 PM, Daleep Bais wrote: > Hi, > > I have created a format 2 striped image, however, I am not able to mount it > on client machine.. > > # rbd -p foo info strpimg > rbd image 'strpimg': > size 2048 MB in 513 objects > order 22 (4096 kB objects) > block_name_prefix: rbd_d
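
Context for the reply above: the kernel RBD client of that era does not support the STRIPINGV2 feature, so an image created with a non-default stripe unit/count cannot be mapped with 'rbd map'. A hedged sketch of the usual workaround: recreate the image without custom striping (pool and image names follow the thread; depending on the kernel you may also need to restrict other image features):

$ rbd create -p foo --image-format 2 --size 2048 strpimg2   # no --stripe-unit/--stripe-count, so default striping
$ rbd map foo/strpimg2
$ mkfs.ext4 /dev/rbd0 && mount /dev/rbd0 /mnt

Images that do use fancy striping can still be accessed through librbd (e.g. qemu or rbd-fuse), just not through the kernel client.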

Re: [ceph-users] small cluster reboot fail

2015-07-29 Thread pixelfairy
Disregard. I did this on a cluster of test VMs and didn't bother setting different hostnames, thus confusing ceph. On Wed, Jul 29, 2015 at 2:24 AM pixelfairy wrote: > have a small test cluster (vmware fusion, 3 mon+osd nodes) all run ubuntu > trusty. tried rebooting all 3 nodes and this happend.

[ceph-users] rbd-fuse Transport endpoint is not connected

2015-07-29 Thread pixelfairy
Client is Debian Wheezy, server is Ubuntu Trusty, both running ceph 0.94.2. rbd-fuse seems to work, but I can't access it, getting "Transport endpoint is not connected" when I try to ls the mount point. On the ceph server (a virtual machine, as it's a test cluster): root@c3:/etc/ceph# ceph -s cluster 35ef5

Re: [ceph-users] rbd-fuse Transport endpoint is not connected

2015-07-29 Thread Ilya Dryomov
On Wed, Jul 29, 2015 at 2:52 PM, pixelfairy wrote: > client debian wheezy, server ubuntu trusty. both running ceph 0.94.2 > > rbd-fuse seems to work, but cant access, saying "Transport endpoint is > not connected" when i try to ls the mount point. > > on the ceph server, (a virtual machine, as its

[ceph-users] Migrate OSDs to different backend

2015-07-29 Thread Kenneth Waegeman
Hi all, We are considering migrating all the OSDs of our EC pool from KeyValue to Filestore. Does anyone have experience with this? What would be a good procedure? We have erasure code using k+m: 10+3, with host-level failure domain on 14 servers. Our pool is 30% filled. I was thinking: W
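
For illustration, a hedged sketch of the per-OSD drain-and-recreate loop this kind of migration usually boils down to (OSD id 12 and the device paths are placeholders, and with a 10+3 EC profile you would redo one host at a time, waiting for recovery between hosts):

$ ceph osd out 12                      # let data rebalance off the OSD; wait for 'ceph -s' to report HEALTH_OK
$ stop ceph-osd id=12                  # upstart on Ubuntu; or: service ceph stop osd.12
$ ceph osd crush remove osd.12
$ ceph auth del osd.12
$ ceph osd rm 12
$ ceph-disk zap /dev/sdb               # wipe the old KeyValue OSD
$ ceph-disk prepare /dev/sdb /dev/sdj  # recreate on the same disk; filestore is the default backend
$ ceph-disk activate /dev/sdb1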

Re: [ceph-users] Migrate OSDs to different backend

2015-07-29 Thread Haomai Wang
I think option 2 should be reliable. On Wed, Jul 29, 2015 at 9:00 PM, Kenneth Waegeman wrote: > Hi all, > > We are considering to migrate all our OSDs of our EC pool from KeyValue to > Filestore. Does someone has experience with this? What would be a good > procedure? > > We have Erasure Code usi

Re: [ceph-users] Ceph 0.94 (and lower) performance on >1 hosts ??

2015-07-29 Thread Jake Young
On Tue, Jul 28, 2015 at 11:48 AM, SCHAER Frederic wrote: > > Hi again, > > So I have tried > - changing the cpus frequency : either 1.6GHZ, or 2.4GHZ on all cores > - changing the memory configuration, from "advanced ecc mode" to "performance mode", boosting the memory bandwidth from 35GB/s to 40G

Re: [ceph-users] Ceph 0.94 (and lower) performance on >1 hosts ??

2015-07-29 Thread Mark Nelson
On 07/29/2015 10:13 AM, Jake Young wrote: On Tue, Jul 28, 2015 at 11:48 AM, SCHAER Frederic <frederic.sch...@cea.fr> wrote: > > Hi again, > > So I have tried > - changing the cpus frequency : either 1.6GHZ, or 2.4GHZ on all cores > - changing the memory configuration, from "advanced

[ceph-users] Recovery question

2015-07-29 Thread Peter Hinman
I've got a situation that seems on the surface like it should be recoverable, but I'm struggling to understand how to do it. I had a cluster of 3 monitors, 3 osd disks, and 3 journal ssds. After multiple hardware failures, I pulled the 3 osd disks and 3 journal ssds and am attempting to bring

Re: [ceph-users] Ceph 0.94 (and lower) performance on >1 hosts ??

2015-07-29 Thread Jake Young
On Wed, Jul 29, 2015 at 11:23 AM, Mark Nelson wrote: > On 07/29/2015 10:13 AM, Jake Young wrote: > >> On Tue, Jul 28, 2015 at 11:48 AM, SCHAER Frederic >> <frederic.sch...@cea.fr> wrote: >> > >> > Hi again, >> > >> > So I have tried >> > - changing the cpus frequency : either 1.6GHZ,

Re: [ceph-users] Configuring MemStore in Ceph

2015-07-29 Thread Aakanksha Pudipeddi-SSI
Hello Haomai, Thanks for your response. Yes, I cannot write more than 1GB of data to it. I am using the latest version deployed by ceph-deploy so I am assuming the fix must be a part of it. Also, while creating osds using ceph-deploy, I just use a local directory such as /var/local/osd0. Is the

Re: [ceph-users] Configuring MemStore in Ceph

2015-07-29 Thread Haomai Wang
On Thu, Jul 30, 2015 at 1:09 AM, Aakanksha Pudipeddi-SSI wrote: > Hello Haomai, > > Thanks for your response. Yes, I cannot write more than 1GB of data to it. I > am using the latest version deployed by ceph-deploy so I am assuming the fix > must be a part of it. Also, while creating osds using

Re: [ceph-users] Configuring MemStore in Ceph

2015-07-29 Thread Aakanksha Pudipeddi-SSI
Hello Haomai, The issue was that I was not setting the value correctly in ceph.conf. I mistakenly used 5*1024*1024*1024 instead of the literal value 5368709120. Thanks for your help! :) Aakanksha -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Wednesday, July 29, 2015
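
For anyone hitting the same thing, a sketch of the relevant ceph.conf fragment being discussed (the 5 GB figure is the literal byte value from this thread; the option takes a plain number of bytes, not an expression like 5*1024*1024*1024):

[osd]
osd objectstore = memstore
# 5 GiB per OSD, written out as a literal number of bytes
memstore device bytes = 5368709120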

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Did you use ceph-deploy or ceph-disk to create the OSDs? If so, it should use udev to start the OSDs. In that case, a new host that has the correct ceph.conf and osd-bootstrap key should be able to bring up the OSDs into the cluster automatically. Just
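
A sketch of what that means in practice when the disks are moved to a replacement host (default paths shown; the keyring file name can vary per deployment):

# files the new host needs before plugging in the disks:
#   /etc/ceph/ceph.conf                        (copied from the cluster)
#   /var/lib/ceph/bootstrap-osd/ceph.keyring   (the osd-bootstrap key)
$ ceph-disk activate /dev/sdb1                 # activate one OSD data partition, or
$ ceph-disk activate-all                       # activate every prepared OSD partition udev would have found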

Re: [ceph-users] Recovery question

2015-07-29 Thread Peter Hinman
Thanks for the guidance. I'm working on building a valid ceph.conf right now. I'm not familiar with the osd-bootstrap key. Is that the standard filename for it? Is it the keyring that is stored on the osd? I'll see if the logs turn up anything I can decipher after I rebuild the ceph.conf fi

Re: [ceph-users] Recovery question

2015-07-29 Thread Steve Taylor
I recently migrated 240 OSDs to new servers this way in a single cluster, and it worked great. There are two additional items I would note based on my experience though. First, if you're using dmcrypt then of course you need to copy the dmcrypt keys for the OSDs to the new host(s). I had to do

Re: [ceph-users] Recovery question

2015-07-29 Thread Gregory Farnum
This sounds odd. Can you create a ticket in the tracker with all the details you can remember or reconstruct? -Greg On Wed, Jul 29, 2015 at 8:34 PM Steve Taylor wrote: > I recently migrated 240 OSDs to new servers this way in a single cluster, > and it worked great. There are two additional item

Re: [ceph-users] Recovery question

2015-07-29 Thread Gregory Farnum
This sounds like you're trying to reconstruct a cluster after destroying the monitors. That is...not going to work well. The monitors define the cluster and you can't move OSDs into different clusters. We have ideas for how to reconstruct monitors and it can be done manually with a lot of hassle, b

Re: [ceph-users] rbd-fuse Transport endpoint is not connected

2015-07-29 Thread pixelfairy
Copied ceph.conf from the servers. Hope this helps. Should this be considered an unsupported feature? # rbd-fuse /cmnt -c /etc/ceph/ceph.conf -d FUSE library version: 2.9.2 nullpath_ok: 0 nopath: 0 utime_omit_ok: 0 unique: 1, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0 INIT: 7.22 flags=0x

Re: [ceph-users] Recovery question

2015-07-29 Thread Peter Hinman
Hi Greg - So at the moment, I seem to be trying to resolve a permission error. === osd.3 === Mounting xfs on stor-2:/var/lib/ceph/osd/ceph-3 2015-07-29 13:35:08.809536 7f0a0262e700 0 librados: osd.3 authentication error (1) Operation not permitted Error connecting to cluster: PermissionEr

Re: [ceph-users] Recovery question

2015-07-29 Thread Gregory Farnum
On Wednesday, July 29, 2015, Peter Hinman wrote: > Hi Greg - > > So at the moment, I seem to be trying to resolve a permission error. > > === osd.3 === > Mounting xfs on stor-2:/var/lib/ceph/osd/ceph-3 > 2015-07-29 13:35:08.809536 7f0a0262e700 0 librados: osd.3 authentication > error (1) Ope

Re: [ceph-users] Recovery question

2015-07-29 Thread Peter Hinman
The end goal is to recover the data. I don't need to re-implement the cluster as it was - that just appeared to be the natural way to recover the data. What monitor data would be required to re-implement the cluster? -- Peter Hinman International Bridge / ParcelPool.com On 7/29/2015 2:55 PM

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 If you built new monitors, this will not work. You would have to recover the monitor data (database) from at least one monitor and rebuild the monitor. The new monitors would not have any information about pools, OSDs, PGs, etc to allow an OSD to be

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 The default is /var/lib/ceph/mon/<cluster>-<id> (/var/lib/ceph/mon/ceph-mon1 for me). You will also need the information from /etc/ceph/ to reconstruct the data. I *think* you should be able to just copy this to a new box with the same name and IP address and sta

Re: [ceph-users] Recovery question

2015-07-29 Thread Peter Hinman
Thanks Robert - Where would that monitor data (database) be found? -- Peter Hinman On 7/29/2015 3:39 PM, Robert LeBlanc wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 If you built new monitors, this will not work. You would have to recover the monitor data (database) from at least one

Re: [ceph-users] Recovery question

2015-07-29 Thread Peter Hinman
Ok - that is encouraging. I believe I've got data from a previous monitor. I see files in a store.db dated yesterday, with a MANIFEST- file that is significantly greater than the MANIFEST-07 file listed for the current monitors. I've actually found data for two previous monitor

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 If you had multiple monitors, you should recover more than 50% of them if possible (they will need to form a quorum). If you can't, it is messy, but you can manually remove enough monitors to get a quorum. From /etc/ceph/ you will want the keyring
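
For reference, a sketch of the "manually remove enough monitors" step, i.e. the standard monmap-edit procedure, run while the monitors are stopped (mon1 is the surviving monitor and mon2/mon3 the lost ones; the IDs here are placeholders):

$ ceph-mon -i mon1 --extract-monmap /tmp/monmap   # dump the map from the surviving monitor's store
$ monmaptool --print /tmp/monmap
$ monmaptool /tmp/monmap --rm mon2
$ monmaptool /tmp/monmap --rm mon3
$ ceph-mon -i mon1 --inject-monmap /tmp/monmap
$ start ceph-mon id=mon1                          # mon1 can now form a quorum on its own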

[ceph-users] injectargs not working?

2015-07-29 Thread Quentin Hartman
I'm running a 0.87.1 cluster, and my "ceph tell" seems to not be working: # ceph tell osd.0 injectargs '--osd-scrub-begin-hour 1' failed to parse arguments: --osd-scrub-begin-hour,1 I've also tried the daemon config set variant and it also fails: # ceph daemon osd.0 config set osd_scrub_begin_

Re: [ceph-users] injectargs not working?

2015-07-29 Thread Travis Rhoden
Hi Quentin, It may be the specific option you are trying to tweak. osd-scrub-begin-hour was first introduced in development release v0.93, which means it would be in 0.94.x (Hammer), but your cluster is 0.87.1 (Giant). Cheers, - Travis On Wed, Jul 29, 2015 at 4:28 PM, Quentin Hartman wrote: >
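
For completeness, on Hammer (0.94.x) those scrub-window options do exist and could be injected roughly like this (a sketch using the underscore form of the option names; on 0.87.x they are simply not recognized, hence the parse error above):

$ ceph tell osd.* injectargs '--osd_scrub_begin_hour 1 --osd_scrub_end_hour 6'

and persisted in ceph.conf:

[osd]
osd scrub begin hour = 1
osd scrub end hour = 6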

Re: [ceph-users] injectargs not working?

2015-07-29 Thread Quentin Hartman
Well, that would certainly do it. I _always_ forget to twiddle the little thing on the web page that changes the version of the docs I'm looking at. So I guess then my question becomes, "How do I prevent deep scrubs from happening in the middle of the day and ruining everything?" QH On Wed, Jul

Re: [ceph-users] ceph-mon cpu usage

2015-07-29 Thread Quentin Hartman
I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of three mons eat all CPU, cluster becomes unusably slow). It seems to be tied to deep scrubbing, as the behavior almost immediately surfaces if that is turned on, but if it is off the behavior eventually seems to return

Re: [ceph-users] injectargs not working?

2015-07-29 Thread 池信泽
Hi, 'ceph osd set noscrub' (or nodeep-scrub) stops scrubbing until it is unset, and 'ceph osd unset noscrub' allows scrubs to be scheduled again, so maybe you could use these two commands in a crontab to schedule scrubs manually. 2015-07-30 7:59 GMT+08:00 Quentin Hartman : > well, that would certainly do it. I _
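
A sketch of the crontab approach being suggested (the hours are placeholders; note that nodeep-scrub only blocks deep scrubs, while noscrub blocks regular scrubs as well):

# /etc/crontab on a node with an admin keyring
0 7  * * *  root  ceph osd set nodeep-scrub     # no deep scrubs during the day
0 22 * * *  root  ceph osd unset nodeep-scrub   # allow them again overnight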

Re: [ceph-users] injectargs not working?

2015-07-29 Thread Christian Balzer
Hello, On Wed, 29 Jul 2015 17:59:10 -0600 Quentin Hartman wrote: > well, that would certainly do it. I _always_ forget to twiddle the little > thing on the web page that changes the version of the docs I'm looking > at. > > So I guess then my question becomes, "How do i prevent deep scrubs from

Re: [ceph-users] which kernel version can help avoid kernel client deadlock

2015-07-29 Thread van
> On Jul 29, 2015, at 12:40 AM, Ilya Dryomov wrote: > > On Tue, Jul 28, 2015 at 7:20 PM, van > wrote: >> >>> On Jul 28, 2015, at 7:57 PM, Ilya Dryomov wrote: >>> >>> On Tue, Jul 28, 2015 at 2:46 PM, van wrote: Hi, Ilya, In the dmesg, there is al

Re: [ceph-users] injectargs not working?

2015-07-29 Thread Quentin Hartman
So it looks like the scrub was not actually the root of the problem. It seems that I have some hardware that is failing that I'm now trying to run down. QH On Wed, Jul 29, 2015 at 8:22 PM, Christian Balzer wrote: > > Hello, > > On Wed, 29 Jul 2015 17:59:10 -0600 Quentin Hartman wrote: > > > wel

[ceph-users] questions on editing crushmap for ceph cache tier

2015-07-29 Thread van
Hi, list, Ceph cache tier seems very promising for performance. According to http://ceph.com/docs/master/rados/operations/crush-map/#placing-different-pools-on-different-osds , I need to cr
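
For the "placing different pools on different OSDs" part, a hedged sketch of what the decompiled crushmap edit typically looks like for a cache tier: an extra root holding the cache (e.g. SSD) hosts, a rule that draws from it, and the cache pool pointed at that rule. Bucket names, weights, and the ruleset number below are illustrative only:

root ssd {
        id -10                      # must be a unique negative id in the map
        alg straw
        hash 0                      # rjenkins1
        item node1-ssd weight 1.000
        item node2-ssd weight 1.000
}

rule ssd_rule {
        ruleset 4
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type host
        step emit
}

# recompile and inject the edited map, then point the cache pool at the new rule:
$ crushtool -c crushmap.txt -o crushmap.bin
$ ceph osd setcrushmap -i crushmap.bin
$ ceph osd pool set cache-pool crush_ruleset 4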

Re: [ceph-users] which kernel version can help avoid kernel client deadlock

2015-07-29 Thread Z Zhang
We also hit a similar issue from time to time on CentOS with the 3.10.x kernel. From iostat, we can see the kernel rbd client's util is 100%, but no r/w io, and we can't umount/unmap this rbd client. After restarting OSDs, it becomes normal again. @Ilya, could you please point us to the possible fixes on 3.18.1

Re: [ceph-users] which kernel version can help avoid kernel client deadlock

2015-07-29 Thread van
> > On Jul 30, 2015, at 12:48 PM, Z Zhang wrote: > > We also hit the similar issue from time to time on centos with 3.10.x kernel. > By iostat, we can see kernel rbd client's util is 100%, but no r/w io, and we > can't umount/unmap this rbd client. After restarting OSDs, it will become > norm

[ceph-users] Elastic-sized RBD planned?

2015-07-29 Thread Shneur Zalman Mattern
Hi to all! Perhaps somebody has already thought about this, but my Googling had no results. How can I make an RBD that will grow on demand with the VM/client's disk usage? Are there options in Ceph for this? Is it planned? Is it a utopian idea? Does this client need CephFS
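
RBD images are already thin-provisioned (objects are only allocated as the client writes), so the usual answer is to create a generously sized image and grow it later if needed. A sketch (pool/image names and sizes are illustrative):

$ rbd create rbd/vm-disk --size 102400    # 100 GB logical; consumes almost nothing until written to
$ rbd resize rbd/vm-disk --size 204800    # grow to 200 GB later, even while the image is in use
# then enlarge the partition/filesystem inside the guest, e.g. resize2fs for ext4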

Re: [ceph-users] fuse mount in fstab

2015-07-29 Thread Alvaro Simon Garcia
Hi More info about this issue, we have opened a ticket to redhat here is the feedback: https://bugzilla.redhat.com/show_bug.cgi?id=1248003 Cheers Alvaro On 16/07/15 15:19, Alvaro Simon Garcia wrote: > Hi > > I have tested a bit this with different ceph-fuse versions and linux > distros and it s
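
For context, the ceph-fuse fstab form documented around that release looks roughly like this (a sketch; the user id and mount point are placeholders, and the bug report above concerns how reliably this behaves at boot):

# /etc/fstab
id=admin,conf=/etc/ceph/ceph.conf  /mnt/cephfs  fuse.ceph  defaults,_netdev  0  0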

[ceph-users] ceph osd mounting issue with ocfs2

2015-07-29 Thread gjprabu
Hi All, We are using ceph with two OSDs and three clients. The clients try to mount with the OCFS2 file system. When I start mounting, only two clients are able to mount properly and the third client gives the errors below. Sometimes I am able to mount the third client, but data does not sync to that client