[ceph-users] influxdb - telegraf : ceph metrics collector

2016-05-19 Thread Alexandre DERUMIER
Hi, if you use influxdb / telegraf, a new input plugin for ceph metrics has been pushed to the telegraf git: https://github.com/influxdata/telegraf/pull/1172/commits/d9c8e11a078158a7f6bc7284abd5f453c2f0308a I'm going to put together some nice grafana dashboards. Regards, Alexandre
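For illustration, a minimal sketch of enabling the plugin, assuming the option names from the linked pull request (they may differ in released telegraf versions):

# Hypothetical telegraf.conf snippet; option names follow the linked PR and are
# not guaranteed to match your telegraf release.
cat >> /etc/telegraf/telegraf.conf <<'EOF'
[[inputs.ceph]]
  ## read metrics from the admin sockets of local ceph daemons
  ceph_binary   = "/usr/bin/ceph"
  socket_dir    = "/var/run/ceph"
  mon_prefix    = "ceph-mon"
  osd_prefix    = "ceph-osd"
  socket_suffix = "asok"
EOF
systemctl restart telegraf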

[ceph-users] Help...my cephfs client often occur error when mount -t ceph...

2016-05-19 Thread 易明
Hi All, my cluster runs Jewel. Something often goes wrong when mounting cephfs, but no error message can be found in the logs. The following is some info: [root@ceph2 ~]# mount /mnt/cephfs_stor/ mount error 5 = Input/output error [root@ceph2 ~]# cat /etc/fstab | grep 6789 ceph2:6789:
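For reference, a kernel cephfs fstab entry of the kind being grepped for above usually looks like the line below (monitor address, mount point and secret file are placeholders based on the post, not the poster's actual configuration):

# Illustrative /etc/fstab entry for a kernel cephfs mount (paths and names are placeholders):
# ceph2:6789:/    /mnt/cephfs_stor    ceph    name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev    0 0
# Equivalent one-off mount for testing:
mount -t ceph ceph2:6789:/ /mnt/cephfs_stor -o name=admin,secretfile=/etc/ceph/admin.secret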

Re: [ceph-users] Can Jewel read Hammer radosgw buckets?

2016-05-19 Thread Luis Periquito
I've upgraded our test cluster from 9.2.1 to 10.2.1 and I still had these issues. As before, the script did fix the issue and the cluster is now working. Is the proper fix included in 10.2.1, or is running the fix script still expected? If it makes a difference, I'm running trusty; the cluster was created on ha

Re: [ceph-users] OSD node memory sizing

2016-05-19 Thread Dietmar Rieder
Hello, On 05/19/2016 03:36 AM, Christian Balzer wrote: > > Hello again, > > On Wed, 18 May 2016 15:32:50 +0200 Dietmar Rieder wrote: > >> Hello Christian, >> >>> Hello, >>> >>> On Wed, 18 May 2016 13:57:59 +0200 Dietmar Rieder wrote: >>> Dear Ceph users, I've a question regarding

[ceph-users] Installing ceph monitor on Ubuntu denial: segmentation fault

2016-05-19 Thread Daniel Wilhelm
Hi, I am trying to install ceph with the ceph ansible role: https://github.com/shieldwed/ceph-ansible. I had to fix some ansible tasks to work correctly with ansible 2.0.2.0, but now it seems to work quite well. Sadly I have now come across a bug I cannot solve myself: when ansible is starting

Re: [ceph-users] dd testing from within the VM

2016-05-19 Thread Oliver Dzombic
Hi Ken, dd is ok, but you should consider the fact that dd does sequential writes. So if your later production usage involves random writes, then this test basically only measures the maximum sequential write performance of an idle cluster. And 250 MB/s for 200 HDDs is quite bad, as
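To complement a plain dd run with a random-write measurement, fio is the usual tool; a minimal sketch (file path, sizes and runtime are arbitrary examples):

# Sequential write, roughly what a dd test measures:
dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=4096 oflag=direct
# Random 4k writes, closer to a typical VM workload:
fio --name=randwrite --filename=/mnt/test/fiofile --size=4G \
    --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio \
    --direct=1 --runtime=60 --time_based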

Re: [ceph-users] dd testing from within the VM

2016-05-19 Thread Ken Peng
Oliver, thanks for the info. We then ran sysbench for random IO testing, and the result is even worse (757 KB/s). Each object has 3 replicas. Both networks are 10Gbps, so I don't think the network is the issue. Maybe the lack of an SSD cache and an incorrect cluster configuration are the reasons.

Re: [ceph-users] dd testing from within the VM

2016-05-19 Thread Oliver Dzombic
Hi Ken, wow, that's quite bad. That means you cannot use this cluster like that. What does your ceph.conf look like? What does ceph -s show? -- Mit freundlichen Gruessen / Best regards Oliver Dzombic IP-Interactive mailto:i...@ip-interactive.de

Re: [ceph-users] mark out vs crush weight 0

2016-05-19 Thread Oliver Dzombic
Hi, a spare disk is a nice idea. But I think that's something you can also do with a shell script: check whether an OSD is down or out and then just use your spare disk. Maybe the programming resources should not be spent on something most of us can do with a simple shell script checking every 5 seconds
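A rough sketch of the kind of polling loop meant here (purely illustrative; the actual spare-disk replacement step is site-specific and left as a stub):

#!/bin/bash
# Illustrative watcher: every 5 seconds, list OSDs that the cluster map marks "down".
# What to do with the spare disk once an OSD fails is left as a stub.
while true; do
    down_osds=$(ceph osd dump | awk '$2 == "down" {print $1}')
    if [ -n "$down_osds" ]; then
        echo "$(date): OSDs down: $down_osds" >> /var/log/osd-watch.log
        # trigger spare-disk provisioning / notification here
    fi
    sleep 5
done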

Re: [ceph-users] OSD node memory sizing

2016-05-19 Thread Christian Balzer
Hello, On Thu, 19 May 2016 10:51:20 +0200 Dietmar Rieder wrote: > Hello, > > On 05/19/2016 03:36 AM, Christian Balzer wrote: > > > > Hello again, > > > > On Wed, 18 May 2016 15:32:50 +0200 Dietmar Rieder wrote: > > > >> Hello Christian, > >> > >>> Hello, > >>> > >>> On Wed, 18 May 2016 13:57

Re: [ceph-users] dense storage nodes

2016-05-19 Thread Mark Nelson
FWIW, we ran tests back in the dumpling era that more or less showed the same thing. Increasing the merge/split thresholds does help. We suspect it's primarily due to the PG splitting being spread out over a longer period of time so the effect lessens. We're looking at some options to introd
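For reference, the thresholds referred to are the filestore collection merge/split settings; a hedged example of raising them (the numbers are illustrative, not recommendations):

# Example ceph.conf change only; pick values appropriate to your object counts.
cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
filestore merge threshold = 40
filestore split multiple = 8
EOF
# Restart the OSDs for the new thresholds to take effect.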

[ceph-users] pure SSD ceph - journal placement

2016-05-19 Thread George Shuklin
Hello. I'm curious how to get maximum performance without losing significant space. Is putting each OSD's journal on the same SSD as the OSD a good solution? Or is it better to use a separate SSD as the journal for the few other SSD-based OSDs? Thanks.
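For illustration, the two layouts described would be provisioned roughly like this with ceph-disk (device names are placeholders):

# Journal co-located on the same SSD as the OSD data (ceph-disk carves out both partitions):
ceph-disk prepare /dev/sdb
# Data on one SSD, journal on a separate (faster) SSD:
ceph-disk prepare /dev/sdb /dev/nvme0n1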

Re: [ceph-users] OSD process doesn't die immediately after device disappears

2016-05-19 Thread Marcel Lauhoff
Hi Somnath, Somnath Roy writes: > FileStore doesn't subscribe to any such event from the device. Presently, it > relies on the filesystem (for the FileStore assert) to return an error during > IO, and based on that error it asserts. > The FileJournal assert you are getting in the aio

Re: [ceph-users] Help...my cephfs client often occur error when mount -t ceph...

2016-05-19 Thread Yan, Zheng
On Thu, May 19, 2016 at 3:35 PM, 易明 wrote: > Hi All, > > my cluster is Jewel Ceph.It is often that something wrong goes up when mount > cephfs, but no error message can be found on logs > > > the following are some infos: > [root@ceph2 ~]# mount /mnt/cephfs_stor/ > mount error 5 = Input/output err

Re: [ceph-users] dense storage nodes

2016-05-19 Thread Benjeman Meekhof
Hi Christian, Thanks for your insights. To answer your question the NVMe devices appear to be some variety of Samsung: Model: Dell Express Flash NVMe 400GB Manufacturer: SAMSUNG Product ID: a820 regards, Ben On Wed, May 18, 2016 at 10:01 PM, Christian Balzer wrote: > > Hello, > > On Wed, 18 M

[ceph-users] Maximum RBD image name length

2016-05-19 Thread Jason Dillaman
As of today, neither the rbd CLI nor librbd imposes any limit on the maximum length of an RBD image name, whereas krbd has roughly a 100 character limit and the OSDs have a default object name limit of roughly 2000 characters. While there is a patch under review to increase the krbd limit, it would

[ceph-users] Enabling hammer rbd features on cluster with a few dumpling clients

2016-05-19 Thread Dan van der Ster
Hi, We want to enable the hammer rbd features on newly created Cinder volumes [1], but we still have a few VMs running with super old librbd (dumpling). Perhaps it's academic, but does anyone know the expected behaviour if an old dumpling-linked qemu-kvm tries to attach an rbd with exclusi
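For context, enabling the newer image features for freshly created volumes is typically done through the client-side default feature mask; a hedged sketch (the value 61 is an example bitmask for layering + exclusive-lock + object-map + fast-diff + deep-flatten):

# Illustrative ceph.conf snippet on the Cinder/librbd client side; the bitmask is an example.
cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
rbd default features = 61
EOF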

[ceph-users] cluster [ERR] osd.NN: inconsistent clone_overlap found for oid xxxxxxxx/rbd_data and OSD crashes

2016-05-19 Thread Frode Nordahl
Hello, We recently had an outage on our Ceph storage cluster caused by what I believe to be a bug in Ceph. At the time of the incident all MONs, OSDs and clients (except for one) were running Ceph Hammer 0.94.6. To start describing the incident I will portray a hierarchy of rbd volumes/snapshots

Re: [ceph-users] Enabling hammer rbd features on cluster with a few dumpling clients

2016-05-19 Thread Jason Dillaman
On Thu, May 19, 2016 at 12:15 PM, Dan van der Ster wrote: > I hope it will just refuse to > attach, rather than attach but allow bad stuff to happen. You are correct -- older librbd/krbd clients will refuse to open images that have unsupported features enabled. -- Jason
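If such an image does need to be attached from an old client after all, the newer features can be stripped again with a sufficiently recent rbd CLI; a sketch (pool/image names are placeholders, dependent features disabled first):

# Disable dependent features before their prerequisites (names are placeholders):
rbd feature disable mypool/myimage deep-flatten
rbd feature disable mypool/myimage fast-diff
rbd feature disable mypool/myimage object-map
rbd feature disable mypool/myimage exclusive-lock
rbd info mypool/myimage   # verify the remaining feature set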

Re: [ceph-users] ceph hang on pg list_unfound

2016-05-19 Thread Samuel Just
Restart osd.1 with debugging enabled: debug osd = 20, debug filestore = 20, debug ms = 1. Then, run list_unfound once the pg is back in active+recovering. If it still hangs, post osd.1's log to the list along with the output of ceph osd dump and ceph pg dump. -Sam On Wed, May 18, 2016 at 6:20 PM, D
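A sketch of those steps using the usual ceph.conf mechanism (the pg id and the restart command are placeholders for your environment):

# Raise debug levels for osd.1, then restart it.
cat >> /etc/ceph/ceph.conf <<'EOF'
[osd.1]
debug osd = 20
debug filestore = 20
debug ms = 1
EOF
systemctl restart ceph-osd@1    # or your distro's init script
# Once the pg is active+recovering:
ceph pg <pgid> list_unfound
ceph osd dump > osd-dump.txt
ceph pg dump > pg-dump.txt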

[ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Albert Archer
Hello All. I am a newbie in ceph, and I use the jewel release for testing purposes. Everything seems to be OK: HEALTH_OK, and all OSDs are in the UP and IN state. I create some RBD images (rbd create) and map them to some ubuntu hosts. I can read and write data to my volume, but when I delete some content

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Edward R Huyer
That is normal behavior. Ceph has no understanding of the filesystem living on top of the RBD, so it doesn’t know when space is freed up. If you are running a sufficiently current kernel, you can use fstrim to cause the kernel to tell Ceph what blocks are free. More details here: http://www
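A minimal example of the fstrim approach described above (the mount point is a placeholder):

# Tell the kernel, and hence Ceph, which blocks the filesystem no longer uses:
fstrim -v /mnt/rbd0
# This can also be run periodically (e.g. from cron) instead of mounting with 'discard'.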

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Udo Lembke
Hi Albert, to free unused space you must enable trim (or do an fstrim) in the VM, and everything in the storage chain must support this. The normal virtio driver doesn't support trim, but if you use scsi disks with the virtio-scsi driver you can use it. It works well but needs some time for huge filesystems.
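A hedged sketch of what that looks like on the qemu-kvm command line: the disk is exposed through virtio-scsi with discard enabled so trim requests from the guest reach Ceph (pool, image and device ids are placeholders; most setups would configure this via libvirt instead):

qemu-system-x86_64 -m 2048 \
    -device virtio-scsi-pci,id=scsi0 \
    -drive file=rbd:rbd/vm-disk-1,format=raw,if=none,id=drive1,discard=unmap,cache=writeback \
    -device scsi-hd,drive=drive1,bus=scsi0.0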

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Albert Archer
Thank you for your great support . Best Regards Albert On Thu, May 19, 2016 at 10:41 PM, Udo Lembke wrote: > Hi Albert, > to free unused space you must enable trim (or do an fstrim) in the vm - > and all things in the storage chain must support this. > The normal virtio-driver don't support tri

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread David Turner
You can also mount the rbd with the discard option. It works the same way as you would mount an ssd to free up the space when you delete things. I use the discard option on my ext4 rbds on Ubuntu and it frees up the used Ceph space immediately. On May 19, 2016, at 12:30 PM,
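A sketch of that mount-time variant (device and mount point are placeholders):

# Online discard: freed blocks are released back to Ceph immediately.
mount -o discard /dev/rbd0 /mnt/data
# Or persistently via /etc/fstab:
# /dev/rbd0    /mnt/data    ext4    defaults,noatime,discard,_netdev    0 0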

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Christian Balzer
Hello, On Fri, 20 May 2016 00:11:02 + David Turner wrote: > You can also mount the rbd with the discard option. It works the same > way as you would mount an ssd to free up the space when you delete > things. I use the discard option on my ext4 rbds on Ubuntu and it frees > up the used Ceph

[ceph-users] Do you see a data loss if a SSD hosting several OSD journals crashes

2016-05-19 Thread EP Komarla
We are trying to assess whether we are going to see data loss if an SSD that is hosting journals for a few OSDs crashes. In our configuration, each SSD is partitioned into 5 chunks and each chunk is mapped as the journal drive for one OSD. What I understand from the Ceph documentation: "Consisten
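For illustration, the layout described (one SSD shared as the journal device by five OSDs) is typically provisioned along these lines; device names are placeholders, and ceph-disk allocates one journal partition per OSD on the shared SSD:

for data in /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdg; do
    ceph-disk prepare "$data" /dev/sdf    # /dev/sdf is the shared journal SSD
done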

Re: [ceph-users] Do you see a data loss if a SSD hosting several OSD journals crashes

2016-05-19 Thread Christian Balzer
Hello, first of all, wall of text. Don't do that. Use returns and paragraphs liberally to make reading easy. I'm betting at least half of the people who could have answered your question took a look at this blob of text and ignored it. Secondly, search engines are your friend. The first hit when

Re: [ceph-users] Do you see a data loss if a SSD hosting several OSD journals crashes

2016-05-19 Thread Dyweni - Ceph-Users
Hi, Yes and no, for the actual data loss. This depends on your crush map. If you're using the original map (which came with the installation), then your smallest failure domain will be the host. If you have replica size 3 and 3 hosts and 5 OSDs per host (15 OSDs total), then losing the journ
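To see which failure domain actually applies, the crush rules and pool sizes can be checked directly; a short sketch:

# Inspect the CRUSH rules; the "type" in the chooseleaf step (usually "host") is the
# failure domain being discussed here.
ceph osd crush rule dump
# Check the replica count of each pool:
ceph osd dump | grep 'replicated size'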

Re: [ceph-users] mark out vs crush weight 0

2016-05-19 Thread Christian Balzer
Hello, On Thu, 19 May 2016 13:26:33 +0200 Oliver Dzombic wrote: > Hi, > > a sparedisk is a nice idea. > > But i think thats something you can also do with a shellscript. > Definitely, but you're then going to have a very likely possibility of getting in conflict with your MONs and what they

Re: [ceph-users] Do you see a data loss if a SSD hosting several OSD journals crashes

2016-05-19 Thread Christian Balzer
Hello, On Fri, 20 May 2016 03:44:52 + EP Komarla wrote: > Thanks Christian. Point noted. Going forward I will write text to make > it easy to read. > > Thanks for your response. Losing a journal drive seems expensive as I > will have to rebuild 5 OSDs in this eventuality. > Potentially,