[ceph-users] what's the meaning of 'removed_snaps' of `ceph osd pool ls detail`?

2016-07-07 Thread 秀才
Hi, all :) I have made a cache tier, but I do not understand the message 'removed_snaps [1~1,3~6,b~6,13~c,21~4,26~1,28~1a,4e~4,53~5,5c~5,63~1,65~4,6b~4]'. I have not snapshotted anything yet. ceph> osd pool ls detail pool 0 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64

Re: [ceph-users] multiple journals on SSD

2016-07-07 Thread Nick Fisk
Just to add, if you really want to go with lots of HDDs to journals then go NVMe. They are not a lot more expensive than the equivalent SATA-based 3700s, but the latency is low low low. Here is an example of a node I have just commissioned with 12 HDDs to one P3700 Device: rrqm/s wrqm/
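
A minimal sketch of carving several journal partitions out of one NVMe device with sgdisk; the device name and 10G size are assumptions for illustration, not taken from the thread:

    # one journal partition per HDD-backed OSD; repeat for each OSD
    # (/dev/nvme0n1 and +10G are assumed values)
    sgdisk --new=1:0:+10G --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/nvme0n1
    # 45b0969e-... is the partition type GUID ceph-disk expects for journals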

[ceph-users] layer3 network

2016-07-07 Thread Matyas Koszik
Hi, My setup uses a layer3 network, where each node has two connections (/31s), equipped with a loopback address and redundancy is provided via OSPF. In this setup it is important to use the loopback address as source for outgoing connections, since the interface addresses are not protected from
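
A minimal sketch of one way to pin the loopback as the source address via the kernel routing table; all addresses are assumptions for illustration (with an OSPF daemon such as Quagga, the equivalent is a route-map that sets src on learned routes):

    # 10.0.0.1/32 is the assumed loopback, 10.1.2.3 an assumed next hop
    ip addr add 10.0.0.1/32 dev lo
    ip route add 10.0.0.0/16 via 10.1.2.3 src 10.0.0.1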

Re: [ceph-users] multiple journals on SSD

2016-07-07 Thread George Shuklin
There are two problems I found so far: 1) You cannot alter a partition table while it is in use. That means you need to stop every ceph-osd whose journal is on the given device before changing anything on it. Worse: you can change it, but you cannot force the kernel to reread the partition table. 2) I found a udev bug with
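
For reference, a minimal sketch of the reread problem described in point 1; the device name and partition number are assumptions:

    # fails with EBUSY while any partition on the disk is still in use
    partprobe /dev/sdb
    # partx can sometimes add just one new partition without a full reread
    partx --update --nr 5 /dev/sdb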

Re: [ceph-users] layer3 network

2016-07-07 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Matyas Koszik > Sent: 07 July 2016 11:26 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] layer3 network > > > > Hi, > > My setup uses a layer3 network, where each node has two conn

[ceph-users] Calamari doesn't detect a running cluster despite of connected ceph servers

2016-07-07 Thread Pieroth.N
Hello, I know there are lots of users with the same problem, and if this is the wrong mailing list, please tell me. I've tried to fix this, but Calamari is driving me nuts. Problem: running Ceph cluster in a healthy state. The Calamari GUI once tried to connect to the ceph nodes, but after the 120 sec.

Re: [ceph-users] multiple journals on SSD

2016-07-07 Thread Christian Balzer
Hello Nick, On Thu, 7 Jul 2016 09:45:58 +0100 Nick Fisk wrote: > Just to add if you really want to go with lots of HDD's to Journals then > go NVME. They are not a lot more expensive than the equivalent SATA based > 3700's, but the latency is low low low. Here is an example of a node I > have ju

Re: [ceph-users] layer3 network

2016-07-07 Thread George Shuklin
I found no options for setting the source IP in Ceph. You could probably try network namespaces to isolate the Ceph services with the desired interfaces. This would require a bit more setup though: you would need to create a namespace and add some kind of patch (veth?) interface between the namespace and the host, but
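
A minimal sketch of that namespace idea; the namespace name, interface names, and address are all assumptions:

    ip netns add ceph
    ip link add veth-host type veth peer name veth-ceph
    ip link set veth-ceph netns ceph
    ip netns exec ceph ip addr add 10.0.0.1/31 dev veth-ceph
    ip netns exec ceph ip link set veth-ceph up
    # then launch the daemon inside the namespace, e.g.
    ip netns exec ceph ceph-osd -i 0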

[ceph-users] RBD - Deletion / Discard - IO Impact

2016-07-07 Thread Nick Fisk
Hi All, Does anybody else see a massive (ie 10x) performance impact when either deleting a RBD or running something like mkfs.xfs against an existing RBD, which would zero/discard all blocks? In the case of deleting a 4TB RBD, I'm seeing latency in some cases rise up to 10s. It looks

Re: [ceph-users] layer3 network

2016-07-07 Thread Luis Periquito
If, like me, you have several different networks, or they overlap for whatever reason, I just have the options: mon addr = IP:port osd addr = IP in the relevant sections. However I use puppet to deploy ceph, and all files are "manually" created. So it becomes something like this: [mon.mon1] h
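
A minimal sketch of per-daemon sections along the lines described; the hostnames and addresses are assumptions:

    [mon.mon1]
    host = mon1
    mon addr = 192.0.2.1:6789

    [osd.0]
    host = osd1
    osd addr = 192.0.2.11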

Re: [ceph-users] RBD - Deletion / Discard - IO Impact

2016-07-07 Thread Anand Bhat
These are known problems. Are you doing mkfs.xfs on an SSD? If so, please check the SSD data sheet for whether UNMAP is supported. To avoid unmap during mkfs, use mkfs.xfs -K Regards, Anand On Thu, Jul 7, 2016 at 5:23 PM, Nick Fisk wrote: > Hi All, > > > > Does anybody else see a massive (ie 10x) perform
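
For reference, -K tells mkfs.xfs not to issue discards while creating the filesystem; the device name is an assumption:

    # skip the discard pass at mkfs time (/dev/rbd0 is an assumed device)
    mkfs.xfs -K /dev/rbd0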

Re: [ceph-users] multiple journals on SSD

2016-07-07 Thread Nick Fisk
Hi Christian, > -Original Message- > From: Christian Balzer [mailto:ch...@gol.com] > Sent: 07 July 2016 12:57 > To: ceph-users@lists.ceph.com > Cc: Nick Fisk > Subject: Re: [ceph-users] multiple journals on SSD > > > Hello Nick, > > On Thu, 7 Jul 2016 09:45:58 +0100 Nick Fisk wrote: >

Re: [ceph-users] RBD - Deletion / Discard - IO Impact

2016-07-07 Thread Nick Fisk
> -Original Message- > From: Anand Bhat [mailto:anand.b...@gmail.com] > Sent: 07 July 2016 13:46 > To: n...@fisk.me.uk > Cc: ceph-users > Subject: Re: [ceph-users] RBD - Deletion / Discard - IO Impact > > These are known problem. > > Are you doing mkfs.xfs on SSD? If so, please check SSD

Re: [ceph-users] layer3 network

2016-07-07 Thread Matyas Koszik
Setting 'osd addr' in the osd configuration section unfortunately also does not influence source address selection, the outgoing interface IP is used like before. On Thu, 7 Jul 2016, Luis Periquito wrote: > If, like me, you have several different networks, or they overlap for > whatever reason

Re: [ceph-users] layer3 network

2016-07-07 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Matyas Koszik > Sent: 07 July 2016 14:01 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] layer3 network > > > Setting 'osd addr' in the osd configuration section unfortunately

Re: [ceph-users] (no subject)

2016-07-07 Thread Gaurav Goyal
Hello Mr. Kees, Thanks for your response! My setup is Openstack Node 1 -> controller + network + compute1 (Liberty version) Openstack Node 2 --> compute2 Ceph version Hammer I am using Dell storage with the following status DELL SAN storage is attached to both hosts as [root@OSKVM1 ~]# iscsiadm

Re: [ceph-users] ceph-fuse segfaults ( jewel 10.2.2)

2016-07-07 Thread Patrick Donnelly
On Thu, Jul 7, 2016 at 2:01 AM, Goncalo Borges wrote: > Unfortunately, the other user application breaks ceph-fuse again (It is a > completely different application then in my previous test). > > We have tested it in 4 machines with 4 cores. The user is submitting 16 > single core jobs which are a

Re: [ceph-users] is it time already to move from hammer to jewel?

2016-07-07 Thread Shain Miley
+1 on looking for some thoughts on this. We are in the same boat and looking for some guidance as well. Thanks, Shain On 07/06/2016 01:47 PM, Zoltan Arnold Nagy wrote: Hey, Those out there who are running production clusters: have you upgraded already to Jewel? I usually wait until .2 is ou

[ceph-users] How to check consistency of File / Block Data

2016-07-07 Thread Venkata Manojawa Paritala
Hi, Is there any way we can check/verify data consistency for block and file data in Ceph? I need to develop a script for the same. WRT object data, I am checking consistency with the below method. 1. Create a file and calculate the md5 checksum for it. 2. Push the file to a ceph pool. 3. Get the
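
A minimal sketch of that object-data check with the rados CLI; pool and object names are assumptions:

    md5sum testfile                        # checksum before upload
    rados -p rbd put testobj testfile      # step 2: push to a pool
    rados -p rbd get testobj testfile.out  # step 3: pull it back
    md5sum testfile.out                    # should match the original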

Re: [ceph-users] is it time already to move from hammer to jewel?

2016-07-07 Thread Alexandre DERUMIER
For a new cluster, it is still missing this udev rule https://github.com/ceph/ceph/commit/35004a628b2969d8b2f1c02155bb235165a1d809 but it's not a problem on an existing cluster, as the old udev rules still exist, I think. Anyway, you can copy it manually. I'm running jewel without any problem since
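
A minimal sketch of copying the rule by hand, assuming the filename from the linked commit (verify the path against your Ceph version):

    # 60-ceph-by-parttypeuuid.rules is the file added by the commit above
    wget -O /etc/udev/rules.d/60-ceph-by-parttypeuuid.rules \
        https://raw.githubusercontent.com/ceph/ceph/master/udev/60-ceph-by-parttypeuuid.rules
    udevadm control --reload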

[ceph-users] Monitor question

2016-07-07 Thread Fran Barrera
Hi all, I have an all-in-one (AIO) cluster setup with only one monitor, and now I've created another monitor on another server following this doc http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/ but my problem is that if I stop the AIO monitor, the cluster stops working. It seems like ceph is not up

Re: [ceph-users] Monitor question

2016-07-07 Thread Matyas Koszik
Hi, That error message is normal, it just says your monitor is down (which it is). If you have added the second monitor in your ceph.conf, then it'll try contacting that, and if it's up and reachable, this will succeed, so after that scary error message you should see the normal reply as well. T

Re: [ceph-users] Monitor question

2016-07-07 Thread Joao Eduardo Luis
On 07/07/2016 04:17 PM, Fran Barrera wrote: Hi all, I have a cluster setup AIO with only one monitor and now I've created another monitor in other server following this doc http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/ but my problem is if I stop the AIO monitor, the cluster

Re: [ceph-users] Monitor question

2016-07-07 Thread Fran Barrera
Hello, Yes, I've added two monitors but the error persists. In the error I see only the IP of the first mon; why does the second not appear? I had only one monitor before, and it ran fine because I have an AIO install. Thanks. 2016-07-07 17:22 GMT+02:00 Joao Eduardo Luis : > On 07/07/2016 04:17 PM,

Re: [ceph-users] Monitor question

2016-07-07 Thread Joao Eduardo Luis
On 07/07/2016 04:31 PM, Fran Barrera wrote: Hello, Yes I've added two monitors but the error persist. In the error I see only the IP of the first mon, why not appears the second? The description you offered on the initial email appears to state the following: - You initially had one monitor

Re: [ceph-users] (no subject)

2016-07-07 Thread Fran Barrera
Hello, Have you configured these two parameters in cinder.conf? rbd_user rbd_secret_uuid Regards. 2016-07-07 15:39 GMT+02:00 Gaurav Goyal : > Hello Mr. Kees, > > Thanks for your response! > > My setup is > > Openstack Node 1 -> controller + network + compute1 (Liberty Version) > Openstack node

Re: [ceph-users] Monitor question

2016-07-07 Thread Fran Barrera
Yes, this is the problem. 2016-07-07 17:34 GMT+02:00 Joao Eduardo Luis : > On 07/07/2016 04:31 PM, Fran Barrera wrote: > >> Hello, >> >> Yes I've added two monitors but the error persist. In the error I see >> only the IP of the first mon, why not appears the second? >> > > The description you of

Re: [ceph-users] Monitor question

2016-07-07 Thread Joao Eduardo Luis
On 07/07/2016 04:39 PM, Fran Barrera wrote: Yes, this is the problem. Well, you lose quorum once you stop A. As the docs clearly state, you cannot tolerate failures if you have just two monitors. If your cluster only has two monitors, you cannot form quorum with just one monitor: you need
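
For reference: quorum needs a strict majority of monitors, so with 2 monitors both must be up (a majority of 2 is 2), while with 3 you can lose one (a majority is still 2). A minimal sketch of adding a third monitor with ceph-deploy; the hostname is an assumption:

    ceph-deploy mon add mon3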

Re: [ceph-users] Monitor question

2016-07-07 Thread Fran Barrera
Ok, I understand, so I'll create a new mon to allow me to stop mon.a. Thanks, Fran. 2016-07-07 17:46 GMT+02:00 Joao Eduardo Luis : > On 07/07/2016 04:39 PM, Fran Barrera wrote: > >> Yes, this is the problem. >> > > Well, you lose quorum once you stop A. > > As the docs clearly state, you canno

[ceph-users] repomd.xml: [Errno 14] HTTP Error 404 - Not Found on download.ceph.com for rhel7

2016-07-07 Thread Martin Palma
Hi All, it seems that the "rhel7" folder/symlink on "download.ceph.com/rpm-hammer" does not exist anymore; therefore ceph-deploy fails to deploy a new cluster. I just tested this by setting up a new lab environment. We have the same issue on our production cluster currently, which keeps us from updating
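
One possible workaround, assuming the intended path is the "el7" directory on download.ceph.com (verify it exists before relying on it), is to point the repo there directly:

    # /etc/yum.repos.d/ceph.repo (baseurl shown is an assumption)
    baseurl=http://download.ceph.com/rpm-hammer/el7/$basearch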

[ceph-users] radosgw live upgrade hammer -> jewel

2016-07-07 Thread Luis Periquito
Hi all, I have (some) ceph clusters running hammer and they are serving S3 data. There are a few radosgw serving requests, in a load balanced form (actually OSPF anycast IPs). Usually upgrades go smoothly whereby I upgrade a node at a time, and traffic just gets redirected around the nodes that a

Re: [ceph-users] (no subject)

2016-07-07 Thread Gaurav Goyal
Hi Fran, Here is my cinder.conf file. Please help me analyze it. Do I need to create a volume group as mentioned in this link http://docs.openstack.org/liberty/install-guide-rdo/cinder-storage-install.html [root@OSKVM1 ~]# grep -v "^#" /etc/cinder/cinder.conf|grep -v ^$ [DEFAULT] rpc_backend =

[ceph-users] RBD Watch Notify for snapshots

2016-07-07 Thread Nick Fisk
Hi All, I have an RBD mounted to a machine via the kernel client, and I wish to be able to take a snapshot and mount it on another machine where it can be backed up. The big issue is that I need to make sure that the process writing on the source machine is finished and the FS is sync'd before ta
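
A minimal sketch of quiescing a kernel-mounted RBD around a snapshot; the mountpoint, pool, and image names are assumptions:

    fsfreeze --freeze /mnt/rbd0          # flush the FS and block new writes
    rbd snap create rbd/myimage@backup   # take the snapshot while frozen
    fsfreeze --unfreeze /mnt/rbd0        # resume writes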

Re: [ceph-users] what's the meaning of 'removed_snaps' of `ceph osd pool ls detail`?

2016-07-07 Thread Gregory Farnum
On Thu, Jul 7, 2016 at 1:07 AM, 秀才 wrote: > Hi,All:) > > i have made a cache-tier, > but i do not know message 'removed_snaps > [1~1,3~6,b~6,13~c,21~4,26~1,28~1a,4e~4,53~5,5c~5,63~1,65~4,6b~4]'. > i have not snapped any thing yet. When you take snapshots, it generally creates a lot of tracking da
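
For reference, removed_snaps is an interval set of deleted snapshot IDs in start~length form: '1~1' is one snap ID starting at 1, '3~6' covers IDs 3 through 8, and so on. Self-managed snapshots (as used by RBD) draw IDs from the same pool-wide counter, and a cache pool tracks the snapshot metadata of its base pool, which is the likely reason the list is non-empty here even though no snapshot was taken by hand.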

[ceph-users] Ceph Social Media

2016-07-07 Thread Patrick McGarry
Hey cephers, Just wanted to remind everyone that our Ceph social media channels are for all upstream consumption. If you are doing something cool with Ceph, have a new feature/integration to announce, or just some piece of news that would be of interest to the Ceph community, please send it my way

Re: [ceph-users] multiple journals on SSD

2016-07-07 Thread Zoltan Arnold Nagy
Hi Nick, How large NVMe drives are you running per 12 disks? In my current setup I have 4xP3700 per 36 disks but I feel like I could get by with 2… Just looking for community experience :-) Cheers, Zoltan > On 07 Jul 2016, at 10:45, Nick Fisk wrote: > > Just to add if you really want to go w

[ceph-users] Failing to Activate new OSD ceph-deploy

2016-07-07 Thread Scottix
Hey, This is the first time I have had a problem with ceph-deploy. I have attached the log, but I can't seem to activate the OSD. I am running ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9). I did upgrade from Infernalis -> Jewel. I haven't changed ceph ownership, but I do have the conf

Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2016-07-07 Thread Scottix
I played with it enough to make it work. Basically I created the directory it was going to put the data in: mkdir /var/lib/ceph/osd/ceph-22. Then I ran ceph-deploy activate, which did a little more toward putting it in the cluster, but it still didn't start because of permissions on the jour
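
A minimal sketch of the workaround described; the OSD id, host, and device names are assumptions:

    mkdir /var/lib/ceph/osd/ceph-22
    ceph-deploy osd activate node1:/dev/sdx1
    # Jewel runs OSDs as user "ceph"; fix ownership on data dir and journal
    chown -R ceph:ceph /var/lib/ceph/osd/ceph-22
    chown ceph:ceph /dev/sdx2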

Re: [ceph-users] ceph-fuse segfaults ( jewel 10.2.2)

2016-07-07 Thread Brad Hubbard
Hi Goncalo, If possible it would be great if you could capture a core file for this with full debugging symbols (preferably glibc debuginfo as well). How you do that will depend on the ceph version and your OS, but we can offer help if required, I'm sure. Once you have the core, do the following.
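
A minimal sketch of capturing and inspecting such a core; the binary path and core filename are assumptions:

    ulimit -c unlimited                 # allow core dumps before reproducing
    gdb /usr/bin/ceph-fuse core.12345   # load the core against the binary
    (gdb) set pagination off
    (gdb) thread apply all backtrace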

[ceph-users] ceph/daemon mon not working and status exit (1)

2016-07-07 Thread Rahul Talari
I am trying to use Ceph in Docker. I have built the ceph/base and ceph/daemon Dockerfiles. I am trying to deploy a Ceph monitor according to the instructions given in the tutorial, but when I execute the command without a KV store and type: sudo docker ps, I am not able to keep the monitor up. What mi
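
A common gotcha with the ceph/daemon image is that the mon typically exits immediately unless MON_IP and CEPH_PUBLIC_NETWORK are supplied. A minimal sketch; the addresses are assumptions:

    sudo docker run -d --net=host \
        -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph \
        -e MON_IP=192.0.2.10 \
        -e CEPH_PUBLIC_NETWORK=192.0.2.0/24 \
        ceph/daemon mon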

Re: [ceph-users] multiple journals on SSD

2016-07-07 Thread Christian Balzer
Hello, On Thu, 7 Jul 2016 23:19:35 +0200 Zoltan Arnold Nagy wrote: > Hi Nick, > > How large NVMe drives are you running per 12 disks? > > In my current setup I have 4xP3700 per 36 disks but I feel like I could > get by with 2… Just looking for community experience :-) > This is funny, because

Re: [ceph-users] (no subject)

2016-07-07 Thread Gaurav Goyal
Hi Kees/Fran, Do you see any issue in my cinder.conf file? It says Volume group "cinder-volumes" not found. When should I configure this volume group? I have done the Ceph configuration for nova creation, but I am still facing the same error. */var/log/cinder/volume.log* 2016-07-07 16:20:13.765 136

Re: [ceph-users] RBD - Deletion / Discard - IO Impact

2016-07-07 Thread Christian Balzer
On Thu, 7 Jul 2016 12:53:33 +0100 Nick Fisk wrote: > Hi All, > > > > Does anybody else see a massive (ie 10x) performance impact when either > deleting a RBD or running something like mkfs.xfs against an existing > RBD, which would zero/discard all blocks? > > > > In the case of deleting a

Re: [ceph-users] (no subject)

2016-07-07 Thread Jason Dillaman
These lines from your log output indicate you are configured to use LVM as a Cinder backend. > 2016-07-07 16:20:31.966 32549 INFO cinder.volume.manager [req-f9371a24-bb2b-42fb-ad4e-e2cfc271fe10 - - - - -] Starting volume driver LVMVolumeDriver (3.0.0) > 2016-07-07 16:20:32.067 32549 ERROR cinder.

Re: [ceph-users] (no subject)

2016-07-07 Thread Gaurav Goyal
Thanks for the verification! Yeah, I didn't find an additional [ceph] section in my cinder.conf file. Should I create that manually? As I didn't find a [ceph] section, I modified the same parameters in the [DEFAULT] section. I will change that as per your suggestion. Moreover, checking some other links I go

Re: [ceph-users] RBD Watch Notify for snapshots

2016-07-07 Thread Jason Dillaman
librbd pseudo-automatically handles this by flushing the cache to the snapshot when a new snapshot is created, but I don't think krbd does the same. If it doesn't, it would probably be a nice addition to the block driver to support the general case. Barring that (or if you want to involve something

[ceph-users] 5 pgs of 712 stuck in active+remapped

2016-07-07 Thread Nathanial Byrnes
Hello, I've got a Jewel cluster (3 nodes, 15 OSDs) running with bobtail tunables (my XenServer cluster uses 3.10 as the kernel and there's no upgrading that). I started the cluster out on Hammer, upgraded to Jewel, discovered that optimal tunables would not work, and then set the tuna
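
For reference, a minimal sketch of the commands for inspecting such PGs; the PG id is an assumption:

    ceph pg dump_stuck unclean    # list PGs stuck active+remapped/unclean
    ceph pg 2.33 query            # compare the "up" and "acting" sets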

Re: [ceph-users] (no subject)

2016-07-07 Thread Kees Meijs
Hi Gaurav, The following snippets should suffice (for Cinder, at least): > [DEFAULT] > enabled_backends=rbd > > [rbd] > volume_driver = cinder.volume.drivers.rbd.RBDDriver > rbd_pool = cinder-volumes > rbd_ceph_conf = /etc/ceph/ceph.conf > rbd_flatten_volume_from_snapshot = false > rbd_max_clone_d
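
The snippet above is cut off; per the earlier reply in this thread, the backend section also needs the authentication settings. The values below are assumptions for illustration:

    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337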