[ceph-users] LibRBD_Show Real Size of RBD Image

2016-11-29 Thread Sam Huracan
Hi all, I'm trying to use librbd (Python): http://docs.ceph.com/docs/jewel/rbd/librbdpy/ Is there a way to find the real size of an RBD image through librbd? I saw I can get it from the command line: http://ceph.com/planet/real-size-of-a-ceph-rbd-image/ Thanks
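For reference, the approach in the linked post boils down to summing the extents reported by "rbd diff"; roughly (pool/image name is a placeholder):

  rbd diff rbd/myimage | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'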

[ceph-users] Is there a setting on Ceph that we can use to fix the minimum read size?

2016-11-29 Thread Thomas Bennett
Hi, We have a use case where we are reading 128MB objects off spinning disks. We've benchmarked a number of different hard drives and have noticed that for a particular hard drive, we're experiencing slow reads by comparison. This occurs when we have multiple readers (even just 2) reading objects
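One knob that is sometimes tuned for large sequential object reads, independent of any Ceph setting, is the kernel read-ahead on the OSD data disks; a sketch (device name is a placeholder):

  cat /sys/block/sdb/queue/read_ahead_kb           # current read-ahead in KB
  echo 4096 > /sys/block/sdb/queue/read_ahead_kb   # raise it (needs root, not persistent across reboots)
  blockdev --getra /dev/sdb                        # same setting, reported in 512-byte sectors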

Re: [ceph-users] Is there a setting on Ceph that we can use to fix the minimum read size?

2016-11-29 Thread Kate Ward
What filesystem do you use on the OSD? Have you considered a different filesystem that is better at combining requests before they get to the drive? k8 On Tue, Nov 29, 2016 at 9:52 AM Thomas Bennett wrote: > Hi, > > We have a use case where we are reading 128MB objects off spinning disks. > > W

Re: [ceph-users] Is there a setting on Ceph that we can use to fix the minimum read size?

2016-11-29 Thread Thomas Bennett
Hi Kate, Thanks for your reply. We currently use xfs as created by ceph-deploy. What would you recommend we try? Kind regards, Tom On Tue, Nov 29, 2016 at 11:14 AM, Kate Ward wrote: > What filesystem do you use on the OSD? Have you considered a different > filesystem that is better at combin

[ceph-users] pgs unfound

2016-11-29 Thread Xabier Elkano
Hi all, my cluster is in WARN state because apparently there are some pgs unfound. I think I reached this situation because of the metadata pool: it was in the default root but unused, because I don't use cephfs, I only use rbd for VMs. I don't have OSDs in the default root, they are
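The usual first steps for unfound objects look roughly like this (the PG id 2.4 is a placeholder):

  ceph health detail          # which PGs report unfound objects
  ceph pg 2.4 query           # why recovery is blocked, which OSDs were probed
  ceph pg 2.4 list_missing    # the unfound objects themselves
  # only as a last resort, once the data is known to be unrecoverable:
  # ceph pg 2.4 mark_unfound_lost revert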

Re: [ceph-users] High ops/s with kRBD and "--object-size 32M"

2016-11-29 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex Gorbachev Sent: 29 November 2016 04:24 To: Francois Blondel ; Ilya Dryomov Cc: ceph-us...@ceph.com Subject: Re: [ceph-users] High ops/s with kRBD and "--object-size 32M" On Mon, Nov 28, 2016 at 2:59 PM I

Re: [ceph-users] - cluster stuck and undersized if at least one osd is down

2016-11-29 Thread Piotr Dzionek
Hi, You are right, I missed that there is a default timeout before a down osd is marked out: "mon osd down out interval": 300, and I didn't wait long enough before starting it again. Kind regards, Piotr Dzionek On 28.11.2016 at 16:12, David Turner wrote: In the cluster you
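The timer in question can be inspected and, if needed, adjusted at runtime; a sketch (the mon id and the value 600 are placeholders, and a runtime change should also be persisted in ceph.conf under [mon]):

  ceph daemon mon.<id> config get mon_osd_down_out_interval
  ceph tell mon.* injectargs '--mon-osd-down-out-interval 600'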

[ceph-users] renaming ceph server names

2016-11-29 Thread Andrei Mikhailovsky
Hello. As part of an infrastructure change we are planning to rename the servers running the ceph-osd, ceph-mon and radosgw services. The IP addresses will stay the same; it's only the server names that will need to change. I would like to find out the steps required to perform these changes. W
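If the OSD hosts appear as host buckets in the CRUSH map, one piece of the rename can be done like this, assuming your release supports the subcommand (old/new names are placeholders); monitors are more involved and are usually removed and re-added under the new name:

  ceph osd crush rename-bucket oldhostname newhostname
  ceph osd tree        # verify the bucket now carries the new name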

Re: [ceph-users] Production System Evaluation / Problems

2016-11-29 Thread ulembke
On 2016-11-28 10:29, Strankowski, Florian wrote: Hey guys, ... I simply can't get osd.0 back up. I took it offline, marked it out, reinserted it, set it up again, deleted the osd configs and remade them, with no success whatsoever. IMHO the documentation on this part is a bit "lousy" so I'm missing some points of informa
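For reference, the usual manual sequence for removing a dead OSD before recreating it is roughly (hostname and device below are placeholders):

  ceph osd out 0
  systemctl stop ceph-osd@0
  ceph osd crush remove osd.0
  ceph auth del osd.0
  ceph osd rm 0
  # then recreate it, e.g. from the admin node:
  ceph-deploy osd create osdnode1:/dev/sdb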

Re: [ceph-users] - cluster stuck and undersized if at least one osd is down

2016-11-29 Thread Piotr Dzionek
Hi, As far as I understand, if I set pool size 2 there is a chance of losing data when another osd dies while a rebuild is ongoing. However, it has to occur on a different host, because my crushmap forbids storing replicas on the same physical node. I am not sure what would change if I
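For reference, the replication settings under discussion can be checked and changed per pool; a sketch (pool name is a placeholder):

  ceph osd pool get rbd size
  ceph osd pool get rbd min_size
  ceph osd pool set rbd size 3       # a third copy tolerates a second failure during rebuild
  ceph osd pool set rbd min_size 2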

[ceph-users] Regarding loss of heartbeats

2016-11-29 Thread Trygve Vea
Since Jewel, we've seen quite a bit of funky behaviour in Ceph. I've written about it a few times to the mailing list: higher CPU utilization after the upgrade, loss of heartbeats. We've looked at our network setup, and we've optimized some potential bottlenecks in some places. Interesting thin

Re: [ceph-users] Regarding loss of heartbeats

2016-11-29 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Trygve Vea > Sent: 29 November 2016 14:07 > To: ceph-users > Subject: [ceph-users] Regarding loss of heartbeats > > Since Jewel, we've seen quite a bit of funky behaviour in Ceph. I've wr

Re: [ceph-users] Regarding loss of heartbeats

2016-11-29 Thread Trygve Vea
- On 29 Nov 2016 at 15:20, Nick Fisk n...@fisk.me.uk wrote: >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Trygve >> Vea >> Sent: 29 November 2016 14:07 >> To: ceph-users >> Subject: [ceph-users] Regarding loss of heartbeats >> >> Si

Re: [ceph-users] Regarding loss of heartbeats

2016-11-29 Thread Nick Fisk
> -Original Message- > From: Trygve Vea [mailto:trygve@redpill-linpro.com] > Sent: 29 November 2016 14:36 > To: n...@fisk.me.uk > Cc: ceph-users > Subject: Re: Regarding loss of heartbeats > > - Den 29.nov.2016 15:20 skrev Nick Fisk n...@fisk.me.uk: > >> -Original Message---

Re: [ceph-users] export-diff behavior if an initial snapshot is NOT specified

2016-11-29 Thread Jason Dillaman
You are correct that there is an issue with the Hammer version when calculating diffs. If the clone has an object that obscures an extent within the parent image but didn't exist within the first snapshot of the clone, the diff results from the parent image won't be included in the result. I'll ope
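A sketch of the scenario under discussion, a clone exported without an initial snapshot (image and snapshot names are placeholders):

  rbd snap create rbd/parent@base
  rbd snap protect rbd/parent@base
  rbd clone rbd/parent@base rbd/child
  rbd snap create rbd/child@snap1
  # with no --from-snap the diff is taken from image creation up to snap1
  rbd export-diff rbd/child@snap1 child.diff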

Re: [ceph-users] LibRBD_Show Real Size of RBD Image

2016-11-29 Thread Jason Dillaman
The rbd CLI has a built-in disk usage command with the Jewel release that no longer requires the awk example. If you wanted to implement something similar using the Python API, you would need to use the "diff_iterate" API method to locate all used extents within an object and add them up to calcula
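The CLI command referred to here is "rbd du"; a sketch (pool/image names are placeholders), with the Python route going through Image.diff_iterate() as described above:

  rbd du rbd/myimage      # PROVISIONED vs USED for one image
  rbd du -p rbd           # or for every image in the pool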

[ceph-users] Keep previous versions of ceph in the APT repository

2016-11-29 Thread Francois Lafont
Hi @all, Ceph team, could it be possible to keep the previous versions of the ceph* packages in the APT repository? Indeed, for instance for Ubuntu Trusty, currently we have: ~$ curl -s http://download.ceph.com/debian-jewel/dists/trusty/main/binary-amd64/Packages | grep -A 1 '^Package: ceph$'
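While an older version is still listed in the repository index, it can be selected explicitly; a sketch (the version string is a placeholder):

  apt-cache madison ceph
  sudo apt-get install ceph=10.2.3-1trusty ceph-common=10.2.3-1trusty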

[ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
Hello, I would like to install OS updates on the ceph cluster and activate a second 10Gb port on the OSD nodes, so I wanted to verify the correct steps for performing maintenance on the cluster. We are only using rbd to back our XenServer VMs at this point, and our cluster consists of 3 OSD nodes, 3

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread David Turner
Everything is correct except for shutting down the VMs. There is no need for downtime during this upgrade. As long as your cluster comes back to health_ok (or just showing that the noout flag is set and nothing else), then you are free to move on to the next node.
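The per-node maintenance loop described here is roughly:

  ceph osd set noout
  # patch and reboot one OSD node, wait for its OSDs to rejoin ...
  ceph -s                  # proceed once health is OK apart from the noout flag
  ceph osd unset noout     # after the last node is done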

[ceph-users] New to ceph - error running create-initial

2016-11-29 Thread Oleg Kolosov
Hi I've recently started working with ceph for a university project I have. I'm working on Amazon EC2 servers. I've used 4 instances: one is admin/mon + 3 OSDs. Right from the start I've encountered a problem. When running the following command: ceph-deploy --username ubuntu mon create-initial I'

Re: [ceph-users] New to ceph - error running create-initial

2016-11-29 Thread Vasu Kulkarni
If you are using a 'master' build there is an issue. Workarounds: 1) before mon create-initial, run 'ceph-deploy admin mon-node' to push the admin key to the mon nodes and then rerun mon create-initial; 2) or use the jewel build, which is stable, if you don't need latest master: ceph-deploy install --s
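The two workarounds, spelled out (node names are placeholders; the release selector shown is the current ceph-deploy spelling):

  # 1) push the admin key first, then retry
  ceph-deploy admin mon-node
  ceph-deploy --username ubuntu mon create-initial
  # 2) or install the stable release instead of a master build
  ceph-deploy install --release jewel mon-node osd1 osd2 osd3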

[ceph-users] Build version question

2016-11-29 Thread McFarland, Bruce
Using the ceph version string, for example ceph version 10.2.2-118-g894a5f8 (894a5f8d878d4b267f80b90a4bffce157f2b4ba7), how would I determine the versions of the various dependencies used in this build? For instance tcmalloc? Thanks, Bruce

Re: [ceph-users] Is there a setting on Ceph that we can use to fix the minimum read size?

2016-11-29 Thread Kate Ward
I have no experience with XFS, but wouldn't expect poor behaviour with it. I use ZFS myself and know that it would combine writes, but btrfs might be an option. Do you know what block size was used to create the XFS filesystem? It looks like 4k is the default (reasonable) with a max of 64k. Perhap
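The block size of an existing filestore can be read back from the mounted filesystem; a sketch (mount point/device are placeholders). Note that on Linux a filesystem can only be mounted if its block size is no larger than the page size (4 KiB on most x86_64 systems), even though mkfs.xfs accepts up to 64 KiB:

  xfs_info /var/lib/ceph/osd/ceph-0 | grep bsize
  # block size is fixed at mkfs time, e.g.: mkfs.xfs -b size=4096 /dev/sdc1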

Re: [ceph-users] Is there a setting on Ceph that we can use to fix the minimum read size?

2016-11-29 Thread Steve Taylor
We configured XFS on our OSDs to use 1M blocks (our use case is RBDs with 1M blocks) due to massive fragmentation in our filestores a while back. We were having to defrag all the time and cluster performance was noticeably degraded. We also create and delete lots of RBD snapshots on a daily basi

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
OK, I am in some trouble now and would love some help! After updating, none of the OSDs on the node will come back up: ● ceph-disk@dev-sdb1.service loaded failed failed Ceph disk activation: /dev/sdb1 ● ceph-disk@dev-sdb2.service loaded failed failed Ceph disk activation: /dev/sdb2 ● ceph-d

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
I was able to bring the OSDs up by looking at my other OSD node, which has the exact same hardware/disks, and working out which disks map where. But I still can't bring up any of the ceph-disk@dev-sd* services... When I first installed the cluster and got the OSDs up, I had to run the following: #

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread John Petrini
What command are you using to start your OSD's?

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread John Petrini
Also, don't run sgdisk again; that's just for creating the journal partitions. ceph-disk is a service used for prepping disks; only the OSD services need to be running as far as I know. Are the ceph-osd@x services running now that you've mounted the disks?
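A sketch of checking the daemons themselves (the OSD id is a placeholder):

  systemctl list-units 'ceph-osd@*'
  systemctl start ceph-osd@0
  ceph osd tree            # confirm the OSDs report up/in again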

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
Hi John, Thanks I wasn't sure if something happened to the journal partitions or not. Right now, the ceph-osd.0-9 services are back up and the cluster health is good, but none of the ceph-disk@dev-sd* services are running. How can I get the Journal partitions mounted again? Cheers, Mike On Tu

Re: [ceph-users] Build version question

2016-11-29 Thread Brad Hubbard
On Wed, Nov 30, 2016 at 6:20 AM, McFarland, Bruce wrote: > Using the ceph version string, for example ceph version > 10.2.2-118-g894a5f8 (894a5f8d878d4b267f80b90a4bffce157f2b4ba7), how would > I determine the versions of the various dependancies used in this build? > For instance tcmalloc? tcm
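Two quick ways to see which tcmalloc a given build pulls in; a sketch for an RPM-based install (use the dpkg/apt equivalents on Debian):

  ldd /usr/bin/ceph-osd | grep -i tcmalloc       # what the installed binary links against
  rpm -q --requires ceph-osd | grep -i tcmalloc  # package-level dependency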

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
So it looks like the journal partition is mounted: ls -lah /var/lib/ceph/osd/ceph-0/journal lrwxrwxrwx. 1 ceph ceph 9 Oct 10 16:11 /var/lib/ceph/osd/ceph-0/journal -> /dev/sdb1 Here is the output of journalctl -xe when I try to start the ceph-disk@dev-sdb1 service: sh[17481]: mount_activate: Fai

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
I forgot to add: On Tue, Nov 29, 2016 at 6:28 PM, Mike Jacobacci wrote: > So it looks like the journal partition is mounted: > > ls -lah /var/lib/ceph/osd/ceph-0/journal > lrwxrwxrwx. 1 ceph ceph 9 Oct 10 16:11 /var/lib/ceph/osd/ceph-0/journal > -> /dev/sdb1 > > Here is the output of journalctl

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
Sorry about that... Here is the output of ceph-disk list: ceph-disk list /dev/dm-0 other, xfs, mounted on / /dev/dm-1 swap, swap /dev/dm-2 other, xfs, mounted on /home /dev/sda : /dev/sda2 other, LVM2_member /dev/sda1 other, xfs, mounted on /boot /dev/sdb : /dev/sdb1 ceph journal /dev/sdb2 cep

[ceph-users] undefined symbol: rados_nobjects_list_next

2016-11-29 Thread
Hi, the error information is: [root@localhost pybind]# ceph Traceback (most recent call last): File "/usr/local/bin/ceph", line 118, in <module> import rados ImportError: /usr/lib64/python2.7/rados.so: undefined symbol: rados_nobjects_list_next system: centos7 ceph: ceph-10.2.2.tar.gz I
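An undefined-symbol error like this usually means the Python binding is resolving an older librados than it was built against; a sketch of checking (the librados path is an assumption, adjust to your system):

  ldd /usr/lib64/python2.7/rados.so | grep librados
  nm -D /usr/lib64/librados.so.2 | grep rados_nobjects_list_next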

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
Found some more info, but it's getting weird... All three OSD nodes show the same unknown-cluster message on all the OSD disks. I don't know where it came from; all the nodes were configured using ceph-deploy on the admin node. In any case, the OSDs seem to be up and running and the health is ok. no c

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Vasu Kulkarni
You can ignore that, it's a known issue: http://tracker.ceph.com/issues/15990 Regardless, what version of ceph are you running and what are the details of the OS version you updated to? On Tue, Nov 29, 2016 at 7:12 PM, Mike Jacobacci wrote: > Found some more info, but getting weird... All three OSD n

Re: [ceph-users] Ceph Maintenance

2016-11-29 Thread Mike Jacobacci
Hi Vasu, Thank you that is good to know! I am running ceph version 10.2.3 and CentOS 7.2.1511 (Core) minimal. Cheers, Mike On Tue, Nov 29, 2016 at 7:26 PM, Vasu Kulkarni wrote: > you can ignore that, its a known issue http://tracker.ceph.com/ > issues/15990 > > regardless waht version of ceph

Re: [ceph-users] - cluster stuck and undersized if at least one osd is down

2016-11-29 Thread Brad Hubbard
On Tue, Nov 29, 2016 at 11:37 PM, Piotr Dzionek wrote: > Hi, > > As far as I understand if I set pool size 2, there is a chance to loose data > when another osd dies while there is rebuild ongoing. However, it has to > occur on the different host, because my crushmap forbids to store replicas >

Re: [ceph-users] - cluster stuck and undersized if at least one osd is down

2016-11-29 Thread Christian Balzer
Hello, On Wed, 30 Nov 2016 13:39:50 +1000 Brad Hubbard wrote: > > > On Tue, Nov 29, 2016 at 11:37 PM, Piotr Dzionek > wrote: > > Hi, > > > > As far as I understand if I set pool size 2, there is a chance to loose data > > when another osd dies while there is rebuild ongoing. However, it has

Re: [ceph-users] - cluster stuck and undersized if at least one osd is down

2016-11-29 Thread Brad Hubbard
On Wed, Nov 30, 2016 at 1:54 PM, Christian Balzer wrote: > > Hello, > > On Wed, 30 Nov 2016 13:39:50 +1000 Brad Hubbard wrote: > >> >> >> On Tue, Nov 29, 2016 at 11:37 PM, Piotr Dzionek >> wrote: >> > Hi, >> > >> > As far as I understand if I set pool size 2, there is a chance to loose >> > d

Re: [ceph-users] export-diff behavior if an initial snapshot is NOT specified

2016-11-29 Thread Zhongyan Gu
Jason, I tested Jewel and confirmed it has no such issue. Could you tell me the specific pull request that can be backported to hammer to fix this issue? Zhongyan On Tue, Nov 29, 2016 at 10:52 PM, Jason Dillaman wrote: > You are correct that there is an issue with the Hammer version when > ca

Re: [ceph-users] export-diff behavior if an initial snapshot is NOT specified

2016-11-29 Thread Venky Shankar
On 16-11-30 12:07:33, Zhongyan Gu wrote: > Jason, > I test Jewel and confirmed Jewel has no such issue. > Could you tell me what is the specific pull that can be backported to > hammer to fix this issue?? PR: https://github.com/ceph/ceph/pull/12218 > > Zhongyan > > On Tue, Nov 29, 2016 at 10:52

Re: [ceph-users] export-diff behavior if an initial snapshot is NOT specified

2016-11-29 Thread Zhongyan Gu
Thanks for the quick confirmation and fix. I reviewed the pull request. It seems this fix is going into both Jewel and Hammer. So does this mean the issue also exists in Jewel? Why could I not reproduce it in my Jewel test? Zhongyan On Wed, Nov 30, 2016 at 12:32 PM, Venky Shankar wrote: > On 16-11

[ceph-users] Mount of CephFS hangs

2016-11-29 Thread Jens Offenbach
Hi, I am confronted with a persistent problem when mounting CephFS. I am using Ubuntu 16.04 and solely ceph-fuse. The CephFS gets mounted by multiple machines and very often (not always, but in most cases) the mount process hangs and does not continue. "df -h" also hangs and nothing happe
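Some first checks when a ceph-fuse mount hangs; a sketch (the MDS name is a placeholder, and the last command runs on the MDS host):

  ceph -s           # overall health, including MDS state
  ceph mds stat     # is an MDS active, or stuck in replay/rejoin?
  ceph daemon mds.<name> session ls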