Hi Joao,
Thanks for the thorough analysis. My initial concern is that, in some
cases, a network failure will leave a low-rank monitor able to see only
a few siblings (not enough to form a quorum) while some higher-rank
monitor can see more siblings, so I want to try to choose the one that
can see the most.
Anyone have some feedback on this? Happy to log a bug ticket if it is one, but
I want to make sure I'm not missing something related to a Luminous change.
Ashley
Sent from my iPhone
On 4 Jul 2017, at 3:30 PM, Ashley Merrick <ash...@amerrick.co.uk> wrote:
Okie, noticed there is a new command to s
On Thu, Jul 6, 2017 at 1:28 PM, Stanislav Kopp wrote:
> Hi,
>
> 2017-07-05 20:31 GMT+02:00 Ilya Dryomov :
>> On Wed, Jul 5, 2017 at 7:55 PM, Stanislav Kopp wrote:
>>> Hello,
>>>
>>> I have a problem that sometimes I can't unmap an rbd device; I get "sysfs
>>> write failed rbd: unmap failed: (16) Device or resource busy"
Hi!
I changed the partitioning scheme to use a "real" primary partition instead of
a logical volume. Ceph-deploy seems to run fine now, but the OSD does not start.
I see lots of these in the journal:
Jul 06 13:53:42 sh[9768]: 0> 2017-07-06 13:53:42.794027 7fcf9918fb80 -1 ***
Caught signal (Aborted) **
Hi,
If you're using "rbd_aio_write()" in your code, be aware that before the
Luminous release this function expects the buffer to remain unchanged until
the write op ends, while on Luminous and later it internally copies the
buffer, allocating memory where needed and freeing it once the wri
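For illustration, a minimal C sketch of the safe pattern: keep the buffer
alive and untouched until the op completes, which works on both pre- and
post-Luminous librbd (the helper name and the trimmed error handling are
ours, not part of the API):

  #include <stddef.h>
  #include <stdint.h>
  #include <rbd/librbd.h>

  /* Assumes `image` is an already-open rbd_image_t. */
  int write_and_wait(rbd_image_t image, uint64_t off,
                     const char *buf, size_t len)
  {
      rbd_completion_t comp;
      int r = rbd_aio_create_completion(NULL, NULL, &comp);
      if (r < 0)
          return r;

      r = rbd_aio_write(image, off, len, buf, comp);
      if (r < 0) {
          rbd_aio_release(comp);
          return r;
      }

      /* Do NOT modify or free `buf` here: pre-Luminous librbd may still
       * be reading from it while the op is in flight. */
      rbd_aio_wait_for_complete(comp);
      r = (int)rbd_aio_get_return_value(comp);
      rbd_aio_release(comp);
      return r;  /* only now is it safe to reuse `buf` */
  }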
On Thu, Jul 6, 2017 at 2:23 PM, Stanislav Kopp wrote:
> 2017-07-06 14:16 GMT+02:00 Ilya Dryomov :
>> On Thu, Jul 6, 2017 at 1:28 PM, Stanislav Kopp wrote:
>>> Hi,
>>>
>>> 2017-07-05 20:31 GMT+02:00 Ilya Dryomov :
On Wed, Jul 5, 2017 at 7:55 PM, Stanislav Kopp wrote:
> Hello,
>
Pre-Luminous also copies the provided buffer when using the C API --
it just copies it at a later point and not immediately. The eventual
goal is to eliminate the copy completely, but that requires some
additional plumbing work deep down within the librados messenger
layer.
On Thu, Jul 6, 2017 at
On 17-07-06 03:03 PM, Jason Dillaman wrote:
On Thu, Jul 6, 2017 at 8:26 AM, Piotr Dałek wrote:
Hi,
If you're using "rbd_aio_write()" in your code, be aware that before the
Luminous release this function expects the buffer to remain unchanged until
the write op ends, while on Luminous and later
The correct (POSIX-style) program behavior should treat the buffer as
immutable until the IO operation completes. It is never safe to assume
the buffer can be re-used while the IO is in-flight. You should not
add any logic to assume the buffer is safely copied prior to the
completion of the IO.
On
Hi Ceph Users,
We plan to add 20 storage nodes to our existing cluster of 40 nodes; each node
has 36 x 5.458 TiB drives. We plan to add the storage such that all new OSDs
are prepared, activated and ready to take data, but take none until we start
slowly increasing their weightings. We also expect thi
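For what it's worth, a hedged sketch of that approach using CRUSH
reweighting (the config option, OSD id and step values below are
illustrative, not from the original post):

  # In ceph.conf, have new OSDs come up with zero CRUSH weight:
  #   [osd]
  #   osd crush initial weight = 0
  # Then raise each new OSD's weight in small steps, letting backfill
  # settle in between (full weight for a 5.458 TiB drive is ~5.458):
  ceph osd crush reweight osd.1440 0.5
  ceph osd crush reweight osd.1440 1.0   # ...and so on up to 5.458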
Hey folks,
We have a cluster that's currently backfilling from increasing PG counts. We
have tuned recovery and backfill way down as a "precaution" and would like to
start tuning it to strike a good balance between that and client I/O.
At the moment we're in the process of bumping up PG nu
On 17-07-06 03:43 PM, Jason Dillaman wrote:
I've learned the hard way that pre-Luminous, even if it copies the buffer,
it does so too late. In my specific case, my FUSE module enters the write
call and issues rbd_aio_write there, then exits the write, expecting
the buffer provided by FUSE to
On Thu, 6 Jul 2017, Z Will wrote:
> Hi Joao,
>
> Thanks for the thorough analysis. My initial concern is that, in some
> cases, a network failure will leave a low-rank monitor able to see only
> a few siblings (not enough to form a quorum) while some higher-rank
> monitor can see more siblings, so I want
On Thu, Jul 6, 2017 at 10:22 AM, Piotr Dałek wrote:
> So I really see two problems here: lack of API docs and
> backwards-incompatible change in API behavior.
Docs are always in need of update, so any pull requests would be
greatly appreciated.
However, I disagree that the behavior has substanti
Just a quick place to start is osd_max_backfills. You have this set to 1.
Each PG is on 11 OSDs. When a PG is moving, it is on the original 11
OSDs and the X new OSDs that it is going to. For each of your
PGs that is moving, an OSD can only move one at a time (your
osd_max_backfill
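A small sketch of checking and nudging that setting at runtime (the value 2
is illustrative; raise it gradually and watch client I/O):

  # Current value on one OSD (run on that OSD's host):
  ceph daemon osd.0 config get osd_max_backfills
  # Raise it cluster-wide without restarting OSDs:
  ceph tell osd.* injectargs '--osd-max-backfills 2'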
On 17-07-06 04:40 PM, Jason Dillaman wrote:
On Thu, Jul 6, 2017 at 10:22 AM, Piotr Dałek wrote:
So I really see two problems here: lack of API docs and
backwards-incompatible change in API behavior.
Docs are always in need of update, so any pull requests would be
greatly appreciated.
However
On Thu, Jul 6, 2017 at 7:04 AM wrote:
> Hi Ceph Users,
>
> We plan to add 20 storage nodes to our existing cluster of 40 nodes; each
> node has 36 x 5.458 TiB drives. We plan to add the storage such that all
> new OSDs are prepared, activated and ready to take data, but take none
> until we start sl
On Tue, Jul 4, 2017 at 10:47 PM Eino Tuominen wrote:
> Hello,
>
> I noticed the same behaviour in our cluster.
>
> ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)
>
>      cluster 0a9f2d69-5905-4369-81ae-e36e4a791831
>       health HEALTH_WARN
>              1 pgs backfil
WOW!
Thanks to everybody!
Tons of suggestions and good tips!
At the moment we are already using 100Gb/s cards and have already adopted a
100Gb/s switch, so we can go with 40Gb/s cards that are fully compatible
with our switch.
About the CPU I was wrong: the model we are looking at is not the 2603 but
the 2630
> On 6 July 2017 at 18:27, Massimiliano Cuttini wrote:
>
> WOW!
>
> Thanks to everybody!
> Tons of suggestions and good tips!
>
> At the moment we are already using 100Gb/s cards and have already adopted
> a 100Gb/s switch, so we can go with 40Gb/s cards that are fully
> compatible with our SW
Hi Wido,
I came across this ancient ML entry with no responses and wanted to follow up
with you to see if you recalled any solution to this.
Copying the ceph-users list to preserve any resulting replies for the
archives.
I have a couple of boxes with 10x Micron 5100 SATA SSDs, journaled on M
Hi all,
Are there any plans to support the rbd journaling feature in krbd?
Cheers /Maged
On Thu, Jul 6, 2017 at 9:18 AM, Gregory Farnum wrote:
> On Thu, Jul 6, 2017 at 7:04 AM wrote:
>
>> Hi Ceph Users,
>>
>> We plan to add 20 storage nodes to our existing cluster of 40 nodes; each
>> node has 36 x 5.458 TiB drives. We plan to add the storage such that all
>> new OSDs are prep
Hello,
We have a bucket with 60 million+ objects in it, and we are trying to delete
it. To do so, we have tried doing:
radosgw-admin bucket list --bucket=
and then cycling through the list of object names, deleting them 1,000 at a
time. However, after ~3-4k objects are deleted, the list call
Thanks for your response David.
What you've described matches what I've been thinking about too. We have 1401
OSDs in the cluster currently, and this output is from the tail end of the
backfill for a +64 PG increase on the biggest pool.
The problem is we see this cluster do at most 20 backfills a
On Thu, Jul 6, 2017 at 11:46 AM, Piotr Dałek wrote:
> How about a hybrid solution? Keep the old rbd_aio_write contract (don't copy
> the buffer, on the assumption that it won't change) and, instead of
> constructing a bufferlist containing a bufferptr to copied data, construct a
> bufferlist containin
ceph pg dump | grep backfill
Look through the output of that command and note the acting set (OSDs the PG
is on / moving off of) and where the PG will end up. All it takes is a single
OSD listed on a PG that is currently backfilling, and any other PGs it's
listed on will be backfill+wait and have
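A slightly narrower variant of the same check, if useful (pgs_brief trims
the output to PG id, state and up/acting sets):

  ceph pg dump pgs_brief | grep backfill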
Hey,
I have some SAS Micron S630DC-400 drives which came with firmware M013, which
did the same or worse (takes very long... 100% blocked for about 5 min for
16 GB trimmed), and they work just fine with firmware M017 (4 s for 32 GB
trimmed). So maybe you just need an update.
Peter
On 07/06/17 18:39, Reed Dier
Here's my possibly unique method... I had 3 nodes with 12 disks each, and
when adding 2 more nodes I had issues with the common method you describe
(it totally blocked clients for minutes), but this worked great for me:
> my own method
> - osd max backfills = 1 and osd recovery max active = 1
> - cr
Hi Greg,
At the moment our cluster is all in balance. We have one failed drive
that will be replaced in a few days (the OSD has been removed from ceph
and will be re-added with the replacement drive). I'll document the
state of the PGs before the addition of the drive and during the
recover
I recommend you file a tracker issue at http://tracker.ceph.com/ with all
details (ceph version, steps you ran, and output, hiding anything you
don't want to post). I doubt it's a ceph-deploy issue,
but we can try to replicate it in our lab.
On Thu, Jul 6, 2017 at 5:25 AM, Martin Emrich
wrote:
>
On Thu, Jul 6, 2017 at 3:25 PM, Piotr Dałek wrote:
> Is that deep copy equivalent to what
> Jewel librbd did at an unspecified point in time, or an extra one?
It's an equivalent / a replacement -- not an additional copy. This was
changed to support the scatter/gather IO API methods which the latest
version o
There are no immediate plans to support RBD journaling in krbd.
The journaling feature requires a lot of code and, with limited
resources, the priority has been to provide alternative block device
options that pass through to librbd for such use-cases and to optimize
the performance of librbd /
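One such pass-through option that already exists is rbd-nbd, which maps an
image through librbd and the kernel nbd driver, so librbd-only features like
journaling keep working; a minimal sketch (pool/image names illustrative):

  rbd-nbd map rbd/myimage      # prints the device, e.g. /dev/nbd0
  rbd-nbd unmap /dev/nbd0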
Hi,
We are running a Ceph cluster serving both batch workloads (e.g. data import
/ export, offline processing) and latency-sensitive workloads. Currently
batch traffic causes a huge slowdown in serving latency-sensitive requests
(e.g. streaming). When that happens, the network is not the bottleneck (50
I could easily see that being the case, especially with Micron as a common
thread, but it appears that I am on the latest FW for both the SATA and the
NVMe:
> $ sudo ./msecli -L | egrep 'Device|FW'
> Device Name : /dev/sda
> FW-Rev : D0MU027
> Device Name : /dev/s
Hello,
On Thu, 6 Jul 2017 17:57:06 + george.vasilaka...@stfc.ac.uk wrote:
> Thanks for your response David.
>
> What you've described matches what I've been thinking about too. We have
> 1401 OSDs in the cluster currently, and this output is from the tail end
> of the backfill for a +64 PG
How did you even get 60M objects into the bucket...?! The stuck requests
are only likely to be impacting the PG in which the bucket index is stored.
Hopefully you are not running other pools on those OSDs?
You'll need to upgrade to Jewel to gain the --bypass-gc radosgw-admin
flag, which speeds up
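For reference, with that flag the whole-bucket delete becomes a single
command (bucket name illustrative):

  radosgw-admin bucket rm --bucket=mybucket --purge-objects --bypass-gc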
Hello,
On Thu, 6 Jul 2017 14:34:41 -0700 Su, Zhan wrote:
> Hi,
>
> We are running a Ceph cluster serving both batch workloads (e.g. data import
> / export, offline processing) and latency-sensitive workloads. Currently
> batch traffic causes a huge slowdown in serving latency-sensitive requests
Hi all,
Are there any "official" plans to have Ceph events co-hosted with OpenStack
Summit Sydney, like in Boston?
The call for presentations closes in a week. The Forum will be organised
throughout September and (I think) that is the most likely place to have
e.g. Ceph ops sessions like we have
Oops, this time plain text...
On 7 July 2017 at 13:47, Blair Bethwaite wrote:
>
> Hi all,
>
> Are there any "official" plans to have Ceph events co-hosted with OpenStack
> Summit Sydney, like in Boston?
>
> The call for presentations closes in a week. The Forum will be organised
> throughout Se
On 17-07-06 09:39 PM, Jason Dillaman wrote:
On Thu, Jul 6, 2017 at 3:25 PM, Piotr Dałek wrote:
Is that deep copy equivalent to what
Jewel librbd did at an unspecified point in time, or an extra one?
It's an equivalent / a replacement -- not an additional copy. This was
changed to support scatter/gath
After looking into this further, it seems none of the
ceph osd set-{full,nearfull,backfillfull}-ratio
commands are taking any effect on the cluster, including the backfillfull
ratio. These commands look to have been added/changed since Jewel as a
different way of setting the above. H
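For reference, the Luminous-era usage is sketched below (values are
illustrative); ceph osd dump should reflect the new ratios if they took
effect:

  ceph osd set-nearfull-ratio 0.85
  ceph osd set-backfillfull-ratio 0.90
  ceph osd set-full-ratio 0.95
  ceph osd dump | grep ratio    # shows full/backfillfull/nearfull ratios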