[ceph-users] how to fix X is an unexpected clone

2017-08-07 Thread Stefan Priebe - Profihost AG
Hello, how can I fix this one: 2017-08-08 08:42:52.265321 osd.20 [ERR] repair 3.61a 3:58654d3d:::rbd_data.106dd406b8b4567.018c:9d455 is an unexpected clone 2017-08-08 08:43:04.914640 mon.0 [INF] HEALTH_ERR; 1 pgs inconsistent; 1 pgs repair; 1 scrub errors 2017-08-08 08:43:33.470246 os

Re: [ceph-users] How to reencode an object with ceph-dencoder

2017-08-07 Thread Stefan Priebe - Profihost AG
Hello, OK, I missed the reversed byte order of the little-endian value. Now it worked. Greets, Stefan On 08.08.2017 at 08:22, Stefan Priebe - Profihost AG wrote: > Hello, > > I want to modify an object_info_t xattr value. I grepped ceph._ and > ceph._@1 and decoded the object to json: > ceph-dencode
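The byte-order slip mentioned above can be illustrated with a minimal Python sketch. It assumes a 32-bit unsigned field and an arbitrary example value; neither is taken from the real object_info_t encoding, and the helper names are hypothetical.

```python
import struct


def le_bytes_hex(value: int) -> str:
    """Return the little-endian 4-byte encoding of `value` as a hex string."""
    # "<I" means: little-endian, unsigned 32-bit integer.
    return struct.pack("<I", value).hex()


def from_le_hex(hex_bytes: str) -> int:
    """Decode a 4-byte little-endian hex string back into an integer."""
    return struct.unpack("<I", bytes.fromhex(hex_bytes))[0]


if __name__ == "__main__":
    example = 0x12345678  # arbitrary illustrative value
    # The least significant byte (0x78) comes first on disk.
    print(le_bytes_hex(example))
```

When hand-editing a binary dump, the bytes of a multi-byte integer therefore have to be written in the reverse of the order in which the hex number is normally read.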

[ceph-users] How to reencode an object with ceph-dencoder

2017-08-07 Thread Stefan Priebe - Profihost AG
Hello, I want to modify an object_info_t xattr value. I grepped ceph._ and ceph._@1 and decoded the object to json: ceph-dencoder type object_info_t import /tmp/a decode dump_json After modifying the json, how can I encode the json to binary? I also found this old post https://www.spinics.net/lis
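The decode step quoted above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, it assumes ceph-dencoder is installed and on PATH, and it assumes the dump_json output parses as JSON. It covers decoding only, not the re-encoding asked about in the message.

```python
import json
import subprocess


def dencoder_argv(type_name: str, path: str) -> list:
    """Build the argument list for the decode command quoted in the message."""
    return ["ceph-dencoder", "type", type_name,
            "import", path, "decode", "dump_json"]


def decode_to_json(type_name: str, path: str):
    """Run ceph-dencoder and parse its output (requires ceph-dencoder)."""
    result = subprocess.run(dencoder_argv(type_name, path),
                            check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Only prints the command; call decode_to_json() on a host with Ceph tools.
    print(" ".join(dencoder_argv("object_info_t", "/tmp/a")))
```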

Re: [ceph-users] **** SPAM **** jewel - recovery keeps stalling (continues after restarting OSDs)

2017-08-07 Thread Nikola Ciprich
Hi, I tried balancing number of OSDs per node, set their weights the same, increased op recovery priority, but it still takes ages to recover.. I've got my cluster OK now, so I'll try switching to kraken to see if it behaves better.. nik On Mon, Aug 07, 2017 at 11:36:10PM +0800, cgxu wrote: >

[ceph-users] ceph cluster experiencing major performance issues

2017-08-07 Thread Mclean, Patrick
High CPU utilization and inexplicably slow I/O requests. We have been having similar performance issues across several ceph clusters. When all the OSDs are up in the cluster, it can stay HEALTH_OK for a while, but eventually performance worsens and becomes (at first intermittently, but eventually c

[ceph-users] implications of losing the MDS map

2017-08-07 Thread Daniel K
I finally figured out how to get the ceph-monstore-tool (compiled from source) and am ready to attempt to recover my cluster. I have one question -- in the instructions, http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/ under Recovery from OSDs, Known limitations: ->

Re: [ceph-users] hammer(0.94.5) librbd dead lock, i want to how to resolve

2017-08-07 Thread Jason Dillaman
I am not sure what you mean by "I stop ceph" (stopped all the OSDs?) -- and I am not sure how you are seeing ETIMEDOUT errors on a "rbd_write" call since it should just block assuming you are referring to stopping the OSDs. What is your use-case? Are you developing your own application on top of li

Re: [ceph-users] FAILED assert(last_e.version.version < e.version.version) - Or: how to use ceph-kvstore-tool?

2017-08-07 Thread Ricardo J. Barberis
Sorry, forgot to mention it, it's Hammer 0.94.10. But I already marked the OSDs as lost, after rebalancing finished. I saw a bug report at http://tracker.ceph.com/issues/14471 I can post some debug logs there but I don't know if it'll be useful at this point. Thank you, On Wednesday 02/08/2

Re: [ceph-users] broken parent/child relationship

2017-08-07 Thread Jason Dillaman
Correct -- deep-flatten can only be enabled at image creation time. If you do still have snapshots on that image and you wish to delete the parent, you will need to delete the snapshots. On Mon, Aug 7, 2017 at 4:52 PM, Shawn Edwards wrote: > Nailed it. Did not have deep-flatten feature turned on

Re: [ceph-users] broken parent/child relationship

2017-08-07 Thread Shawn Edwards
Nailed it. Did not have deep-flatten feature turned on for that image. Deep-flatten cannot be added to an rbd after creation, correct? What are my options here? On Mon, Aug 7, 2017 at 3:32 PM Jason Dillaman wrote: > Does the image "tyr-p0/a56eae5f-fd35-4299-bcdc-65839a00f14c" have > snapshots

Re: [ceph-users] broken parent/child relationship

2017-08-07 Thread Jason Dillaman
Does the image "tyr-p0/a56eae5f-fd35-4299-bcdc-65839a00f14c" have snapshots? If the deep-flatten feature isn't enabled, the flatten operation is not able to dissociate child images from parents when those child images have one or more snapshots. On Fri, Aug 4, 2017 at 2:30 PM, Shawn Edwards wrote

Re: [ceph-users] download.ceph.com rsync errors

2017-08-07 Thread David Galloway
Thanks for bringing this to our attention. I've removed the lockfiles from download.ceph.com. On 08/06/2017 11:10 PM, Matthew Taylor wrote: > Hi, > > The rsync target (rsync://download.ceph.com/ceph/) has been throwing the > following errors for a while: > >> rsync: send_files failed to open "/

Re: [ceph-users] expanding cluster with minimal impact

2017-08-07 Thread Bryan Stillwell
Dan, We recently went through an expansion of an RGW cluster and found that we needed 'norebalance' set whenever making CRUSH weight changes to avoid slow requests. We were also increasing the CRUSH weight by 1.0 each time which seemed to reduce the extra data movement we were seeing with smal
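The stepwise CRUSH weight increase described above can be sketched in Python. The helper names, the OSD id, and the target weight are hypothetical; the sketch only prints `ceph osd crush reweight` commands and leaves setting and unsetting `norebalance`, and waiting for recovery between steps, to the operator.

```python
def weight_steps(current: float, target: float, step: float = 1.0) -> list:
    """Return intermediate CRUSH weights from `current` up to `target`."""
    if current >= target:
        return []
    steps = []
    weight = current
    # Add full steps while a further full step would still stay below target.
    while weight + step < target:
        weight += step
        steps.append(round(weight, 4))
    steps.append(target)  # the last step lands exactly on the target
    return steps


def reweight_commands(osd_id: int, current: float, target: float,
                      step: float = 1.0) -> list:
    """Format one `ceph osd crush reweight` command per intermediate weight."""
    return [f"ceph osd crush reweight osd.{osd_id} {w}"
            for w in weight_steps(current, target, step)]


if __name__ == "__main__":
    for cmd in reweight_commands(osd_id=7, current=0.0, target=3.5):
        print(cmd)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere and lets each step be reviewed before it is applied.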

Re: [ceph-users] 1 pg inconsistent, 1 pg unclean, 1 pg degraded

2017-08-07 Thread Etienne Menguy
Hi, Removing the whole OSD will work but it's overkill (if the inconsistency is not caused by a faulty disk 😉). Which ceph version are you running? If you have a recent version you can check http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent rados

[ceph-users] 1 pg inconsistent, 1 pg unclean, 1 pg degraded

2017-08-07 Thread Marc Roos
I tried to fix 1 inconsistent pg by taking osd 12 out, hoping for the data to be copied to a different osd, and that that one would be used as 'active'. - Would deleting the whole image in the rbd pool solve this? (or would it fail because of this status) - Should I have done this rather w

Re: [ceph-users] CephFS: concurrent access to the same file from multiple nodes

2017-08-07 Thread Andras Pataki
I've filed a tracker bug for this: http://tracker.ceph.com/issues/20938 Andras On 08/01/2017 10:26 AM, Andras Pataki wrote: Hi John, Sorry for the delay, it took a bit of work to set up a luminous test environment. I'm sorry to have to report that the 12.1.1 RC version also suffers from th

[ceph-users] how to migrate cached erasure pool to another type of erasure?

2017-08-07 Thread Малков Петр Викторович
Hello! Luminous v12.1.2. RGW SSD tiering over an EC pool works fine. But I want to change the type of erasure coding (now and in the future). The erasure code type cannot be changed on the fly; only a new pool with the new coding is possible. My first idea was to add a second tiering level (SSD - EC - ISA) and to evict everything down,

Re: [ceph-users] jewel: bug? forgotten rbd files?

2017-08-07 Thread Stefan Priebe - Profihost AG
ceph-dencoder type object_info_t import /tmp/a decode dump_json results in: error: buffer::malformed_input: void object_info_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding Greets, Stefan On 05.08.2017 at 21:43, Gregory Farnum wrote: > is OSD 20 actually a member of

Re: [ceph-users] jewel: bug? forgotten rbd files?

2017-08-07 Thread Stefan Priebe - Profihost AG
Hello Greg, if I remove the files manually from the primary, it does not help either. The primary OSD then crashes because trim_object can't find the files. Is there any chance that I can manually correct the omap digest so that it just matches the files? Greets, Stefan Am 05.08.2017 um 21:43 sc