count
> while scrubbing due to the missing object, but I don't think so.
>
> Anyway, I just wanted to thank you for your help!
>
> Best wishes,
>
> Lawrence
>
> On 10/13/2018 02:00 AM, Mike Lovell wrote:
>
> what was the object name that you marked lost? was it
what was the object name that you marked lost? was it one of the cache tier
hit_sets?
the trace you have does seem to show it failing when the OSD is trying to remove
a hit set that is no longer needed. i ran into a similar problem which
might have been why that bug you listed was created. maybe provid
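for anyone hitting the same crash: inspecting and marking a missing object lost is done per PG. a rough sketch from memory (the pg id 2.5 and the choice of revert are placeholders, not from this thread):

    # find the PGs reporting unfound objects and list what is missing
    ceph health detail
    ceph pg 2.5 list_missing
    # last resort: give up on the unfound copies for that PG
    ceph pg 2.5 mark_unfound_lost revert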
On Thu, Mar 29, 2018 at 1:17 AM, Jakub Jaszewski
wrote:
> Many thanks Mike, that explains the stopped IOs. I've just finished adding
> new disks to the cluster and am now trying to evenly reweight OSDs by PG.
>
> May I ask you two more questions?
> 1. As I was in a hurry I did not check if only write ops were
was the pg-upmap feature used to force a pg to get mapped to a particular
osd?
mike
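a minimal sketch of forcing a pg onto a specific osd with upmap, assuming a luminous cluster (the pg and osd ids here are made up):

    # upmap needs luminous-or-newer clients
    ceph osd set-require-min-compat-client luminous
    # move pg 1.7f's replica from osd.12 to osd.34
    ceph osd pg-upmap-items 1.7f 12 34
    # drop the exception again later
    ceph osd rm-pg-upmap-items 1.7f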
On Thu, Feb 22, 2018 at 10:28 AM, Wido den Hollander wrote:
> Hi,
>
> I have a situation with a cluster which was recently upgraded to Luminous
> and has a PG mapped to OSDs on the same host.
>
> root@man:~# cep
mike
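a quick way to confirm where a pg is mapped and which host each osd sits on (pg and osd ids are examples only):

    ceph pg map 1.7f            # up/acting osd sets for the pg
    ceph osd find 12            # host and crush location of osd.12
    ceph osd crush rule dump    # check the rule really separates replicas by host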
On Thu, Feb 22, 2018 at 3:58 PM, Hans Chris Jones <
chris.jo...@lambdastack.io> wrote:
> Interesting. This does not inspire confidence. What SSDs (2TB or 4TB) do
> people have good success with in high use production systems with bluestore?
>
> Thanks
>
> On Thu, F
> I returned the lot and am done with Intel SSDs, will advise as many
> customers and peers to do the same…
>
> Regards
>
> David Herselman
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Mike Lovell
> *Se
has anyone tried with the most recent firmwares from intel? i've had a
number of s4600 960gb drives that have been waiting for me to get around to
adding them to a ceph cluster. this as well as having 2 die almost
simultaneously in a different storage box is giving me pause. i noticed
that David li
On Tue, Jan 16, 2018 at 9:25 AM, Jens-U. Mozdzen wrote:
> Hello Mike,
>
> Zitat von Mike Lovell :
>
>> On Mon, Jan 8, 2018 at 6:08 AM, Jens-U. Mozdzen wrote:
>>
>>> Hi *,
>>> [...]
>>> 1. Does setting the cache mode to "forward" lead
On Mon, Jan 8, 2018 at 6:08 AM, Jens-U. Mozdzen wrote:
> Hi *,
>
> trying to remove a caching tier from a pool used for RBD / Openstack, we
> followed the procedure from
> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/#removing-a-writeback-cache and ran
> into problems.
>
> The
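for reference, the documented removal sequence goes roughly as follows; the pool names are placeholders, and newer releases also want a --yes-i-really-mean-it on the cache-mode change:

    # stop new writes landing in the cache tier
    ceph osd tier cache-mode hot-pool forward --yes-i-really-mean-it
    # flush and evict whatever is still cached
    rados -p hot-pool cache-flush-evict-all
    # once the cache pool is empty, detach it from the base pool
    ceph osd tier remove-overlay cold-pool
    ceph osd tier remove cold-pool hot-pool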
On Mon, Mar 20, 2017 at 4:20 PM, Nick Fisk wrote:
> Just a few corrections, hope you don't mind
>
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Mike Lovell
> > Sent: 20 March 2017 20:30
>
i'm not an expert but here is my understanding of it. a hit_set keeps track
of whether or not an object was accessed during the timespan of the
hit_set. for example, if you have a hit_set_period of 600, then the hit_set
covers a period of 10 minutes. the hit_set_count defines how many of the
hit_se
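to make that arithmetic concrete, the knobs live on the cache pool; the pool name and values below are only illustrative:

    # each hit_set spans hit_set_period seconds and hit_set_count of them are kept,
    # so 12 x 600s = roughly the last two hours of access history
    ceph osd pool set hot-pool hit_set_type bloom
    ceph osd pool set hot-pool hit_set_period 600
    ceph osd pool set hot-pool hit_set_count 12
    # an object must show up in this many recent hit_sets to be promoted on read
    ceph osd pool set hot-pool min_read_recency_for_promote 2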
has anyone on the list done an upgrade from hammer (something later than
0.94.6) to jewel with a cache tier configured? i tried doing one last week
and had a hiccup with it. i'm curious if others have been able to
successfully do the upgrade and, if so, did they take any extra steps
related to the
i started an upgrade process to go from 0.94.7 to 10.2.5 on a production
cluster that is using cache tiering. this cluster has 3 monitors, 28
storage nodes, around 370 osds. the upgrade of the monitors completed
without issue. i then upgraded 2 of the storage nodes, and after the
restarts, the osds
i was just testing an upgrade of some monitors in a test cluster from
hammer (0.94.7) to jewel (10.2.5). after upgrading each of the first two
monitors, i stopped and restarted a single osd to cause changes in the
maps. the same error messages showed up in ceph -w. i haven't dug into it
much but just
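from memory, the usual hammer-to-jewel order is mons first, then osds with noout set, then the post-upgrade flags; check the jewel release notes for the exact steps rather than trusting this sketch:

    ceph osd set noout                  # avoid rebalancing while daemons restart
    # upgrade packages and restart mons, then osds, one failure domain at a time
    ceph tell osd.* version             # confirm everything is on jewel
    ceph osd set sortbitwise            # per the release notes, once all osds are jewel
    ceph osd set require_jewel_osds
    ceph osd unset noout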
On Wed, Jun 1, 2016 at 9:13 AM, Adam Tygart wrote:
> Hello all,
>
> I'm running into an issue with ceph osds crashing over the last 4
> days. I'm running Jewel (10.2.1) on CentOS 7.2.1511.
>
> A little setup information:
> 26 hosts
> 2x 400GB Intel DC P3700 SSDs
> 12x6TB spinning disks
> 4x4TB spi
On Fri, Apr 29, 2016 at 9:34 AM, Mike Lovell
wrote:
> On Fri, Apr 29, 2016 at 5:54 AM, Alexey Sheplyakov <
> asheplya...@mirantis.com> wrote:
>
>> Hi,
>>
>> > i also wonder if just taking 148 out of the cluster (probably just
>> marking it out) would
are the new osds running 0.94.5 or did they get the latest .6 packages? are
you also using cache tiering? we ran into a problem with individual rbd
objects getting corrupted when using 0.94.6 with a cache tier
and min_read_recency_for_promote > 1. our only solution to corruption
that happened
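the recency setting is just a pool property, so checking it or dropping it back to 1 looks like this (pool name is a placeholder):

    ceph osd pool get hot-pool min_read_recency_for_promote
    ceph osd pool set hot-pool min_read_recency_for_promote 1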
On Fri, Apr 29, 2016 at 5:54 AM, Alexey Sheplyakov wrote:
> Hi,
>
> > i also wonder if just taking 148 out of the cluster (probably just
> marking it out) would help
>
> As far as I understand this can only harm your data. The acting set of PG
> 17.73 is [41, 148],
> so after stopping/taking out
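before touching osd.148 it is worth confirming the acting set and recovery state for that pg, e.g.:

    ceph pg map 17.73       # quick view of the up and acting sets
    ceph pg 17.73 query     # full peering and recovery detail (json)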
progress we'll need debug ms = 20 on both
> sides of the connection when a message is lost.
> -Sam
>
> On Thu, Apr 28, 2016 at 2:38 PM, Mike Lovell
> wrote:
> > there was a problem on one of the clusters i manage a couple weeks ago
> > where
> > pairs of OSDs would w
there was a problem on one of the clusters i manage a couple weeks ago
where pairs of OSDs would wait indefinitely on subops from the other OSD in
the pair. we used a liberal dose of "ceph osd down ##" on the osds and
eventually things just sorted themselves out a couple days later.
it seems to have com
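for anyone reproducing this, both the debug setting sam asked for and the "kick" can be applied at runtime; the osd ids below are examples:

    # raise messenger debugging on both ends of the stuck connection
    ceph tell osd.12 injectargs '--debug_ms 20'
    ceph tell osd.34 injectargs '--debug_ms 20'
    # mark one side down so it re-peers; it rejoins on its own
    ceph osd down 12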
? anyone have a similar problem?
mike
On Mon, Mar 14, 2016 at 8:51 PM, Mike Lovell
wrote:
> something weird happened on one of the ceph clusters that i administer
> tonight which resulted in virtual machines using rbd volumes seeing
> corruption in multiple forms.
>
> when ever
set to greater than 1.
mike
On Wed, Mar 16, 2016 at 4:41 PM, Mike Lovell
wrote:
> robert and i have done some further investigation the past couple days on
> this. we have a test environment with a hard drive tier and an ssd tier as
> a cache. several vms were created with volumes from
close.
mike
On Mon, Mar 14, 2016 at 9:35 PM, Christian Balzer wrote:
>
> Hello,
>
> On Mon, 14 Mar 2016 20:51:04 -0600 Mike Lovell wrote:
>
> > something weird happened on one of the ceph clusters that i administer
> > tonight which resulted in virtual machines using rbd
something weird happened on one of the ceph clusters that i administer
tonight which resulted in virtual machines using rbd volumes seeing
corruption in multiple forms.
when everything was fine earlier in the day, the cluster was a number of
storage nodes spread across 3 different roots in the cru
12 osds running. it looks like they're creating over 2500
threads each. i don't know the internals of the code but that seems like a
lot. oh well. hopefully this fixes it.
mike
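for the curious, counting threads per osd and checking the kernel limits they can run into is roughly:

    # thread count for each ceph-osd process
    for p in $(pgrep ceph-osd); do awk '/^Threads/ {print $2}' /proc/$p/status; done
    # kernel-wide limits a dense osd node can hit
    sysctl kernel.pid_max kernel.threads-max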
On Mon, Mar 7, 2016 at 1:55 PM, Gregory Farnum wrote:
> On Mon, Mar 7, 2016 at 11:04 AM, Mike Lovell
first off, hello all. this is my first time posting to the list.
i have seen a recurring problem that started in the past week or so on
one of my ceph clusters. osds will crash and it seems to happen whenever
backfill or recovery is started. looking at the logs it appears that the
osd is
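when crashes line up with backfill or recovery starting, a common first step while investigating is to pause or throttle recovery, e.g.:

    ceph osd set nobackfill
    ceph osd set norecover
    # or just slow it down instead
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
    # and re-enable once done
    ceph osd unset nobackfill
    ceph osd unset norecover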