Is this only a problem with EC base tiers or would replicated base
tiers see this too?
-
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Feb 11, 2016 at 6:09 PM, Sage Weil wrote:
> On
the code to resolve this by the
time I'm finished with the queue optimizations I'm doing (hopefully in
a week or two), I plan on looking into this to see if there is
something that can be done to prevent the OPs from being accepted
until the OSD is ready for them.
-
Robert LeBl
in a few weeks time I can have a report on what I find. Hopefully we
can have it fixed for Jewel and Hammer. Fingers crossed.
Robert LeBlanc
Sent from a mobile device, please excuse any typos.
On Feb 12, 2016 10:32 PM, "Christian Balzer" wrote:
>
> Hello,
>
> for the record
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Sat, Feb 13, 2016 at 8:51 PM, Tom Christensen wrote:
>> > Next this : > --- > 2016-02-12 01:35:33.915981 7f75be4d57c0
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Tue
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Tue, Feb 23, 2016 at 3:33 PM, Vickey Singh
wrote
----
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, Feb 24, 2016 at 4:09 AM, Vickey Singh
wrote:
> Hello Geeks
>
> Can someone pleas
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, Feb 24, 2016 at 4:29 AM, Oliver Dzombic wrote:
> Hi Esta,
>
> how do you know that it's still active?
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Olive
We have not seen this issue, but we don't run EC pools yet (we are waiting
for multiple layers to be available). We are not running 0.94.6 in
production yet either. We have adopted the policy to only run released
versions in production unless there is a really pressing need to have a
patch. We are
We are moving to the Intel S3610; from our testing it is a good balance
between price, performance, and longevity. But as with all things, do your
testing ahead of time. This will be our third model of SSDs for our
cluster. The S3500s didn't have enough life, and performance tapers off as
they get full.
With my S3500 drives in my test cluster, the latest master branch gave me
an almost 2x increase in performance compared to just a month or two ago.
There seem to be some really nice things coming in Jewel around SSD
performance. My drives are now 80-85% busy doing about 10-12K IOPS when
doing 4K fio runs.
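For anyone wanting to reproduce that kind of number, here is a rough fio
invocation along the lines described above; the device path, queue depth,
and job count are assumptions, not the exact command used:
  # illustrative 4K random-write test against a raw SSD; destroys data on /dev/sdX
  fio --name=4k-randwrite --filename=/dev/sdX --direct=1 --ioengine=libaio \
      --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 --runtime=300 \
      --time_based --group_reporting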
My guess would be that if you are already running hammer on the client it
is already using the new watcher API. This would be a fix on the OSDs to
allow the object to be moved because the current client is smart enough to
try again. It would be watchers per object.
Sent from a mobile device, please excuse any typos.
erall,
but there was some.
Sent from a mobile device, please excuse any typos.
On Feb 25, 2016 9:15 PM, "Christian Balzer" wrote:
>
> Hello,
>
> On Wed, 24 Feb 2016 23:01:43 -0700 Robert LeBlanc wrote:
>
> > With my S3500 drives in my test cluster, the latest master
benchmarks. Some of the data about the S3500s is from my test
cluster that has them.
Sent from a mobile device, please excuse any typos.
On Feb 25, 2016 9:20 PM, "Christian Balzer" wrote:
>
> Hello,
>
> On Wed, 24 Feb 2016 22:56:15 -0700 Robert LeBlanc wrote:
>
>
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Fri, Feb 26, 2016 at 4:05 PM, Shinobu Kinjo
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Fri, Feb 26, 2016 at 5:41 PM, Shinobu Kinjo wrote:
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, Feb 29, 2016 at 10:29 PM, Lindsay Mathieson
wrote:
> I was looking at replacing an osd drive in place as per the procedure here:
>
> http
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Sun
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Mar 17, 2016 at 11:55 AM, Robert LeBlanc wrote:
> Cherry-picking that commit onto
Also, is this ceph_test_rados rewriting objects quickly? I think that
the issue is with rewriting objects so if we can tailor the
ceph_test_rados to do that, it might be easier to reproduce.
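A sketch of what such a run could look like, keeping the object count small
and the op mix write-heavy so the same objects get rewritten often; the flag
names here are from memory of the ceph-tests tool and should be checked
against its --help output before use:
  ceph_test_rados --pool test-pool --objects 50 --max-ops 100000 \
      --max-in-flight 16 --op write 100 --op read 10 --op delete 10 \
      --max-seconds 600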
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu
e executable, or `objdump -rdS ` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Mar 17, 2016 at 10:39 AM, Sage Weil wrote:
Cherry-picking that commit onto v0.94.6 wasn't clean so I'm just
building your branch. I'm not sure what the difference between your
branch and 0.94.6 is, I don't see any commits against
osd/ReplicatedPG.cc in the last 5 months other than the one you did
today.
-------
> What would be *really* great is if you could reproduce this with a
> ceph_test_rados workload (from ceph-tests). I.e., get ceph_test_rados
> running, and then find the sequence of operations that are sufficient to
> trigger a failure.
>
> sage
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, Mar 16, 2016 at 1:40 PM, Gregory Farnum wrote:
> This tracker ticket happened to go by my eyes
Yep, let me pull and build that branch. I tried installing the dbg
packages and running it in gdb, but it didn't load the symbols.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Mar 17, 2016 at 11:36 AM, Sage Weil wrote:
> On
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Mar 1
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Apr 7,
Thank you,
Robert LeBlanc
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, May 22, 2019 at 12:22 AM Burkhard Linke <
burkhard.li...@computational.bio.uni-giessen.de> wrote:
> Hi,
>
> On 5/21/19 9:46 PM, Robert LeBlanc wrote:
> > I'm at a new job working with Ceph again and am excited to be back in the
> > community!
> >
proceed.
1. Do a deep-scrub on each PG that is inconsistent. (This may fix some of
them)
2. Print out the inconsistent report for each inconsistent PG. `rados
list-inconsistent-obj --format=json-pretty`
3. You will want to look at the error messages and see if all
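A concrete shape for steps 1 and 2, using a hypothetical PG id of 2.1ab:
  ceph pg deep-scrub 2.1ab
  # wait for the scrub to finish, then:
  rados list-inconsistent-obj 2.1ab --format=json-pretty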
You need to use the first stripe of the object as that is the only one with
the metadata.
Try "rados -p ec31 getxattr 10004dfce92. parent" instead.
Robert LeBlanc
Sent from a mobile device, please excuse any typos.
On Fri, May 24, 2019, 4:42 AM Kevin Flöh wrote:
> Hi,
tory
>
> Does this mean that the lost object isn't even a file that appears in the
> ceph directory. Maybe a leftover of a file that has not been deleted
> properly? It wouldn't be an issue to mark the object as lost in that case.
> On 24.05.19 5:08 nachm., Robert LeBlanc wro
On Fri, May 24, 2019 at 2:14 AM Burkhard Linke <
burkhard.li...@computational.bio.uni-giessen.de> wrote:
> Hi,
> On 5/22/19 5:53 PM, Robert LeBlanc wrote:
>
> When you say 'some' is it a fixed offset that the file data starts? Is the
> first stripe just metadata?
h short tests are small
amounts of data, but once the drive started getting full, the performance
dropped off a cliff. Considering that Ceph is really hard on drives, it's
good to test the extreme.
Robert LeBlanc
and making
backfills not so disruptive.
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Jun 6, 2019 at 1:43 AM BASSAGET Cédric
wrote:
> Hello,
>
> I see messages related to REQUEST_SLOW a few times per day.
>
> here's my ceph
; help in this case ?
> Regards
>
Your disk times look okay, just a lot more unbalanced than I would expect.
I'd give wpq a try, I use it all the time, just be sure to also include the
op_cutoff setting too or it doesn't have much effect. Let me know how it goes.
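For reference, the pair of settings meant here, exactly as they are spelled
out later in this archive; they go in ceph.conf on every OSD host and take
effect after an OSD restart:
  [osd]
  osd op queue = wpq
  osd op queue cut off = high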
EQUEST_SLOW) warning, even if my OSD disk usage goes above 95% (fio
>> ran from 4 diffrent hosts)
>>
>> On my prod cluster, release 12.2.9, as soon as I run fio on a single
>> host, I see a lot of REQUEST_SLOW warning messages, but "iostat -xd 1"
>> does
t {osd-num} {weight}```
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, Jun 24, 2019 at 2:25 AM jinguk.k...@ungleich.ch <
jinguk.k...@ungleich.ch> wrote:
> Hello everyone,
>
> We have some osd on the ceph.
> Some osd&
There may also be more memory copying involved instead of just passing
pointers around as well, but I'm not 100% sure.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, Jun 24, 2019 at 10:28 AM Jeff Layton
wrote:
> On Mon, 2019-
llow IO
to continue. Then when the down timeout expires it will start backfilling
and recovering the PGs that were affected. Double check that size !=
min_size for your pools.
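A quick way to check, with the pool name as a placeholder:
  ceph osd pool get <pool> size
  ceph osd pool get <pool> min_size
  # if the two match, lower min_size so IO can continue with a copy down, e.g.:
  ceph osd pool set <pool> min_size 2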
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Jun 27, 2019 at 5:2
ugh downtime to move
hundreds of Terabytes, we need something that can be done online, and if it
has a minute or two of downtime would be okay.
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Fri, Jun 28, 2019 at 9:02 AM Marc Roos wrote:
at is done
and the eviction is done, then you can remove the pool from cephfs and the
overlay. That way the OSDs are the ones doing the data movement.
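As a hedged sketch, the usual command sequence for that step, with
placeholder pool names (check the cache-tiering docs for your release
before running it):
  rados -p cachepool cache-flush-evict-all      # OSDs flush and evict everything in the cache tier
  ceph osd tier remove-overlay basepool         # stop routing client IO through the cache
  ceph osd tier remove basepool cachepool       # detach the cache tier from the base pool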
I don't know that part of the code, so I can't quickly propose any patches.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4
quot; I mention is the "mon osd down out interval".
The rest of what I wrote is correct. Just to make sure I don't confuse
anyone else.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
if 600 seconds pass with the monitor not hearing from the OSD,
it will mark it down. It 'should' only take 20 seconds to detect a downed
OSD.
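To my understanding, the two timers being described are these; the values
shown are the usual defaults:
  [global]
  mon osd down out interval = 600   # how long an OSD may stay down before it is marked out
  osd heartbeat grace = 20          # missed-heartbeat window before peers report an OSD down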
Usually, the problem is that an OSD gets too busy and misses heartbeats so
other OSDs wrongly mark them d
I believe he needs to increase the pgp_num first, then pg_num.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, Jul 1, 2019 at 7:21 AM Nathan Fish wrote:
> I ran into this recently. Try running "ceph osd require-osd-release
On Mon, Jul 1, 2019 at 11:57 AM Brett Chancellor
wrote:
> In Nautilus just pg_num is sufficient for both increases and decreases.
>
>
Good to know, I haven't gotten to Nautilus yet.
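For completeness, the pre-Nautilus way of doing it by hand; the pool name
and target count are placeholders, and pgp_num can never be set higher than
pg_num:
  ceph osd pool set <pool> pg_num 256
  ceph osd pool set <pool> pgp_num 256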
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654
cker have their own IP address and
are bridges created like LXD or does it share the host IP?
Thank you,
Robert LeBlanc
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Is this a Ceph specific option? If so, you may need to prefix it with
"ceph.", at least I had to for FUSE to pass it to the Ceph module/code
portion.
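A minimal fstab sketch of what that looks like for a ceph-fuse mount; the
option names here are just examples:
  # /etc/fstab -- options prefixed with "ceph." are handed to the Ceph client code
  none  /mnt/cephfs  fuse.ceph  ceph.id=admin,ceph.client_mountpoint=/,_netdev,defaults  0 0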
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Jul 4, 2019 at 7:35 AM s
xtremely well so it "Just
Works".
By not back porting new features, I think it gives more time to bake the
features into the new version and frees up the developers to focus on the
forward direction of the product. If I want a new feature, then
We recently used Croit (https://croit.io/) and they were really good.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, Jul 15, 2019 at 12:53 PM Void Star Nill
wrote:
> Hello,
>
> Other than Redhat and SUSE, are there other
/Leveled-Compaction
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
I'm pretty new to RGW, but I need to get maximum performance as well. Have
you tried moving your RGW metadata pools to NVMe? Carve out a bit of NVMe
space and then pin the pools to the SSD class in CRUSH, so that the small
metadata ops aren't on slow media.
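A hedged sketch of the pinning, assuming the standard RGW pool names and an
'ssd' device class:
  ceph osd crush rule create-replicated rgw-meta-ssd default host ssd
  ceph osd pool set default.rgw.meta crush_rule rgw-meta-ssd
  ceph osd pool set default.rgw.log  crush_rule rgw-meta-ssd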
--------
Rob
d_ops, how can we tell
MDS that the inode is lost and to forget about it without trying to do any
checks on it (checking the RADOS objects may be part of the problem)? Once
the inode is out of CephFS, we can clean up the RADOS objects manually or
leave them there to rot.
Thanks,
Robert Le
Thanks, I created a ticket. http://tracker.ceph.com/issues/40906
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, Jul 22, 2019 at 11:45 PM Yan, Zheng wrote:
> please create a ticket at http://tracker.ceph.com/projects/cephfs
s lots of
client I/O, but the clients haven't noticed that huge backfills have been
going on.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Tue, Jul 30, 2019 at 2:06 AM Janne Johansson wrote:
> Someone should make a webpage where you can enter that hex-string and get
> a list back.
>
Providing a minimum bitmap would allow someone to do so, and someone like
me to do it manually until then.
----
Robert Le
Alex Gorbachev wrote:
> On Fri, Aug 2, 2019 at 6:57 PM Robert LeBlanc
> wrote:
> >
> > On Fri, Jul 26, 2019 at 1:02 PM Peter Sabaini wrote:
> >>
> >> On 26.07.19 15:03, Stefan Kooman wrote:
> >> > Quoting Peter Sabaini (pe...@sabaini.at):
Routing and bind the source port on the connection (not the easiest, but
allows you to have multiple NICs in the same broadcast domain). I don't
have experience with Ceph in this type of configuration.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62
h_location_hook (potentially using a file with a list
of partition UUIDs that should be in the metadata pool)?
Any other options I may not be considering?
Thank you,
Robert LeBlanc
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654
On Tue, Aug 6, 2019 at 11:11 AM Paul Emmerich
wrote:
> On Tue, Aug 6, 2019 at 7:45 PM Robert LeBlanc
> wrote:
> > We have a 12.2.8 luminous cluster with all NVMe and we want to take some
> of the NVMe OSDs and allocate them strictly to metadata pools (we have a
> problem wi
of the pool capacity and sets the
quota if the current quota is 1% out of balance. This is run by cron every
5 minutes.
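A rough sketch of such a cron job, under the assumption that the quota is
derived from the pool's max_avail as reported by `ceph df`; the pool name,
share, and JSON field names would need checking against your release:
  #!/bin/bash
  POOL=cephfs_data      # placeholder pool name
  SHARE=10              # percent of max_avail this pool is allowed to use

  avail=$(ceph df --format=json | jq ".pools[] | select(.name==\"$POOL\") | .stats.max_avail")
  target=$(( avail * SHARE / 100 ))
  current=$(ceph osd pool get-quota "$POOL" --format=json | jq '.quota_max_bytes')

  # only rewrite the quota when it drifts more than 1% from the target
  diff=$(( target > current ? target - current : current - target ))
  if (( current == 0 || diff * 100 > target )); then
      ceph osd pool set-quota "$POOL" max_bytes "$target"
  fi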
If there is a way to reserve some capacity for a pool that no other pool
can use, please provide an example. Think of reserved inode space in
ext4/XFS/etc.
Thank you.
--
On Wed, Aug 7, 2019 at 12:08 AM Konstantin Shalygin wrote:
> On 8/7/19 1:40 PM, Robert LeBlanc wrote:
>
> > Maybe it's the lateness of the day, but I'm not sure how to do that.
> > Do you have an example where all the OSDs are of class ssd?
> Can't parse wh
improve that would be appreciated.
Thank you,
Robert LeBlanc
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
the space of a pool, which is not what I'm looking for.
Thank you,
Robert LeBlanc
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
pe size
to know how many objects to fetch for the whole object. The file is stored
by the inode (in hex) appended by the object offset. The inode corresponds
to the same value in `ls -li` in CephFS converted to hex.
I hope that is correct and useful as a starting point for you.
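As a worked example of that mapping, with the mount point, file, and data
pool name as placeholders:
  ls -li /mnt/cephfs/some/file     # say it reports inode 1099511627776
  printf '%x\n' 1099511627776      # -> 10000000000 (the hex inode)
  rados -p cephfs_data ls | grep '^10000000000\.'   # objects are named <hex inode>.<offset>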
--------
Robe
one. When I deleted the directories with the damage the active MDS
crashed, but the replay took over just fine. I haven't had the messages now
for almost a week.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, Aug 19, 2019 at 10:30 PM
-rw-r--r-- 1 167 167   37 Jul 30 22:15 IDENTITY
-rw-r--r-- 1 167 167    0 Jul 30 22:15 LOCK
-rw-r--r-- 1 167 167 1.3M Aug 28 19:16 MANIFEST-027846
-rw-r--r-- 1 167 167 4.7K Aug  1 23:38 OPTIONS-002825
-rw-r--r-- 1 167 167 4.7K Aug 16 07:40 OPTIONS-027849
Turns out /var/lib/ceph was ceph.ceph and not 167.167, chowning it made
things work. I guess only monitor needs that permission, rgw,mgr,osd are
all happy without needing it to be 167.167.
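In practice the fix boiled down to one command, 167:167 being the UID/GID
the monitor expected:
  chown -R 167:167 /var/lib/ceph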
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed
- data: '/dev/sdk'
  db: '/dev/nvme0n1'
  crush_device_class: 'hdd'
- data: '/dev/sdl'
  db: '/dev/nvme0n1'
  crush_device_class: 'hdd'
grading to
the Ceph distributed packages didn't change the UID.
Thanks,
Robert LeBlanc
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Aug 29, 2019 at 12:33 AM Frank Schilder wrote:
> Hi Robert,
>
> this is a bit less tr
ncluded below; oldest blocked for > 62497.675728 secs
> > 2019-09-19 08:53:47.528891 mds.icadmin007 [WRN] 3 slow requests, 0 included
> > below; oldest blocked for > 62501.243214 secs
> > 2019-09-19 08:53:52.529021 mds.icadmin007 [WRN] 3 slow requests, 0 included
> >
m is repaired and when it
deep-scrubs to check it, the problem has reappeared or another problem
was found and the disk needs to be replaced.
Try running:
rados list-inconsistent-obj ${PG} --format=json
and see what the exact problems are.
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA
t: 141 KiB/s rd, 54 MiB/s wr, 62 op/s rd, 577 op/s wr
> >
>
> > [root@mds02 ~]# ceph health detail
> > HEALTH_WARN 1 MDSs report slow requests; 2 MDSs behind on trimming
> > MDS_SLOW_REQUEST 1 MDSs report slow requests
> > mdsmds02(mds.1): 2 slow reque
I wanted all my config in
a single file, so I put it in my inventory file, but it looks like you
have the right idea.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
lse I can
> try.
>
> Any suggestions?
If you haven't already tried this, add this to your ceph.conf and
restart your OSDs, this should help bring down the variance in latency
(It will be the default in Octopus):
osd op queue = wpq
osd op queue cut off = high
Rober
On Tue, Oct 1, 2019 at 7:54 AM Robert LeBlanc wrote:
>
> On Mon, Sep 30, 2019 at 5:12 PM Sasha Litvak
> wrote:
> >
> > At this point, I ran out of ideas. I changed nr_requests and readahead
> > parameters to 128->1024 and 128->4096, tuned nodes to
> > pe
ontrol?
>
> best regards,
>
> Samuel
Not sure which version of Ceph you are on, but add these to your
/etc/ceph/ceph.conf on all your OSDs and restart them.
osd op queue = wpq
osd op queue cut off = high
That should really help and make backfills and recovery be
non-impactful. This wi
settings on and it really helped both of them.
----
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
eter?
Wow! Dusting off the cobwebs here. I think this is what led me to dig
into the code and write the WPQ scheduler. I can't remember doing
anything specific. I'm sorry I'm not much help in this regard.
Robert LeBlanc
PGP Fingerprint 79A2 9CA
mpact for
client traffic. Those would need to be set on all OSDs to be
completely effective. Maybe go back to the defaults?
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
You can try adding
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Tue, Oct 22, 2019 at 8:36 PM David Turner wrote:
>
> Most times you are better served with simpler settings like
> osd_recovery_sleep, which has 3 variants if
You can try adding
osd op queue = wpq
osd op queue cut off = high
To all the osd ceph configs and restarting, That has made reweighting
pretty painless for us.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Tue, Oct 22, 2019 at 8:36 PM
y and roll back each file's content. The MDS could do this more
> efficiently than rsync give what it knows about the snapped inodes
> (skipping untouched inodes or, eventually, entire subtrees) but it's a
> non-trivial amount of work to implement.
>
> Would it m
> [benchmark table fragment: 8 | 196.3 MB/s | 2 1 2 2 3 3 5 5 | 2 1 2 2 3 3 5 5]
> [...section CLEAN
On Tue, Dec 3, 2019 at 9:11 AM Ed Fisher wrote:
>
>
> On Dec 3, 2019, at 10:28 AM, Robert LeBlanc wrote:
>
> Did you make progress on this? We have a ton of < 64K objects as well and
> are struggling to get good performance out of our RGW. Sometimes we have
> RGW
Our Jewel cluster is exhibiting some similar issues to the one in this
thread [0] and it was indicated that a tool would need to be written to fix
that kind of corruption. Has the tool been written? How would I go about
repairing these 16EB directories that won't delete?
Thank you,
Robert LeBlan
[truncated `ceph df` output]
default.rgw.buckets.non-ec   8   8.1 MiB   22   8.1 MiB   0   1.8 PiB
Please help me figure out what I'm doing wrong with these settings.
Thanks,
Robert LeBlanc
--------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD
> The solution for you is to simply put the option under global and restart
> ceph-mgr (or use daemon config set; it doesn't support changing config via
> ceph tell for some reason)
>
>
> Paul
>
> On Mon, Dec 9, 2019 at 8:32 PM Paul Emmerich
> wrote:
>
>>
The link that you referenced above is no longer available, do you have a
new link? We upgraded from 12.2.8 to 12.2.12 and the MDS metrics all
changed, so I'm trying to map the old values to the new values. Might just
have to look in the code. :(
Thanks!
Robert LeBlan
On Tue, Jan 14, 2020 at 12:30 AM Stefan Kooman wrote:
> Quoting Robert LeBlanc (rob...@leblancnet.us):
> > The link that you referenced above is no longer available, do you have a
> > new link?. We upgraded from 12.2.8 to 12.2.12 and the MDS metrics all
> > changed, so I