the issue for us.
Sander
From: Gregory Farnum
Sent: Thursday, June 14, 2018 22:45
To: Sander van Schie / True
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues with deep-scrub since upgrading
from v12.2.2 to v12.2.5
Yes. Deep scrub o
raight. The resharding list also kept showing pretty much
> completely different data every few seconds. Since this was also affecting
> performance, we temporarily disabled this. Could this somehow be related?
>
>
> Thanks
>
>
> Sander
>
>
>
>
> -----
Since this was also affecting performance, we temporarily disabled this. Could
this somehow be related?
Thanks
Sander
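For reference, a hedged sketch of how the dynamic bucket resharding mentioned above can be inspected and temporarily disabled (the RGW section name below is an example, not taken from this thread):

  radosgw-admin reshard list          # buckets currently queued for dynamic resharding
  # to temporarily disable dynamic resharding for one gateway, then restart it:
  cat >> /etc/ceph/ceph.conf <<'EOF'
  [client.rgw.gateway1]
  rgw dynamic resharding = false
  EOF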
From: Gregory Farnum
Sent: Thursday, June 14, 2018 19:45
To: Sander van Schie / True
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues with deep-scrub since upgrading
Deep scrub needs to read every object in the PG. If some PGs are only
taking 5 seconds they must be nearly empty (or maybe they only contain
objects with small amounts of omap or something). Ten minutes is perfectly
reasonable, but it is an added load on the cluster as it does all those
object reads.
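Since deep-scrub time scales with the number of objects in a PG, a quick way to see how unevenly objects are spread is to sort the per-PG object counts. A minimal sketch, assuming the usual plain-text layout of ceph pg dump (column positions can differ between releases, so check the header first):

  # PG id and object count, largest PGs last
  ceph pg dump pgs 2>/dev/null | awk '{print $1, $2}' | sort -k2 -n | tail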
Hello,
We recently upgraded Ceph from version 12.2.2 to version 12.2.5. Since the
upgrade we've been having performance issues which seem to relate to when
deep-scrub actions are performed.
Most of the time deep-scrub actions only take a couple of seconds at most;
however, occasionally it takes
These drives are running as OSDs, not as journals.
What I can't understand is why the performance of rados bench
with 1 thread is 3 times slower, while ceph osd bench shows good results.
In my opinion it could be 20% less speed because of software overhead.
I read the blog post
(http://c
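For comparison, a sketch of the two benchmarks being contrasted here (the pool name is an example): ceph tell osd.N bench writes directly on one OSD, while a single-threaded rados bench goes through the whole client I/O path, including replication and network round-trips.

  ceph tell osd.0 bench              # raw write benchmark on a single OSD
  rados bench -p rbd 60 write -t 1   # single-threaded client benchmark against the pool
  rados -p rbd cleanup               # remove the benchmark objects afterwards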
You should do your reference test with dd with oflag=direct,dsync:
direct will only bypass the cache, while dsync will fsync on every
block, which is much closer to the reality of what Ceph is doing, AFAIK.
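A minimal sketch of that reference test (the target path is an example; write a test file on the OSD's data filesystem rather than on a raw device):

  dd if=/dev/zero of=/var/lib/ceph/osd/ceph-0/deleteme bs=4k count=10000 oflag=direct,dsync
  rm /var/lib/ceph/osd/ceph-0/deleteme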
On Thu, Jan 4, 2018 at 9:54 PM, Rafał Wądołowski
wrote:
> Hi folks,
>
> I am currently benchmarki
@cloudferro.com]
> Sent: Thursday, 4 January 2018 16:56
> To: c...@elchaka.de; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Performance issues on Luminous
>
> I have size of 2.
>
> We know about this risk and we accept it, but we still don't know why
> p
2018 16:56
To: c...@elchaka.de; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues on Luminous
I have size of 2.
We know about this risk and we accept it, but we still don't know why
the performance is so bad.
Cheers,
Rafał Wądołowski
On 04.01.2018 16:51, c...@elcha
They are configured with bluestore.
The network, CPU and disks are doing nothing; I was observing with atop,
iostat and top.
I have a similar hardware configuration on Jewel (with filestore), and it
is performing well there.
Cheers,
Rafał Wądołowski
On 04.01.2018 17:05, Luis Periquito wrote:
You never said if it was bluestore or filestore?
Can you look at the server to see which component is being stressed
(network, cpu, disk)? Utilities like atop are very handy for this.
Regarding those specific SSDs, they are particularly bad when running for
some time without trimming - performance nos
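A hedged sketch of the manual trim hinted at above: a filestore OSD's data directory is an ordinary filesystem and can be trimmed in place, while a bluestore data device has no filesystem, so it can only be discarded wholesale while the OSD is out of service and before it is redeployed.

  fstrim -v /var/lib/ceph/osd/ceph-0   # filestore: trim the OSD's mounted data dir
  # blkdiscard /dev/sdX                # bluestore: whole-device discard, destroys the OSD's data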
I have size of 2.
We know about this risk and we accept it, but we still don't know why
the performance is so bad.
Cheers,
Rafał Wądołowski
On 04.01.2018 16:51, c...@elchaka.de wrote:
I assume you have a size of 3; then divide your expected 400 by 3 and
you are not far away from what you get...
I assume you have a size of 3; then divide your expected 400 by 3 and you are
not far away from what you get...
In addition, you should never use consumer-grade SSDs for Ceph, as they will
reach the DWPD limit very soon...
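As a worked example of that division (assuming size=3 and journals/WAL on the same devices):

  echo $(( 400 / 3 ))   # ~133 MB/s of client throughput out of ~400 MB/s of raw write capability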
On 4 January 2018 09:54:55 CET, "Rafał Wądołowski" wrote:
>Hi folks,
>
>
Hi folks,
I am currently benchmarking my cluster for a performance issue and I
have no idea what is going on. I am using these devices in QEMU.
Ceph version 12.2.2
Infrastructure:
3 x Ceph-mon
11 x Ceph-osd
Each Ceph-osd node has 22 x 1TB Samsung SSD 850 EVO
96GB RAM
2x E5-2650 v4
4x10G Netwo
Hello,
first and foremost, do yourself and everybody else a favor by thoroughly
searching the net, and thus the ML archives.
This kind of question has come up and been answered countless times.
On Thu, 6 Apr 2017 09:59:10 +0800 PYH wrote:
> what I meant is, when the total IOPS reach to 3000+, the t
What I meant is, when the total IOPS reaches 3000+, the whole cluster
gets very slow. So, any ideas? Thanks.
On 2017/4/6 9:51, PYH wrote:
Hi,
we have 21 hosts, each with 12 disks (4TB SATA), and no SSDs as journal or
cache tier, so the total OSD count is 21 x 12 = 252.
There are three separate hosts fo
Hi,
we have 21 hosts, each with 12 disks (4TB SATA), and no SSDs as journal or
cache tier, so the total OSD count is 21 x 12 = 252.
There are three separate hosts for the monitor nodes.
The network is 10Gbps and there are 3 replicas.
Under this setup, we can get only 3000+ IOPS for random writes for the whole
cluster.test
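As a rough sanity check on those numbers (assuming ~100 random-write IOPS per 7.2k SATA drive and a co-located filestore journal, i.e. two disk writes per client write):

  echo $(( 252 * 100 / (3 * 2) ))   # ~4200 IOPS, so ~3000 client IOPS is in the expected range for this hardware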
Hi,
1 - rados or rbd bug? We're using rados bench.
2 - This is not bandwidth related. If it were, it would happen almost
instantly and not 15 minutes after I start writing to the pool.
Once it has happened on the pool, I can then reproduce it with fewer
--concurrent-ios, like 12 or even 1.
T
Hello,
I didn't look at your video but I can already give you some leads:
1 - There is a bug in 10.2.2 which makes the client cache not work. The
client cache behaves as if it never received a flush, so it stays in
writethrough mode. This bug is cleared in 10.2.3 (see the sketch below).
2 - 2 SSDs in JBOD and 12 x 4TB
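For point 1, a hedged sketch of the librbd settings involved (the admin-socket path is an example). With the 10.2.2 bug described above, the cache never sees the flush that would switch it from writethrough to writeback, even with the usual defaults of rbd cache = true and rbd cache writethrough until flush = true in the [client] section. The values a running client actually uses can be checked through its admin socket:

  ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok config show | grep rbd_cache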
Hi,
We're having performance issues on a Jewel 10.2.2 cluster. It started
with IOs taking several seconds to be acknowledged so we did some
benchmarks.
We could reproduce it with a rados bench on a new pool set up on a single host
(R730xd with 2 SSDs in JBOD and 12 x 4TB NL-SAS in RAID0 writeback) wi
On Wed, Feb 17, 2016 at 12:13 AM, Christian Balzer wrote:
>
> Hello,
>
> On Tue, 16 Feb 2016 10:46:32 -0800 Cullen King wrote:
>
> > Thanks for the helpful commentary Christian. Cluster is performing much
> > better with 50% more spindles (12 to 18 drives), along with setting scrub
> > sleep to 0
Hello,
On Tue, 16 Feb 2016 10:46:32 -0800 Cullen King wrote:
> Thanks for the helpful commentary Christian. Cluster is performing much
> better with 50% more spindles (12 to 18 drives), along with setting scrub
> sleep to 0.1. Didn't see really any gain from moving from the Samsung 850
> Pro jou
Thanks for the helpful commentary Christian. Cluster is performing much
better with 50% more spindles (12 to 18 drives), along with setting scrub
sleep to 0.1. Didn't really see any gain from moving from the Samsung 850
Pro journal drives to Intel 3710s, even though dd and other direct tests
of th
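For reference, a hedged sketch of applying the scrub-sleep tuning mentioned above at runtime (newer releases would set this through the central config store instead):

  ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
  # and persist it in ceph.conf under [osd]:   osd scrub sleep = 0.1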
Thanks for the tuning tips Bob, I'll play with them after solidifying some
of my other fixes (another 24-48 hours before my migration to 1024
placement groups is finished).
Glad you enjoy ridewithgps, shoot me an email if you have any
questions/ideas/needs :)
On Fri, Feb 5, 2016 at 4:42 PM, Bob R
Cullen,
We operate a cluster with 4 nodes, each with 2x E5-2630, 64GB RAM, and 10x 4TB
spinners. We've recently replaced the 2x M550 journals with a single P3700 NVMe
drive per server and didn't see the performance gains we were hoping for.
After making the changes below we're now seeing significantly better
Hello,
On Thu, 4 Feb 2016 08:44:25 -0800 Cullen King wrote:
> Replies in-line:
>
> On Wed, Feb 3, 2016 at 9:54 PM, Christian Balzer
> wrote:
>
> >
> > Hello,
> >
> > On Wed, 3 Feb 2016 17:48:02 -0800 Cullen King wrote:
> >
> > > Hello,
> > >
> > > I've been trying to nail down a nasty perform
Replies in-line:
On Wed, Feb 3, 2016 at 9:54 PM, Christian Balzer
wrote:
>
> Hello,
>
> On Wed, 3 Feb 2016 17:48:02 -0800 Cullen King wrote:
>
> > Hello,
> >
> > I've been trying to nail down a nasty performance issue related to
> > scrubbing. I am mostly using radosgw with a handful of buckets
Hello,
On Wed, 3 Feb 2016 17:48:02 -0800 Cullen King wrote:
> Hello,
>
> I've been trying to nail down a nasty performance issue related to
> scrubbing. I am mostly using radosgw with a handful of buckets containing
> millions of various sized objects. When ceph scrubs, both regular and
> deep,
Hello,
I've been trying to nail down a nasty performance issue related to
scrubbing. I am mostly using radosgw with a handful of buckets containing
millions of various sized objects. When ceph scrubs, both regular and deep,
radosgw blocks on external requests, and my cluster has a bunch of request
Hello,
On Wed, 11 Nov 2015 07:12:56 +0000, Ben Town wrote:
> Hi Guys,
>
> I'm in the process of configuring a ceph cluster and am getting some
> less than ideal performance and need some help figuring it out!
>
> This cluster will only really be used for backup storage for Veeam so I
> don't ne
On a small cluster I've gotten great sequential performance by using btrfs
on the OSDs, a journal file (max sync interval ~180s), and the option
filestore journal parallel = true
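A minimal ceph.conf sketch of the settings described above (btrfs filestore with a journal file and a long sync interval); the values are simply the ones mentioned, not a recommendation:

  cat >> /etc/ceph/ceph.conf <<'EOF'
  [osd]
  filestore journal parallel = true
  filestore max sync interval = 180
  EOF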
2015-11-11 10:12 GMT+03:00 Ben Town :
> Hi Guys,
>
>
>
> I’m in the process of configuring a ceph cluster and am getting some less
Hi Guys,
I'm in the process of configuring a ceph cluster and am getting some less than
ideal performance and need some help figuring it out!
This cluster will only really be used for backup storage for Veeam so I don't
need a crazy amount of I/O but good sequential writes would be ideal.
At t
Dear Cephers,
I did a simple test to understand the performance loss of ceph. Here's my
environment:
CPU: 2 * Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Memory: 4 * 8G 1067 MHz
NIC: 2 * Intel Corporation 10-Gigabit X540-AT2
HDD:
1 * WDC WD1003FZEX ATA Disk 1TB
4 * Seagate ST2000NM0011 ATA Disk 2TB
-Original Message-
From: McNamara, Bradley [mailto:bradley.mcnam...@seattle.gov]
Sent: 04 February 2014 19:22
To: Maciej Bonin; Mark Nelson; ceph-users@lists.ceph.com
Subject: RE: [ceph-users] Performance issues running vmfs on top of Ceph
esday, February 04, 2014 11:01 AM
To: Maciej Bonin; Mark Nelson; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues running vmfs on top of Ceph
Hello again,
Having said that we seem to have improved the performance by following
http://kb.vmware.com/selfservice/microsites/sear
On Behalf Of Maciej Bonin
Sent: 04 February 2014 18:21
To: Mark Nelson; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues running vmfs on top of Ceph
Hello Mark,
Thanks for getting back to me. We do have a couple of vms running that were
migrated off xen that are fine, performa
age-
From: ceph-users-boun...@lists.ceph.com
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson
Sent: 04 February 2014 18:11
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues running vmfs on top of Ceph
On 02/04/2014 11:55 AM, Maciej Bonin wrote:
>
Also, how are you accessing Ceph - is it using the TGT iSCSI package?
On Tue, Feb 4, 2014 at 10:10 AM, Mark Nelson wrote:
> On 02/04/2014 11:55 AM, Maciej Bonin wrote:
>
>> Hello guys,
>>
>> We're testing running an esxi hv on top of a ceph backend and we're
>> getting abysmal performance when u
On 02/04/2014 11:55 AM, Maciej Bonin wrote:
Hello guys,
We're testing running an ESXi hypervisor on top of a Ceph backend and we're getting
abysmal performance when using VMFS; has anyone else tried this successfully? Any
advice?
Would be really thankful for any hints.
Hi!
I don't have a ton of expe
Hello guys,
We're testing running an ESXi hypervisor on top of a Ceph backend and we're getting
abysmal performance when using VMFS; has anyone else tried this successfully? Any
advice?
Would be really thankful for any hints.
Regards,
Maciej Bonin
Systems Engineer | M247 Limited
I need to restart the upload process again because all the objects
have a content-type of 'binary/octet-stream' instead of 'image/jpeg',
'image/png', etc. I plan on enabling monitoring this time so we can
see if there are any signs of what might be going on. Did you want me
to increase the number
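If the re-upload goes through s3cmd (an assumption; any S3 client that can set an explicit Content-Type would do), the type can be guessed or set per object at upload time:

  s3cmd put --guess-mime-type photo001.jpg s3://mybucket/photo001.jpg
  s3cmd put --mime-type=image/png logo.png s3://mybucket/logo.png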
Sorry, I meant to say the first four characters, for a total of 65539
buckets
On Thu, Sep 5, 2013 at 12:30 PM, Bryan Stillwell wrote:
> Wouldn't using only the first two characters in the file name result
> in less than 65k buckets being used?
>
> For example if the file names contained 0-9 and
On Thu, 5 Sep 2013, Bill Omer wrote:
> That's correct. We created 65k buckets, using two hex characters as the
> naming convention, then stored the files in each container based on their
> first two characters in the file name. The end result was 20-50 files per
> bucket. Once all of the buckets
based on your numbers, you were at something like an average of 186
objects per bucket at the 20 hour mark? I wonder how this trend
compares to what you'd see with a single bucket.
With that many buckets you should have indexes well spread across all of
the OSDs. It'd be interesting to know
On Thu, Sep 5, 2013 at 9:49 AM, Sage Weil wrote:
> On Thu, 5 Sep 2013, Bill Omer wrote:
>> That's correct. We created 65k buckets, using two hex characters as the
>> naming convention, then stored the files in each container based on their
>> first two characters in the file name. The end result
I'm using all defaults created with ceph-deploy
I will try the rgw cache setting. Do you have any other recommendations?
On Thu, Sep 5, 2013 at 1:14 PM, Yehuda Sadeh wrote:
> On Thu, Sep 5, 2013 at 9:49 AM, Sage Weil wrote:
> > On Thu, 5 Sep 2013, Bill Omer wrote:
> >> Thats correct. We cre
Mark,
Yesterday I blew away all the objects and restarted my test using
multiple buckets, and things are definitely better!
After ~20 hours I've already uploaded ~3.5 million objects, which is much
better than the ~1.5 million I did over ~96 hours this past
weekend. Unfortunately it seems that t
Wouldn't using only the first two characters in the file name result
in less than 65k buckets being used?
For example if the file names contained 0-9 and a-f, that would only
be 256 buckets (16*16). Or if they contained 0-9, a-z, and A-Z, that
would only be 3,844 buckets (62 * 62).
Bryan
On Th
On 09/05/2013 09:19 AM, Bill Omer wrote:
That's correct. We created 65k buckets, using two hex characters as the
naming convention, then stored the files in each container based on
their first two characters in the file name. The end result was 20-50
files per bucket. Once all of the buckets we
That's correct. We created 65k buckets, using two hex characters as the
naming convention, then stored the files in each container based on their
first two characters in the file name. The end result was 20-50 files per
bucket. Once all of the buckets were created and files were being loaded,
we
Just for clarification, distributing objects over lots of buckets isn't
helping improve small object performance?
The degradation over time is similar to something I've seen in the past,
with higher numbers of seeks on the underlying OSD device over time. Is
it always (temporarily) resolved w
We've actually done the same thing, creating 65k buckets and storing 20-50
objects in each. No real change, nothing noticeable anyway.
On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
wrote:
> So far I haven't seen much of a change. It's still working through
> removing the bucket that reached 1.5
So far I haven't seen much of a change. It's still working through
removing the bucket that reached 1.5 million objects though (my guess is
that'll take a few more days), so I believe that might have something to do
with it.
Bryan
On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson wrote:
> Bryan,
>
Bryan,
Good explanation. How's performance now that you've spread the load
over multiple buckets?
Mark
On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
Bill,
I've run into a similar issue with objects averaging ~100KiB. The
explanation I received on IRC is that there are scaling issues if y
Bill,
I've run into a similar issue with objects averaging ~100KiB. The
explanation I received on IRC is that there are scaling issues if you're
uploading them all to the same bucket because the index isn't sharded. The
recommended solution is to spread the objects out to a lot of buckets.
Howe
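A hypothetical sketch of that workaround, deriving the bucket from the first two characters of the object name (the bucket prefix, object name and use of s3cmd are all illustrative):

  obj=ab34f09c.jpg
  bucket="img-${obj:0:2}"                # -> img-ab
  s3cmd put "$obj" "s3://$bucket/$obj"   # assumes s3cmd points at the RGW endpoint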
I'm testing ceph for storing a very large number of small files. I'm
seeing some performance issues and would like to see if anyone could offer
any insight as to what I could do to correct this.
Some numbers:
Uploaded 184111 files, with an average file size of 5KB, using
10 separate servers to u