Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-27 Thread Mark Nelson
On 06/27/2016 03:12 AM, Blair Bethwaite wrote: On 25 Jun 2016 6:02 PM, "Kyle Bader" <kyle.ba...@gmail.com> wrote: fdatasync takes longer when you have more inodes in the slab caches, it's the double-edged sword of vfs_cache_pressure. That's a bit sad when, iiuc, it's only journals doing

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-27 Thread Blair Bethwaite
On 25 Jun 2016 6:02 PM, "Kyle Bader" wrote: > fdatasync takes longer when you have more inodes in the slab caches, it's the double edged sword of vfs_cache_pressure. That's a bit sad when, iiuc, it's only journals doing fdatasync in the Ceph write path. I'd have expected the vfs to handle this on
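The fdatasync cost described above can be felt directly. Below is a hypothetical micro-benchmark (not from the thread) that times a single fdatasync() after a small append, i.e. the syscall the FileStore journal issues on the write path; on a box whose inode/dentry slabs have ballooned, this latency is what grows.

```python
import os
import tempfile
import time

# os.fdatasync is Linux-specific; fall back to fsync elsewhere.
datasync = getattr(os, "fdatasync", os.fsync)

def timed_fdatasync(path, payload=b"x" * 4096):
    """Append a small payload and time the data sync, in seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, payload)
        t0 = time.perf_counter()
        datasync(fd)  # the call that slows down as slab caches grow
        return time.perf_counter() - t0
    finally:
        os.close(fd)

with tempfile.NamedTemporaryFile() as tmp:
    elapsed = timed_fdatasync(tmp.name)
```

Running this in a loop while artificially inflating the inode cache (e.g. a parallel `find` over millions of files) is one way to reproduce the effect being discussed.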

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-24 Thread Christian Balzer
> >>Best Regards, > >>Wade > >> > >> > >>On Thu, Jun 23, 2016 at 8:09 PM, Somnath Roy > >>wrote: > >>> Oops , typo , 128 GB :-)... > >>> > >>> -Original Message- > >>> From: Christian

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-24 Thread Wade Holler
On Thu, Jun 23, 2016 at 8:09 PM, Somnath Roy >>wrote: >>> Oops , typo , 128 GB :-)... >>> >>> -Original Message- >>> From: Christian Balzer [mailto:ch...@gol.com] >>> Sent: Thursday, June 23, 2016 5:08 PM >>> To: ceph-users@lists.ceph.com >>> C

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-24 Thread Warren Wang - ISD
's still >>> >>>>>doing over 15000 write IOPS all day long with 302 spinning >>> >>>>>drives >>> >>>>>+ SATA SSD journals. Having enough memory and dropping your >>> >>>>>vfs_cache_pressure should a

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-24 Thread Wade Holler
Somnath Roy; Warren Wang - ISD; Wade Holler; Blair Bethwaite; Ceph > Development > Subject: Re: [ceph-users] Dramatic performance drop at certain number of > objects in pool > > > Hello, > > On Thu, 23 Jun 2016 22:24:59 + Somnath Roy wrote: > >> Or even vm.vfs_cache_pressure

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-23 Thread Somnath Roy
to go to disk for things that normally would be in memory. > >>>> > >>>> Looking at Blair's graph from yesterday pretty much makes that > >>>>clear, a purely split caused degradation should have relented > >>>>much quicker.

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-23 Thread Christian Balzer
-----Original Message----- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Warren Wang - ISD Sent: Thursday, June 23, 2016 3:09 PM > To: Wade Holler; Blair Bethwaite > Cc: Ceph Development; ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Dramatic performance drop at certain

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-23 Thread Somnath Roy
have relented much >>>>quicker. >>>> >>>> >>>>> Keep in mind that if you change the values, it won't take effect >>>>> immediately. It only merges them back if the directory is under >>>>> the calculated threshold
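The "calculated threshold" being discussed is the FileStore subdirectory split point, commonly documented as `filestore_split_multiple * abs(filestore_merge_threshold) * 16` files per directory. A minimal sketch of that arithmetic (the defaults below are the long-standing FileStore defaults; treat the merge side as qualitative, since the exact merge condition differs):

```python
def filestore_split_threshold(split_multiple=2, merge_threshold=10):
    """Files per FileStore PG subdirectory beyond which it splits,
    per the commonly documented formula:
        split_multiple * abs(merge_threshold) * 16
    """
    return split_multiple * abs(merge_threshold) * 16

# With defaults, each subdirectory splits at 320 files:
default_limit = filestore_split_threshold()                 # 2 * 10 * 16 = 320
# Raising filestore_split_multiple pushes the split point out:
tuned_limit = filestore_split_threshold(split_multiple=8)   # 8 * 10 * 16 = 1280
```

This is why tuning the values after the fact does not help immediately: directories that already split only merge back once their file counts fall below the (much lower) merge threshold and a write (or possibly read) touches them.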

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-23 Thread Warren Wang - ISD
yesterday pretty much makes that clear, >>>>a >>>> purely split caused degradation should have relented much quicker. >>>> >>>> >>>>> Keep in mind that if you change the values, it won't take effect >>>>> immediately. It only

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Christian Balzer
Hello Blair, hello Wade (see below), On Thu, 23 Jun 2016 12:55:17 +1000 Blair Bethwaite wrote: > On 23 June 2016 at 12:37, Christian Balzer wrote: > > Case in point, my main cluster (RBD images only) with 18 5+TB OSDs on 3 > > servers (64GB RAM each) has 1.8 million 4MB RBD objects using about
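The back-of-envelope arithmetic behind Christian's point (figures taken from the post: 1.8 million 4 MB RBD objects on 18 OSDs of roughly 5 TB each) can be sketched as:

```python
# Rough arithmetic, using the figures quoted in the post above.
TB = 1000 ** 4
MB = 1000 ** 2

objects = 1_800_000
object_size = 4 * MB
raw_capacity = 18 * 5 * TB          # 90 TB raw across 18 OSDs

data_stored = objects * object_size             # 7.2 TB of object data
fraction_of_raw = data_stored / raw_capacity    # ~8% of raw space

# Objects needed to fill the raw capacity at 4 MB each:
objects_at_full = raw_capacity // object_size   # 22.5 million
```

The ~8% of raw space is consistent with the "about 7% of available space" in the post once replication overhead is accounted for, and 22.5 million objects at full capacity is far below the ~320 million small objects at which Wade's problem appeared, supporting the "run out of space first" observation for 4 MB RBD workloads.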

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
On 23 June 2016 at 12:37, Christian Balzer wrote: > Case in point, my main cluster (RBD images only) with 18 5+TB OSDs on 3 > servers (64GB RAM each) has 1.8 million 4MB RBD objects using about 7% of > the available space. > Don't think I could hit this problem before running out of space. Perhap

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Christian Balzer
a read, I forget). > >>> > >> If it's a read a plain scrub might do the trick. > >> > >> Christian > >>> Warren > >>> > >>> > >>> From: ceph-users > >>> <ceph-users-boun...@lists.ceph.com>

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
Hi Christian, Ah ok, I didn't see object size mentioned earlier. But I guess direct rados small objects would be a rarish use-case and explains the very high object counts. I'm interested in finding the right balance for RBD given object size is another variable that can be tweaked there. I recall

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Wade Holler
No. Our application writes very small objects. On Wed, Jun 22, 2016 at 10:01 PM, Blair Bethwaite wrote: > On 23 June 2016 at 11:41, Wade Holler wrote: >> Workload is native librados with python. ALL 4k objects. > > Was that meant to be 4MB? > > -- > Cheers, > ~Blairo

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Christian Balzer
On Thu, 23 Jun 2016 12:01:38 +1000 Blair Bethwaite wrote: > On 23 June 2016 at 11:41, Wade Holler wrote: > > Workload is native librados with python. ALL 4k objects. > > Was that meant to be 4MB? > Nope, he means 4K, he's putting lots of small objects via a python script into the cluster to test

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
On 23 June 2016 at 11:41, Wade Holler wrote: > Workload is native librados with python. ALL 4k objects. Was that meant to be 4MB? -- Cheers, ~Blairo

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Wade Holler
calculated threshold and a write occurs (maybe a read, I forget). >>>> >>> If it's a read a plain scrub might do the trick. >>> >>> Christian >>>> Warren >>>> >>>> >>>> From: ceph-users

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
on behalf of Wade Holler >>> <wade.hol...@gmail.com> Date: Monday, June >>> 20, 2016 at 2:48 PM To: Blair Bethwaite >>> <blair.bethwa...@gmail.com>, Wido den >>> Hollander <w...@42on.com> Cc: Ceph Development

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Wade Holler
>> on behalf of Wade Holler >> <wade.hol...@gmail.com> Date: Monday, June >> 20, 2016 at 2:48 PM To: Blair Bethwaite >> <blair.bethwa...@gmail.com>, Wido den >> Hollander <w...@42on.com> Cc: Ceph Development >> <ceph-de...

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-20 Thread Christian Balzer
<ceph-de...@vger.kernel.org>, > "ceph-users@lists.ceph.com" > Subject: > Re: [ceph-users] Dramatic performance drop at certain number of objects > in pool > > Thanks everyone for your

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-20 Thread Warren Wang - ISD
June 20, 2016 at 2:48 PM To: Blair Bethwaite <blair.bethwa...@gmail.com>, Wido den Hollander <w...@42on.com> Cc: Ceph Development <ceph-de...@vger.kernel.org>, "ceph-users@lists.ceph.com"

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-20 Thread Wade Holler
Thanks everyone for your replies. I sincerely appreciate it. We are testing with different pg_num and filestore_split_multiple settings. Early indications are, well, not great. Regardless, it is nice to understand the symptoms better so we can try to design around them. Best Regards, Wade On Mon,
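A hedged back-of-envelope check for this kind of tuning: assuming objects spread evenly over PGs and that a PG's directory first splits once it exceeds `filestore_split_multiple * abs(filestore_merge_threshold) * 16` files (the commonly documented split point), one can estimate whether a given pool will hit splitting at all:

```python
def first_split_expected(total_objects, pg_num,
                         split_multiple=2, merge_threshold=10):
    """True if the average per-PG object count exceeds the FileStore
    split point. Assumes an even object distribution over PGs; the
    defaults are the long-standing FileStore defaults."""
    per_pg = total_objects / pg_num
    threshold = split_multiple * abs(merge_threshold) * 16
    return per_pg > threshold

# Hypothetical figures: 320 million objects in a 4096-PG pool puts
# ~78k objects in each PG, far past the default 320-file split point.
crowded = first_split_expected(320_000_000, 4096)
# A smaller pool with a raised split multiple stays under it
# (~244 objects per PG vs. a 1280-file split point).
roomy = first_split_expected(1_000_000, 4096, split_multiple=8)
```

This ignores the recursive nature of splitting (each split subdirectory can split again), so it only answers "will splitting start", not how deep the hierarchy eventually gets.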

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-19 Thread Blair Bethwaite
On 20 June 2016 at 09:21, Blair Bethwaite wrote: > slow request issues). If you watch your xfs stats you'll likely get > further confirmation. In my experience xs_dir_lookups balloons (which > means directory lookups are missing cache and going to disk). Murphy's a bitch. Today we upgraded a cluster
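For watching the counter Blair mentions: on Linux the XFS directory-operation counters live on the `dir` line of `/proc/fs/xfs/stat`, whose first value is the lookup counter. A minimal parsing sketch (a sample string stands in for the live file so the snippet runs anywhere; the field names are assumptions based on the stat layout):

```python
def parse_xfs_dir_stats(stat_text):
    """Extract the directory-operation counters from /proc/fs/xfs/stat
    content. The first value on the 'dir' line is the lookup counter
    that balloons when lookups miss the cache and go to disk."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "dir":
            names = ("xs_dir_lookup", "xs_dir_create",
                     "xs_dir_remove", "xs_dir_getdents")
            return dict(zip(names, map(int, fields[1:])))
    return {}

# Sample content; in practice: open("/proc/fs/xfs/stat").read()
sample = "extent_alloc 1 2 3 4\ndir 150000 2000 1800 900\n"
stats = parse_xfs_dir_stats(sample)
```

Sampling this periodically and diffing `xs_dir_lookup` gives the lookup rate; a sustained spike alongside read-heavy iostat output is the signature discussed in this thread.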

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-19 Thread Christian Balzer
Hello Blair, On Mon, 20 Jun 2016 09:21:27 +1000 Blair Bethwaite wrote: > Hi Wade, > > (Apologies for the slowness - AFK for the weekend). > > On 16 June 2016 at 23:38, Wido den Hollander wrote: > > > >> On 16 June 2016 at 14:14, Wade Holler wrote: > >> > >> > >> Hi All, > >> > >> I have a repeatable

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-19 Thread Blair Bethwaite
Hi Wade, (Apologies for the slowness - AFK for the weekend). On 16 June 2016 at 23:38, Wido den Hollander wrote: > >> On 16 June 2016 at 14:14, Wade Holler wrote: >> >> >> Hi All, >> >> I have a repeatable condition when the object count in a pool gets to >> 320-330 million the object write time

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-16 Thread Wade Holler
Blairo, That's right, I do see "lots" of READ IO! If I compare the "bad (330Mil)" pool with the new test (good) pool: iostat while running to the "good" pool shows almost all writes; iostat while running to the "bad" pool has VERY large read spikes, with almost no writes. Sounds like you have a

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-16 Thread Blair Bethwaite
Hi Wade, What IO are you seeing on the OSD devices when this happens (see e.g. iostat)? Are there short periods of high read IOPS where (almost) no writes occur? What does your memory usage look like (including slab)? Cheers, On 16 June 2016 at 22:14, Wade Holler wrote: > Hi All, > > I have a repeatable
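The slab check Blair asks for can be read from `/proc/meminfo` on Linux. A hedged sketch (a sample string stands in for the live file so it runs anywhere):

```python
def slab_kb(meminfo_text):
    """Pull the slab figures from /proc/meminfo content. Inode and
    dentry caches live in the slab; SReclaimable is the part the
    kernel can drop under pressure. Values are in kB."""
    out = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("Slab", "SReclaimable", "SUnreclaim"):
            out[key] = int(rest.split()[0])
    return out

# Sample content; in practice: open("/proc/meminfo").read()
sample = ("MemTotal:       65805316 kB\n"
          "Slab:            4200000 kB\n"
          "SReclaimable:    3900000 kB\n"
          "SUnreclaim:       300000 kB\n")
usage = slab_kb(sample)
```

A multi-GB `Slab` dominated by `SReclaimable` on an OSD host is consistent with the huge inode/dentry caches this thread converges on as the culprit.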