Re: [HACKERS] _mdfd_getseg can be expensive

2016-09-08 Thread Andres Freund
On 2016-08-31 15:15:16 -0700, Peter Geoghegan wrote: > On Wed, Aug 31, 2016 at 3:08 PM, Andres Freund wrote: > > On August 31, 2016 3:06:23 PM PDT, Peter Geoghegan wrote: > > > >>In other painfully pedantic news, I should point out that > >>sizeof(size_t) isn't necessarily word size (the most gen

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
On Wed, Aug 31, 2016 at 3:08 PM, Andres Freund wrote: > On August 31, 2016 3:06:23 PM PDT, Peter Geoghegan wrote: > >>In other painfully pedantic news, I should point out that >>sizeof(size_t) isn't necessarily word size (the most generic >>definition of word size for the architecture), contrary

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Andres Freund
On August 31, 2016 3:06:23 PM PDT, Peter Geoghegan wrote: >In other painfully pedantic news, I should point out that >sizeof(size_t) isn't necessarily word size (the most generic >definition of word size for the architecture), contrary to my reading >of the 0002-* patch comments. I'm mostly tal

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
On Wed, Aug 31, 2016 at 2:37 PM, Andres Freund wrote: >> This looks good. > > Thanks for looking! No problem. In other painfully pedantic news, I should point out that sizeof(size_t) isn't necessarily word size (the most generic definition of word size for the architecture), contrary to my readi

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Andres Freund
On 2016-08-31 14:09:47 -0700, Peter Geoghegan wrote: > On Thu, Aug 18, 2016 at 5:26 PM, Andres Freund wrote: > > Rebased version attached. A review would be welcome. Plan to push this > > forward otherwise in the not too far away future. > > This looks good. Thanks for looking! > The only thin

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
On Wed, Aug 31, 2016 at 2:09 PM, Peter Geoghegan wrote: > The only thing that stuck out to any degree is that we don't grow the > "reln->md_seg_fds[forknum]" array within the new _fdvec_resize() > function geometrically. That new function looks like this: > +static void > +_fdvec_resize(SMgrRela

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
On Thu, Aug 18, 2016 at 5:26 PM, Andres Freund wrote: > Rebased version attached. A review would be welcome. Plan to push this > forward otherwise in the not too far away future. This looks good. The only thing that stuck out to any degree is that we don't grow the "reln->md_seg_fds[forknum]" ar

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Peter Geoghegan
On Thu, Aug 18, 2016 at 5:42 PM, Andres Freund wrote: > How large was the index & table in question? I mean this really only > comes into effect at 100+ segments. Not that big, but I see no reason to take the chance, I suppose. -- Peter Geoghegan

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Andres Freund
On 2016-08-18 17:35:47 -0700, Peter Geoghegan wrote: > On Thu, Aug 18, 2016 at 5:28 PM, Andres Freund wrote: > >> I can review this next week. > > > > Thanks > > Given the time frame that you have in mind, I won't revisit the > question of the parallel CLUSTER CPU bottleneck issue until this is > co

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Peter Geoghegan
On Thu, Aug 18, 2016 at 5:28 PM, Andres Freund wrote: >> I can review this next week. > > Thanks Given the time frame that you have in mind, I won't revisit the question of the parallel CLUSTER CPU bottleneck issue until this is committed. The patch might change things enough that that would be a wa

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Andres Freund
On 2016-08-18 17:27:59 -0700, Peter Geoghegan wrote: > On Thu, Aug 18, 2016 at 5:26 PM, Andres Freund wrote: > > Rebased version attached. A review would be welcome. Plan to push this > > forward otherwise in the not too far away future. > > I can review this next week. Thanks

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Peter Geoghegan
On Thu, Aug 18, 2016 at 5:26 PM, Andres Freund wrote: > Rebased version attached. A review would be welcome. Plan to push this > forward otherwise in the not too far away future. I can review this next week. -- Peter Geoghegan

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Andres Freund
On 2016-06-30 18:14:15 -0700, Peter Geoghegan wrote: > On Tue, Dec 15, 2015 at 10:04 AM, Andres Freund wrote: > > Took a while. But here we go. The attached version is a significantly > > revised version of my earlier patch. Notably I've pretty much entirely > > revised the code in _mdfd_getseg()

Re: [HACKERS] _mdfd_getseg can be expensive

2016-07-01 Thread Peter Geoghegan
On Thu, Jun 30, 2016 at 7:08 PM, Andres Freund wrote: > If you have a big enough index (maybe ~150GB+), sure. Before that, > probably not. > > It's usually pretty easy to see in cpu profiles whether this issue > exists. I think that this is a contributing factor to why merging in parallel CREATE

Re: [HACKERS] _mdfd_getseg can be expensive

2016-06-30 Thread Andres Freund
On 2016-06-30 18:34:20 -0700, Peter Geoghegan wrote: > On Thu, Jun 30, 2016 at 6:23 PM, Andres Freund wrote: > > I plan to, once the tree opens again. Likely needs some considerable > > updates for recent changes. > > Offhand, do you think that CREATE INDEX calls to smgrextend() could be > apprec

Re: [HACKERS] _mdfd_getseg can be expensive

2016-06-30 Thread Peter Geoghegan
On Thu, Jun 30, 2016 at 6:23 PM, Andres Freund wrote: > I plan to, once the tree opens again. Likely needs some considerable > updates for recent changes. Offhand, do you think that CREATE INDEX calls to smgrextend() could be appreciably affected by this bottleneck? If that's a very involved or d

Re: [HACKERS] _mdfd_getseg can be expensive

2016-06-30 Thread Andres Freund
On 2016-06-30 18:14:15 -0700, Peter Geoghegan wrote: > On Tue, Dec 15, 2015 at 10:04 AM, Andres Freund wrote: > > Took a while. But here we go. The attached version is a significantly > > revised version of my earlier patch. Notably I've pretty much entirely > > revised the code in _mdfd_getseg()

Re: [HACKERS] _mdfd_getseg can be expensive

2016-06-30 Thread Peter Geoghegan
On Tue, Dec 15, 2015 at 10:04 AM, Andres Freund wrote: > Took a while. But here we go. The attached version is a significantly > revised version of my earlier patch. Notably I've pretty much entirely > revised the code in _mdfd_getseg() to be more similar to the state in > master. Also some commen

Re: [HACKERS] _mdfd_getseg can be expensive

2015-12-15 Thread Andres Freund
On 2014-11-01 18:23:47 +0100, Andres Freund wrote: > On 2014-11-01 12:57:40 -0400, Tom Lane wrote: > > Andres Freund writes: > > > On 2014-10-31 18:48:45 -0400, Tom Lane wrote: > > >> While the basic idea is sound, this particular implementation seems > > >> pretty bizarre. What's with the "md_se

Re: [HACKERS] _mdfd_getseg can be expensive

2014-11-01 Thread Andres Freund
On 2014-11-01 12:57:40 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2014-10-31 18:48:45 -0400, Tom Lane wrote: > >> While the basic idea is sound, this particular implementation seems > >> pretty bizarre. What's with the "md_seg_no" stuff, and why is that > >> array typed size_t? > > >

Re: [HACKERS] _mdfd_getseg can be expensive

2014-11-01 Thread Tom Lane
Andres Freund writes: > On 2014-10-31 18:48:45 -0400, Tom Lane wrote: >> While the basic idea is sound, this particular implementation seems >> pretty bizarre. What's with the "md_seg_no" stuff, and why is that >> array typed size_t? > It stores the length of the array of _MdfdVec entries. Oh.

Re: [HACKERS] _mdfd_getseg can be expensive

2014-10-31 Thread Andres Freund
On 2014-10-31 18:48:45 -0400, Tom Lane wrote: > Andres Freund writes: > > I wrote the attached patch that gets rid of that essentially quadratic > > behaviour, by replacing the mdfd chain/singly linked list with an > > array. Since we seldom grow files by a whole segment I can't see the > > slig

Re: [HACKERS] _mdfd_getseg can be expensive

2014-10-31 Thread Tom Lane
Andres Freund writes: > I wrote the attached patch that gets rid of that essentially quadratic > behaviour, by replacing the mdfd chain/singly linked list with an > array. Since we seldom grow files by a whole segment I can't see the > slightly bigger memory reallocations matter significantly. I

Re: [HACKERS] _mdfd_getseg can be expensive

2014-10-31 Thread Andres Freund
Hi, On 2014-03-31 12:10:01 +0200, Andres Freund wrote: > I recently have seen some perf profiles in which _mdfd_getseg() was in > the top #3 when VACUUMing large (~200GB) relations. Called by mdread(), > mdwrite(). Looking at its implementation, I am not surprised. It > iterates over all segment
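
For readers skimming the thread, here is a rough sketch of the idea under discussion: looking up segment N through a singly linked chain costs a walk of N links per call, while an array indexed by segment number is a constant-time lookup. This is not the PostgreSQL code or the posted patch. The names ListSeg, SegArray, list_getseg, segarray_append and segarray_getseg are hypothetical, and the capacity doubling is an assumption that follows the geometric-growth point raised in review rather than anything confirmed to be in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a chain of open segments (not PostgreSQL's struct). */
typedef struct ListSeg
{
    int             fd;     /* descriptor of this segment */
    struct ListSeg *next;   /* next segment, or NULL */
} ListSeg;

/* Chain lookup: follows segno links, so each call costs O(segno). */
static ListSeg *
list_getseg(ListSeg *head, size_t segno)
{
    ListSeg *v = head;

    while (v != NULL && segno-- > 0)
        v = v->next;
    return v;               /* NULL if the chain is shorter than segno + 1 */
}

/* Hypothetical array-based replacement: one slot per segment number. */
typedef struct SegArray
{
    int    *fds;            /* fds[i] is the descriptor of segment i */
    size_t  nsegs;          /* number of slots in use */
    size_t  capacity;       /* number of slots allocated */
} SegArray;

/*
 * Append one segment. Capacity doubles when full (an assumption of this
 * sketch), so n appends cost O(n) in total. Returns 0 on success, -1 if
 * the allocation fails.
 */
static int
segarray_append(SegArray *a, int fd)
{
    assert(a != NULL);
    if (a->nsegs == a->capacity)
    {
        size_t  newcap = a->capacity ? a->capacity * 2 : 4;
        int    *p = realloc(a->fds, newcap * sizeof(int));

        if (p == NULL)
            return -1;
        a->fds = p;
        a->capacity = newcap;
    }
    a->fds[a->nsegs++] = fd;
    return 0;
}

/* Array lookup: O(1). Returns -1 if that segment is not open. */
static int
segarray_getseg(const SegArray *a, size_t segno)
{
    assert(a != NULL);
    return segno < a->nsegs ? a->fds[segno] : -1;
}
```

With the chain, touching every block of a relation with S segments performs on the order of S link traversals per access, which is where the roughly quadratic cost mentioned in the thread comes from. With the array, each access is a single index operation.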