On Wed, Sep 14, 2016 at 3:48 PM, Heikki Linnakangas wrote:
> If we flushed the tree to a tape instead, then we could perhaps use the
> machinery that Peter's parallel B-tree patch is adding to tuplesort.c, to
> merge the tapes. I'm not sure if that works out, but I think it's worth some
> experime
On 01/17/2016 10:03 PM, Jeff Janes wrote:
> On Fri, Jan 15, 2016 at 3:29 PM, Peter Geoghegan wrote:
>> On Fri, Jan 15, 2016 at 2:38 PM, Constantin S. Pan wrote:
>>> I have a draft implementation which divides the whole process between
>>> N parallel workers, see the patch attached. Instead of a full scan
On Fri, Jan 15, 2016 at 5:38 PM, Constantin S. Pan wrote:
> In its current state the implementation is just a proof of concept
> and it has all the configuration hardcoded, but it already works as is,
> though it does not speed up the build process by more than 4 times on my
> configuration (12 CPUs). Th
On Sun, Jan 17, 2016 at 12:03 PM, Jeff Janes wrote:
> I think it would take a lot of changes to tuple sort to make this be an
> almost-always win.
>
> In the general case each GIN key occurs in many tuples, and the
> in-memory rbtree is good at compressing the tid list for each key to
> maximize th
On Fri, Jan 15, 2016 at 3:29 PM, Peter Geoghegan wrote:
> On Fri, Jan 15, 2016 at 2:38 PM, Constantin S. Pan wrote:
>> I have a draft implementation which divides the whole process between
>> N parallel workers, see the patch attached. Instead of a full scan of
>> the relation, I give each worker
On Fri, 15 Jan 2016 15:29:51 -0800
Peter Geoghegan wrote:
> On Fri, Jan 15, 2016 at 2:38 PM, Constantin S. Pan wrote:
> Even without parallelism, wouldn't it be better if GIN indexes were
> built using tuplesort? I know way way less about the gin am than the
> nbtree am, but I imagine that a p
On Fri, Jan 15, 2016 at 2:38 PM, Constantin S. Pan wrote:
> I have a draft implementation which divides the whole process between
> N parallel workers, see the patch attached. Instead of a full scan of
> the relation, I give each worker a range of blocks to read.
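
The block-range split quoted above can be sketched roughly as follows. This is only an illustration, not code from the patch: the function name split_block_ranges and the policy of contiguous, near-equal ranges are assumptions made for the example, and the actual patch may assign blocks differently.

```python
def split_block_ranges(nblocks, nworkers):
    """Split blocks [0, nblocks) into contiguous half-open (start, end) ranges.

    Hypothetical helper: at most `nworkers` ranges are returned, and range
    sizes differ by at most one block. Workers that would get no blocks
    are simply omitted.
    """
    if nblocks < 0 or nworkers <= 0:
        raise ValueError("nblocks must be >= 0 and nworkers must be > 0")

    # Every worker gets `base` blocks; the first `extra` workers get one more.
    base, extra = divmod(nblocks, nworkers)

    ranges = []
    start = 0
    for worker in range(nworkers):
        length = base + (1 if worker < extra else 0)
        if length == 0:
            break  # fewer blocks than workers: nothing left to hand out
        ranges.append((start, start + length))
        start += length
    return ranges


if __name__ == "__main__":
    # Each tuple is the half-open block range one worker would scan.
    for start, end in split_block_ranges(10, 3):
        print(f"worker scans blocks [{start}, {end})")
```

Each worker would then scan only its own range instead of the full relation. How the per-worker results are combined is a separate question that this sketch does not address.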
I am currently working on a patch
Hi, Hackers.
The task of building GIN can require lots of time and eats 100% CPU,
but we could easily make it use more than 100%, especially since we
now have parallel workers in postgres.
The process of building GIN looks like this:
1. Accumulate a batch of index records into an rbtree in m