On Mon, Aug 31, 2020 at 03:13:06PM -0700, Melanie Plageman wrote:
> Attached is the current version of adaptive hash join with two
> significant changes as compared to v10:
The CF bot is complaining about a regression test failure:
@@ -2465,7 +2465,7 @@
Gather (actual rows=469 loops=1)
Worke
On Thu, Jun 25, 2020 at 03:09:44PM -0700, Melanie Plageman wrote:
On Tue, Jun 23, 2020 at 3:24 PM Tomas Vondra
wrote:
I started looking at the patch to refresh my knowledge both of this
patch and parallel hash join, but I think it needs a rebase. The
changes in 7897e3bb90 apparently touched so
Hi Tomas,
On Tue, Jun 23, 2020 at 3:24 PM Tomas Vondra wrote:
>
> Now, a couple comments / questions about the code.
>
>
> nodeHash.c
> --
>
>
> 1) MultiExecPrivateHash says this
>
>/*
> * Not subject to skew optimization, so either insert normally
> * or save to batch file if
On Mon, Jun 08, 2020 at 05:12:25PM -0700, Melanie Plageman wrote:
On Wed, May 27, 2020 at 7:25 PM Melanie Plageman
wrote:
I've attached a rebased patch which includes the "provisionally detach"
deadlock hazard fix approach
Alas, the "provisional detach" logic proved incorrect (see last poin
On Wed, Apr 29, 2020 at 4:44 PM David Kimura wrote:
>
> Following patch adds logic to create a batch 0 file for serial hash join so
> that even in the pathological case we do not need to exceed work_mem.
Updated the patch to spill batch 0 tuples after it is marked as fallback.
A couple questions fro
On Fri, May 1, 2020 at 2:30 AM Melanie Plageman
wrote:
> I made a few edits to the message and threw it into a draft patch (on
> top of master, of course). I didn't want to junk up people's inboxes, so
> I didn't start a separate thread, but, it will be pretty hard to
> collaboratively edit the co
On 2020-Apr-30, Melanie Plageman wrote:
> On Tue, Apr 28, 2020 at 11:50 PM Heikki Linnakangas wrote:
> > I haven't looked at the patch in detail, but thanks [...]
> Thanks for taking a look, Heikki!
Hmm. We don't have Heikki's message in the archives. In fact, the last
message from Heikki we
On Tue, Apr 28, 2020 at 11:50 PM Heikki Linnakangas wrote:
> On 29/04/2020 05:03, Melanie Plageman wrote:
> > I've attached a patch which should address some of the previous feedback
> > about code complexity. Two of my co-workers and I wrote what is
> > essentially a new prototype of the idea. I
On Wed, Apr 29, 2020 at 4:39 PM Melanie Plageman
wrote:
>
> In addition to many assorted TODOs in the code, there are a few major
> projects left:
> - Batch 0 falling back
> - Stripe barrier deadlock
> - Performance improvements and testing
>
Batch 0 never spills. That behavior is an artifact of
On Tue, Jan 7, 2020 at 4:14 PM Thomas Munro wrote:
> * I am uneasy about BarrierArriveExplicitAndWait() (a variant of
> BarrierArriveAndWait() that lets you skip directly to a given phase?);
> perhaps you only needed that for a circular phase system, which you
> could do with modular phase number
On Mon, Dec 30, 2019 at 4:34 PM Melanie Plageman
wrote:
> So, I finally have a prototype to share of parallel hashloop fallback.
Hi Melanie,
Thanks for all your continued work on this! I started looking at it
today; it's a difficult project and I think it'll take me a while to
grok. I do have
On Fri, Sep 06, 2019 at 10:54:13AM -0700, Melanie Plageman wrote:
On Thu, Sep 5, 2019 at 10:35 PM Thomas Munro wrote:
Seems like a good time for me to try to summarise what I think the
main problems are here:
1. The match-bit storage problem already discussed. The tuples that
each process r
On Thu, Sep 5, 2019 at 10:35 PM Thomas Munro wrote:
> Seems like a good time for me to try to summarise what I think the
> main problems are here:
>
> 1. The match-bit storage problem already discussed. The tuples that
> each process receives while reading from SharedTupleStore are
> non-determ
On Wed, Jul 31, 2019 at 6:47 AM Melanie Plageman
wrote:
> So, I've rewritten the patch to use a BufFile for the outer table
> batch file tuples' match statuses and write bytes to and from the file
> which start as 0 and, upon encountering a match for a tuple, I set its
> bit in the file to 1 (also
On Tue, Jul 30, 2019 at 8:07 PM Melanie Plageman
wrote:
> For the actual write to disk, I'm pretty sure I get that for free from
> the BufFile API, no?
> I was more thinking about optimizing when I call BufFileWrite at all.
Right. Clearly several existing buffile.c users regularly have very
small
On Tue, Jul 30, 2019 at 4:36 PM Robert Haas wrote:
> On Tue, Jul 30, 2019 at 2:47 PM Melanie Plageman
> wrote:
> > I did the "needlessly dumb implementation" Robert mentioned, though,
> > I thought about it and couldn't come up with a much smarter way to
> > write match bits to a file. I think t
On Tue, Jul 30, 2019 at 2:47 PM Melanie Plageman
wrote:
> I did the "needlessly dumb implementation" Robert mentioned, though,
> I thought about it and couldn't come up with a much smarter way to
> write match bits to a file. I think there might be an optimization
> opportunity in not writing the
So, I've rewritten the patch to use a BufFile for the outer table
batch file tuples' match statuses and write bytes to and from the file
which start as 0 and, upon encountering a match for a tuple, I set its
bit in the file to 1 (also rebased with current master).
It, of course, only works for par
On Wed, Jul 03, 2019 at 02:22:09PM -0700, Melanie Plageman wrote:
On Tue, Jun 18, 2019 at 3:24 PM Melanie Plageman
wrote:
These questions will probably make a lot more sense with corresponding
code, so I will follow up with the second version of the state machine
patch once I finish it.
I
On Tue, Jun 18, 2019 at 3:24 PM Melanie Plageman
wrote:
>
> These questions will probably make a lot more sense with corresponding
> code, so I will follow up with the second version of the state machine
> patch once I finish it.
>
>
I have changed the state machine and resolved the questions I h
On Thu, Jun 13, 2019 at 7:10 AM Robert Haas wrote:
> On Tue, Jun 11, 2019 at 2:35 PM Melanie Plageman
> wrote:
> > Let me try to articulate what I think the bitmap implementation would
> look
> > like:
> >
> > Before doing chunked hashloop join for any batch, we would need to
> > know how many t
On Tue, Jun 11, 2019 at 2:35 PM Melanie Plageman
wrote:
> Let me try to articulate what I think the bitmap implementation would look
> like:
>
> Before doing chunked hashloop join for any batch, we would need to
> know how many tuples are in the outer batch to make the bitmap the
> correct size.
On Fri, Jun 7, 2019 at 7:30 AM Robert Haas wrote:
> On Thu, Jun 6, 2019 at 7:31 PM Melanie Plageman
> wrote:
> > I'm not sure I understand why you would need to compare the original
> > tuples to the unmatched tuples file.
>
> I think I was confused. Actually, I'm still not sure I understand th
On Fri, Jun 7, 2019 at 10:17 AM Tomas Vondra
wrote:
> Yes, they could get quite big, and I think you're right we need to
> keep that in mind, because it's on the outer (often quite large) side of
> the join. And if we're aiming to restrict memory usage, it'd be weird to
> just ignore this.
>
> But
On Thu, Jun 6, 2019 at 7:31 PM Melanie Plageman
wrote:
> I'm not sure I understand why you would need to compare the original
> tuples to the unmatched tuples file.
I think I was confused. Actually, I'm still not sure I understand this part:
> Then, you iterate again through the outer side a th
On Thu, Jun 06, 2019 at 04:33:31PM -0700, Melanie Plageman wrote:
On Tue, Jun 4, 2019 at 12:15 PM Robert Haas wrote:
On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman
wrote:
> One question I have is, how would the OR'd together bitmap be
> propagated to workers after the first chunk? That is, w
On Thu, Jun 06, 2019 at 04:37:19PM -0700, Melanie Plageman wrote:
On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote:
Admittedly I don't have a patch, just a bunch of handwaving. One
reason I haven't attempted to write it is because although I know how
to do the non-parallel version using a B
On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote:
> Admittedly I don't have a patch, just a bunch of handwaving. One
> reason I haven't attempted to write it is because although I know how
> to do the non-parallel version using a BufFile full of match bits in
> sync with the tuples for outer j
On Tue, Jun 4, 2019 at 12:15 PM Robert Haas wrote:
> On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman
> wrote:
> > One question I have is, how would the OR'd together bitmap be
> > propagated to workers after the first chunk? That is, when there are
> > no tuples left in the outer batch, for a gi
On Tue, Jun 4, 2019 at 12:08 PM Robert Haas wrote:
> On Tue, Jun 4, 2019 at 2:47 PM Melanie Plageman
> wrote:
> > Oops! You are totally right.
> > I will amend the idea:
> > For each chunk on the inner side, loop through both the original batch
> > file and the unmatched outer tuples file create
On Tue, Jun 04, 2019 at 03:08:24PM -0400, Robert Haas wrote:
On Tue, Jun 4, 2019 at 2:47 PM Melanie Plageman
wrote:
Oops! You are totally right.
I will amend the idea:
For each chunk on the inner side, loop through both the original batch
file and the unmatched outer tuples file created for the
On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman
wrote:
> One question I have is, how would the OR'd together bitmap be
> propagated to workers after the first chunk? That is, when there are
> no tuples left in the outer batch, for a given inner chunk, would you
> load the bitmaps from each worker
On Tue, Jun 4, 2019 at 6:05 AM Robert Haas wrote:
> On Sun, May 19, 2019 at 7:07 PM Thomas Munro
> wrote:
> > Unfortunately that bits-in-order scheme doesn't work for parallel
> > hash, where the SharedTuplestore tuples seen by each worker are
> > non-deterministic. So perhaps in that case we c
On Tue, Jun 4, 2019 at 2:47 PM Melanie Plageman
wrote:
> Oops! You are totally right.
> I will amend the idea:
> For each chunk on the inner side, loop through both the original batch
> file and the unmatched outer tuples file created for the last chunk.
> Emit any matches and write out any unmatc
On Tue, Jun 4, 2019 at 5:43 AM Robert Haas wrote:
> On Mon, Jun 3, 2019 at 5:10 PM Melanie Plageman
> wrote:
> > I was talking to Jeff Davis about this on Saturday, and, he felt that
> > there might be a way to solve the problem differently if we thought of
> > the left join case as performing a
On Sun, May 19, 2019 at 7:07 PM Thomas Munro wrote:
> Unfortunately that bits-in-order scheme doesn't work for parallel
> hash, where the SharedTuplestore tuples seen by each worker are
> non-deterministic. So perhaps in that case we could use the
> HEAP_TUPLE_HAS_MATCH bit in the outer tuple hea
On Mon, Jun 3, 2019 at 5:10 PM Melanie Plageman
wrote:
> I was talking to Jeff Davis about this on Saturday, and, he felt that
> there might be a way to solve the problem differently if we thought of
> the left join case as performing an inner join and an antijoin
> instead.
>
> Riffing on this id
On Sun, May 19, 2019 at 4:07 PM Thomas Munro wrote:
> On Sat, May 18, 2019 at 12:15 PM Melanie Plageman
> wrote:
> > On Thu, May 16, 2019 at 3:22 PM Thomas Munro
> wrote:
> >> Admittedly I don't have a patch, just a bunch of handwaving. One
> >> reason I haven't attempted to write it is becaus
On Sun, May 19, 2019 at 4:07 PM Thomas Munro wrote:
> On Sat, May 18, 2019 at 12:15 PM Melanie Plageman
> wrote:
> > On Thu, May 16, 2019 at 3:22 PM Thomas Munro
> wrote:
> >> Admittedly I don't have a patch, just a bunch of handwaving. One
> >> reason I haven't attempted to write it is becaus
On Mon, May 20, 2019 at 01:25:52PM +1200, Thomas Munro wrote:
On Mon, May 20, 2019 at 12:22 PM Tomas Vondra
wrote:
On Mon, May 20, 2019 at 11:07:03AM +1200, Thomas Munro wrote:
>First let me restate the PostgreSQL terminology for this stuff so I
>don't get confused while talking about it:
>
>*
Hi,
On 2019-05-20 13:25:52 +1200, Thomas Munro wrote:
> In PostgreSQL, it's always inner = right, outer = left. You can see
> that reflected in plannodes.h and elsewhere:
>
> /*
> * these are defined to avoid confusion problems with "left"
> * and "right" and "inner"
On Mon, May 20, 2019 at 12:22 PM Tomas Vondra
wrote:
> On Mon, May 20, 2019 at 11:07:03AM +1200, Thomas Munro wrote:
> >First let me restate the PostgreSQL terminology for this stuff so I
> >don't get confused while talking about it:
> >
> >* The inner side of the join = the right side = the side
On Mon, May 20, 2019 at 11:07:03AM +1200, Thomas Munro wrote:
On Sat, May 18, 2019 at 12:15 PM Melanie Plageman
wrote:
On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote:
Admittedly I don't have a patch, just a bunch of handwaving. One
reason I haven't attempted to write it is because althou
On Sat, May 18, 2019 at 12:15 PM Melanie Plageman
wrote:
> On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote:
>> Admittedly I don't have a patch, just a bunch of handwaving. One
>> reason I haven't attempted to write it is because although I know how
>> to do the non-parallel version using a Bu
On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote:
> Admittedly I don't have a patch, just a bunch of handwaving. One
> reason I haven't attempted to write it is because although I know how
> to do the non-parallel version using a BufFile full of match bits in
> sync with the tuples for outer j
On Fri, May 17, 2019 at 11:46 AM Tomas Vondra
wrote:
> On Thu, May 16, 2019 at 06:58:43PM -0400, Tom Lane wrote:
> >Thomas Munro writes:
> >> On Fri, May 17, 2019 at 4:39 AM Tomas Vondra
> >> wrote:
> >>> I kinda like the idea with increasing the spaceAllowed value. Essentially,
> >>> if we deci
On Thu, May 16, 2019 at 06:58:43PM -0400, Tom Lane wrote:
Thomas Munro writes:
On Fri, May 17, 2019 at 4:39 AM Tomas Vondra
wrote:
I kinda like the idea with increasing the spaceAllowed value. Essentially,
if we decide adding batches would be pointless, increasing the memory
budget is the onl
Thomas Munro writes:
> On Fri, May 17, 2019 at 4:39 AM Tomas Vondra
> wrote:
>> I kinda like the idea with increasing the spaceAllowed value. Essentially,
>> if we decide adding batches would be pointless, increasing the memory
>> budget is the only thing we can do anyway.
> But that's not OK, w
On Fri, May 17, 2019 at 10:21:56AM +1200, Thomas Munro wrote:
On Fri, May 17, 2019 at 4:39 AM Tomas Vondra
wrote:
I think this is a step in the right direction, but as I said on the other
thread(s), I think we should not disable growth forever and recheck once
in a while. Otherwise we'll end up
On Fri, May 17, 2019 at 4:39 AM Tomas Vondra
wrote:
> I think this is a step in the right direction, but as I said on the other
> thread(s), I think we should not disable growth forever and recheck once
> in a while. Otherwise we'll end up in a sad situation with non-uniform data
> sets, as pointed o
On Thu, May 16, 2019 at 01:22:31PM +1200, Thomas Munro wrote:
...
Here's a quick hack to show that a 95% cut-off fixes those examples.
I don't really know how to choose the number, but I suspect it should
be much closer to 100 than 50. I think this is the easiest of three
fundamental problems t
51 matches