Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

Amit Kapila Sun, 14 Jan 2018 20:26:32 -0800

On Sun, Jan 14, 2018 at 1:43 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Sat, Jan 13, 2018 at 4:32 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> Yeah, but this would mean that now with parallel create index, it is
>> possible that some tuples from the transaction would end up in index
>> and others won't.
>
> You mean some tuples from some past transaction that deleted a bunch
> of tuples and committed, but not before someone acquired a still-held
> snapshot that didn't see the deleter's transaction as committed yet?
>


I think I am talking about something different.  Let me try to explain
in some more detail.  Consider a transaction T-1 that has deleted two
tuples from tab-1, the first on page-1 and the second on page-2, and
then committed.  There is a concurrent transaction T-2 which has an
open snapshot/query, due to which oldestXmin will be smaller than T-1.
Now, in another session, we start a parallel Create Index on tab-1,
which launches one worker.  The worker scans page-1 and finds that the
deleted tuple on page-1 is Recently Dead, so it includes it in the
index.  In the meantime, transaction T-2 commits/aborts, which allows
oldestXmin to advance past T-1.  Now the leader scans page-2 with a
freshly computed oldestXmin, finds that the tuple on that page is
Dead, and decides not to include it in the index.  So, this leads to a
situation where some tuples deleted by the transaction end up in the
index whereas others don't.  Note that I am not arguing that there is
any fundamental problem with this, but just want to highlight that
such a case doesn't seem to exist with non-parallel Create Index.
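
The scenario above can be sketched as a toy model.  This is not
PostgreSQL code: classify() and build_index() are hypothetical
stand-ins for the real visibility check and heap scan, the XID values
are made up, and XID wraparound is ignored.  It also assumes that a
non-parallel build uses a single oldestXmin for the whole scan.

```python
from dataclasses import dataclass

# Made-up XIDs for illustration only.
T1_XID = 101            # the committed deleter, T-1
XMIN_WHILE_T2_OPEN = 100  # oldestXmin held back by T-2's snapshot
XMIN_AFTER_T2_ENDS = 102  # oldestXmin once T-2 has committed/aborted


@dataclass(frozen=True)
class DeletedTuple:
    page: int
    xmax: int  # XID of the transaction that deleted the tuple


def classify(tup, oldest_xmin):
    """Simplified stand-in for the visibility check on a committed delete."""
    # The tuple is DEAD only once the deleter precedes oldestXmin,
    # i.e. no snapshot can still see the tuple.
    return "DEAD" if tup.xmax < oldest_xmin else "RECENTLY_DEAD"


def build_index(tuples, oldest_xmin_for_page):
    """Return the pages whose deleted tuple is included in the index."""
    indexed = []
    for tup in tuples:
        state = classify(tup, oldest_xmin_for_page[tup.page])
        if state == "RECENTLY_DEAD":  # DEAD tuples are skipped
            indexed.append(tup.page)
    return indexed


tuples = [DeletedTuple(page=1, xmax=T1_XID), DeletedTuple(page=2, xmax=T1_XID)]

# One oldestXmin for the whole scan: both tuples are treated alike.
serial = build_index(tuples, {1: XMIN_WHILE_T2_OPEN, 2: XMIN_WHILE_T2_OPEN})

# Worker scans page-1 with the old value; leader scans page-2 with a
# freshly computed one after T-2 has ended.
parallel = build_index(tuples, {1: XMIN_WHILE_T2_OPEN, 2: XMIN_AFTER_T2_ENDS})

print(serial)    # [1, 2]
print(parallel)  # [1]
```

With one oldestXmin, both of T-1's deleted tuples get the same
treatment; with two different values, only the tuple on page-1 is
indexed.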

>
>> The patch uses both parallel_leader_participation and
>> force_parallel_mode, but it seems the definition is different from
>> what we have in Gather.  Basically, even with force_parallel_mode, the
>> leader is participating in parallel build. I see there is some
>> discussion above about both these parameters and still, there is not
>> complete agreement on the best way forward.  I think we should have
>> parallel_leader_participation as that can help in testing if nothing
>> else.
>
> I think that you're quite right that parallel_leader_participation
> needs to be supported for testing purposes. I had some sympathy for
> the idea that we should remove leader participation as a worker from
> the patch entirely, but the testing argument seems to clinch it. I'm
> fine with killing force_parallel_mode, though, because it will be
> possible to force the use of parallelism by using the existing
> parallel_workers table storage param in the next version of the patch,
> regardless of how small the table is.
>

Okay, this makes sense to me.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
