Hi,
Err, the subject ought to have been "force parallel mode vs CTAS" or
such.
On 2017-12-21 06:31:06 -0800, Andres Freund wrote:
> Hi Robert, Todd, All,
>
> I think both I and commit e9baa5e9fa147e are confused.
>
> Mantid started to fail with the parallel hash join commit, with the following
> assert failure [1]:
> #2 0x00000000008698e7 in ExceptionalCondition
> (conditionName=conditionName@entry=0x8efee8
> "!(CurrentTransactionState->parallelModeLevel == 0)",
> errorType=errorType@entry=0x8b35c9 "FailedAssertion",
> fileName=fileName@entry=0x8ec2bf "xact.c", lineNumber=lineNumber@entry=691)
> at assert.c:54
> #3 0x000000000050c7fb in GetCurrentCommandId (used=used@entry=1 '\001') at
> xact.c:691
> #4 0x00000000004d3f0c in toast_save_datum (value=value@entry=43175280,
> oldexternal=0x0, options=options@entry=2, rel=0x7f98a770d588,
> rel=0x7f98a770d588) at tuptoaster.c:1477
> #5 0x00000000004d54f9 in toast_insert_or_update
> (rel=rel@entry=0x7f98a770d588, newtup=0x28deaf0, oldtup=oldtup@entry=0x0,
> options=2) at tuptoaster.c:814
> #6 0x00000000004c0bf5 in heap_prepare_insert
> (relation=relation@entry=0x7f98a770d588, tup=tup@entry=0x28deaf0,
> xid=<optimized out>, cid=cid@entry=34, options=options@entry=2) at
> heapam.c:2660
> #7 0x00000000004c6eeb in heap_insert (relation=0x7f98a770d588,
> tup=0x28deaf0, cid=34, options=2, bistate=0x251d220) at heapam.c:2429
> #8 0x00000000005b39a3 in intorel_receive (slot=<optimized out>,
> self=0x2359420) at createas.c:599
> #9 0x000000000061cd58 in ExecutePlan (execute_once=<optimized out>,
> dest=0x2359420, direction=<optimized out>, numberTuples=0, sendTuples=1
> '\001', operation=CMD_SELECT, use_parallel_mode=<optimized out>,
> planstate=0x23effb0, estate=0x23efd98) at execMain.c:1753
> #10 standard_ExecutorRun (queryDesc=0x24f3910, direction=<optimized out>,
> count=0, execute_once=<optimized out>) at execMain.c:361
> #11 0x00000000005b3c91 in ExecCreateTableAs (stmt=stmt@entry=0x2337e58,
> queryString=queryString@entry=0x2336bc8 "create table wide as select
> generate_series(1, 2) as id, rpad('', 320000, 'x') as t;",
> params=params@entry=0x0, queryEnv=queryEnv@entry=0x0,
> completionTag=completionTag@entry=0x7fff66986fc0 "") at createas.c:351
> #12 0x000000000076a6a9 in ProcessUtilitySlow (pstate=pstate@entry=0x2359308,
> pstmt=pstmt@entry=0x23387d0, queryString=queryString@entry=0x2336bc8 "create
> table wide as select generate_series(1, 2) as id, rpad('', 320000, 'x') as
> t;", context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0,
> queryEnv=queryEnv@entry=0x0, completionTag=completionTag@entry=0x7fff66986fc0
> "", dest=0x2338868) at utility.c:1454
> #13 0x000000000076951a in standard_ProcessUtility (pstmt=0x23387d0,
> queryString=0x2336bc8 "create table wide as select generate_series(1, 2) as
> id, rpad('', 320000, 'x') as t;", context=PROCESS_UTILITY_TOPLEVEL,
> params=0x0, queryEnv=0x0, dest=0x2338868, completionTag=0x7fff66986fc0 "") at
> utility.c:932
> #14 0x0000000000766bb8 in PortalRunUtility (portal=0x239aaf8,
> pstmt=0x23387d0, isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>,
> dest=0x2338868, completionTag=0x7fff66986fc0 "") at pquery.c:1178
> #15 0x0000000000767679 in PortalRunMulti (portal=portal@entry=0x239aaf8,
> isTopLevel=isTopLevel@entry=1 '\001', setHoldSnapshot=setHoldSnapshot@entry=0
> '\000', dest=dest@entry=0x2338868, altdest=altdest@entry=0x2338868,
> completionTag=completionTag@entry=0x7fff66986fc0 "") at pquery.c:1331
> #16 0x0000000000768332 in PortalRun (portal=portal@entry=0x239aaf8,
> count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001',
> run_once=run_once@entry=1 '\001', dest=dest@entry=0x2338868,
> altdest=altdest@entry=0x2338868,
> completionTag=completionTag@entry=0x7fff66986fc0 "") at pquery.c:799
> #17 0x00000000007640df in exec_simple_query (query_string=0x2336bc8 "create
> table wide as select generate_series(1, 2) as id, rpad('', 320000, 'x') as
> t;") at postgres.c:1120
>
> The reason this confuses me is that the PHJ commit hasn't changed
> anything relevant here. Testing force_parallel_mode=regress locally also
> shows that the failure is present before the commit.
>
> Why mantid's earlier builds [2] don't show the problem, even though I can
> reproduce the issue locally, escapes me. Todd, did you recently change
> anything on mantid?
>
> By my reading this is the fault of e9baa5e9fa147e [3]. Robert, Haribabu,
> any idea?
>
> Greetings,
>
> Andres Freund
>
> [1]
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mantid&dt=2017-12-21%2011%3A07%3A06
> [2] https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=mantid&br=HEAD
> [3]
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e9baa5e9fa147e00a2466ab2c40eb99c8a700824
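
For anyone who wants to try this, the local test described above can be
sketched as the psql session below. The CREATE TABLE statement is copied
from the backtrace; the assert-enabled build, the fresh session, and the
DROP TABLE cleanup are assumptions on my part rather than something the
report spells out.

```sql
-- Assumes a server built with --enable-cassert; without assertions the
-- check in GetCurrentCommandId() is compiled out and nothing fails here.
SET force_parallel_mode = regress;

-- Statement taken verbatim from the backtrace. Per the trace, inserting
-- these wide values goes through toast_save_datum(), which asks for a
-- "used" command id while parallel mode is still active.
create table wide as select generate_series(1, 2) as id, rpad('', 320000, 'x') as t;

-- Cleanup, only relevant on a build where the statement succeeds.
DROP TABLE IF EXISTS wide;
```

On an affected build the CREATE TABLE should trip the assertion shown in
frame #2, so the backend aborts before the cleanup step is reached.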