On Thu, Nov 14, 2024 at 9:33 PM Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:
> On 2024-Nov-14, Tom Lane wrote:
>
> > Alvaro Herrera <alvhe...@alvh.no-ip.org> writes:
> > > On 2024-Nov-14, Christoph Berg wrote:
> > >> It didn't actually crash, true.
> >
> > > But timescale did crash:
> >
> > So what is timescale doing differently?
>
> No idea.  Maybe a stacktrace from a core would tell us more; but the
> Jenkins task doesn't seem to have server logs or anything else to give
> us more clues.

Here is one stack trace from my failure. It's a little inconclusive, but
maybe better brains than mine will spot something:

(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>)
    at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ./nptl/pthread_kill.c:89
#3  0x00007ab10c44526e in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/posix/raise.c:26
#4  0x00007ab10c4288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x000056b2b0439b70 in ExceptionalCondition (
    conditionName=conditionName@entry=0x7ab10cad0280
      "whichrel >= 0 && whichrel < mtstate->mt_nrels",
    fileName=fileName@entry=0x7ab10cad0170
      "/home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c",
    lineNumber=lineNumber@entry=1249) at assert.c:66
#6  0x00007ab10caa0c90 in ExecInitUpdateProjection (
    mtstate=mtstate@entry=0x56b2b0bbf8f0,
    resultRelInfo=resultRelInfo@entry=0x56b2b0bbfc80)
    at /home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c:1249
#7  0x00007ab10caa385e in ExecModifyTable (cs_node=cs_node@entry=0x56b2b0bbf2e0,
    pstate=0x56b2b0bbf8f0)
    at /home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c:940
#8  0x00007ab10caa3a83 in hypertable_modify_exec (node=0x56b2b0bbf2e0)
    at /home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c:174
#9  0x000056b2b011bcba in ExecCustomScan (pstate=<optimized out>) at nodeCustom.c:121
#10 0x000056b2b0107147 in ExecProcNodeFirst (node=0x56b2b0bbf2e0) at execProcnode.c:464
#11 0x000056b2b00fed07 in ExecProcNode (node=node@entry=0x56b2b0bbf2e0)
    at ../../../src/include/executor/executor.h:274
#12 0x000056b2b00fed9b in ExecutePlan (estate=estate@entry=0x56b2b0bbf030,
    planstate=0x56b2b0bbf2e0, use_parallel_mode=<optimized out>,
    operation=operation@entry=CMD_UPDATE, sendTuples=sendTuples@entry=false,
    numberTuples=numberTuples@entry=0, direction=ForwardScanDirection,
    dest=0x56b2b0ba6a68, execute_once=true) at execMain.c:1654
#13 0x000056b2b00ffb6e in standard_ExecutorRun (queryDesc=0x56b2b0b96780,
    direction=ForwardScanDirection, count=0, execute_once=execute_once@entry=true)
    at execMain.c:365
#14 0x000056b2b00ffcb2 in ExecutorRun (queryDesc=queryDesc@entry=0x56b2b0b96780,
    direction=direction@entry=ForwardScanDirection, count=count@entry=0,
    execute_once=execute_once@entry=true) at execMain.c:306
#15 0x000056b2b02f108a in ProcessQuery (plan=plan@entry=0x56b2b0ba6958,
    sourceText=<optimized out>, params=0x0, queryEnv=0x0,
    dest=dest@entry=0x56b2b0ba6a68, qc=qc@entry=0x7ffc452f04e0) at pquery.c:160
#16 0x000056b2b02f1bf2 in PortalRunMulti (portal=portal@entry=0x56b2b0af24d0,
    isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false,
    dest=dest@entry=0x56b2b0ba6a68, altdest=altdest@entry=0x56b2b0ba6a68,
    qc=qc@entry=0x7ffc452f04e0) at pquery.c:1277
#17 0x000056b2b02f210f in PortalRun (portal=portal@entry=0x56b2b0af24d0,
    count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
    run_once=run_once@entry=true, dest=dest@entry=0x56b2b0ba6a68,
    altdest=altdest@entry=0x56b2b0ba6a68, qc=0x7ffc452f04e0) at pquery.c:791
#18 0x000056b2b02ee17d in exec_simple_query (
    query_string=query_string@entry=0x56b2b0a5c9e0
      "update ts_continuous_test SET val = 5 where time > 15 and time < 25;")
    at postgres.c:1278
#19 0x000056b2b02f01a7 in PostgresMain (dbname=<optimized out>, username=<optimized out>)
    at postgres.c:4767
#20 0x000056b2b02e959e in BackendMain (startup_data=<optimized out>,
    startup_data_len=<optimized out>) at backend_startup.c:105
#21 0x000056b2b023f955 in postmaster_child_launch (child_type=child_type@entry=B_BACKEND,
    startup_data=startup_data@entry=0x7ffc452f0714 "",
    startup_data_len=startup_data_len@entry=4,
    client_sock=client_sock@entry=0x7ffc452f0750) at launch_backend.c:277
#22 0x000056b2b024411e in BackendStartup (client_sock=client_sock@entry=0x7ffc452f0750)
    at postmaster.c:3593
#23 0x000056b2b02443a1 in ServerLoop () at postmaster.c:1674
#24 0x000056b2b0245a15 in PostmasterMain (argc=argc@entry=8, argv=argv@entry=0x56b2b0a16860)
    at postmaster.c:1372
#25 0x000056b2b0161207 in main (argc=8, argv=0x56b2b0a16860) at main.c:197

> There are three makeNode(ResultRelInfo) calls, but they don't look like
> anything out of the ordinary:
>
> perhan:timescaledb 0$ git grep -C5 makeNode.ResultRelInfo
> src/copy.c-     *
> src/copy.c-     * WARNING. The dummy rangetable index is decremented by 1 (unchecked)
> src/copy.c-     * inside `ExecConstraints` so unless you want to have a overflow, keep it
> src/copy.c-     * above zero. See `rt_fetch` in parsetree.h.
> src/copy.c-     */
> src/copy.c:    resultRelInfo = makeNode(ResultRelInfo);
> src/copy.c-
> src/copy.c-#if PG16_LT
> src/copy.c-    ExecInitRangeTable(estate, pstate->p_rtable);
> src/copy.c-#else
> src/copy.c-    Assert(pstate->p_rteperminfos != NULL);
> --
> src/nodes/chunk_dispatch/chunk_insert_state.c-create_chunk_result_relation_info(const ChunkDispatch *dispatch, Relation rel)
> src/nodes/chunk_dispatch/chunk_insert_state.c-{
> src/nodes/chunk_dispatch/chunk_insert_state.c-  ResultRelInfo *rri;
> src/nodes/chunk_dispatch/chunk_insert_state.c-  ResultRelInfo *rri_orig = dispatch->hypertable_result_rel_info;
> src/nodes/chunk_dispatch/chunk_insert_state.c-  Index hyper_rti = rri_orig->ri_RangeTableIndex;
> src/nodes/chunk_dispatch/chunk_insert_state.c:  rri = makeNode(ResultRelInfo);
> src/nodes/chunk_dispatch/chunk_insert_state.c-
> src/nodes/chunk_dispatch/chunk_insert_state.c-  InitResultRelInfo(rri, rel, hyper_rti, NULL, dispatch->estate->es_instrument);
> src/nodes/chunk_dispatch/chunk_insert_state.c-
> src/nodes/chunk_dispatch/chunk_insert_state.c-  /* Copy options from the main table's (hypertable's) result relation info */
> src/nodes/chunk_dispatch/chunk_insert_state.c-  rri->ri_WithCheckOptions = rri_orig->ri_WithCheckOptions;
> --
> src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
> src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel)
> src/ts_catalog/catalog.c-{
> src/ts_catalog/catalog.c-  ResultRelInfo *resultRelInfo;
> src/ts_catalog/catalog.c-
> src/ts_catalog/catalog.c:  resultRelInfo = makeNode(ResultRelInfo);
> src/ts_catalog/catalog.c-  resultRelInfo->ri_RangeTableIndex = 0; /* dummy */
> src/ts_catalog/catalog.c-  resultRelInfo->ri_RelationDesc = heapRel;
> src/ts_catalog/catalog.c-  resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */
> src/ts_catalog/catalog.c-
> src/ts_catalog/catalog.c-  ExecOpenIndices(resultRelInfo, false);
>
> I didn't catch any other allocations that would depend on the size of
> ResultRelInfo in obvious ways.
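As a side note on why those sites matter at all: makeNode() bakes the
sizeof() of the node type into the extension at compile time, so each of
the three allocations above is only as large as the ResultRelInfo
definition the extension was built against. Abridged from the server's
src/include/nodes/nodes.h (double-check against your branch):

/*
 * newNode() zero-allocates "size" bytes and stamps the node tag;
 * makeNode() passes the *compile-time* sizeof of the requested type.
 */
#define makeNode(_type_)  ((_type_ *) newNode(sizeof(_type_), T_##_type_))

/* Hence, in extension code: */
ResultRelInfo *rri = makeNode(ResultRelInfo);
/*
 * ... allocates sizeof(ResultRelInfo) as of the headers the extension was
 * compiled with.  If the running server was built with a larger
 * ResultRelInfo, server code handed this pointer could touch memory past
 * the end of the allocation.
 */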
>
> Oh, hmm, there's this bit which uses struct assignment into a stack
> element:
>
> tsl/src/compression/compression.c-1559-    if (decompressor->indexstate->ri_NumIndices > 0)
> tsl/src/compression/compression.c-1560-    {
> tsl/src/compression/compression.c:1561:        ResultRelInfo indexstate_copy = *decompressor->indexstate;
> tsl/src/compression/compression.c-1562-        Relation single_index_relation;
>
> I find this rather suspect.

Yes, I agree; I am trying to figure out other ways to deal with this piece
of code. It is not possible to make a copy of the ResultRelInfo node here,
since copyObject() does not support T_ResultRelInfo, and trying to use
CatalogOpenIndexes() instead causes another warning. (A small standalone
sketch of why the stack copy is fragile follows after my signature.)

> --
> Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
> "La espina, desde que nace, ya pincha" (African proverb: "The thorn
> pricks from the moment it is born")

--
Best wishes,
Mats Kindahl, Timescale
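PS: To make the hazard with the stack copy concrete, here is a minimal,
self-contained sketch in plain C (no PostgreSQL headers; the type and
field names are made up for illustration). It mimics an extension that was
compiled against a smaller ResultRelInfo than the server that loads it,
and then performs the kind of fixed-size copy that the struct assignment
at compression.c:1561 amounts to:

#include <stdio.h>
#include <string.h>

/* The node as the *server* was built, with a field appended later. */
typedef struct ServerNode
{
    int  tag;
    int  num_indices;
    long appended_field;    /* unknown to the extension build */
} ServerNode;

/* The same node as the *extension* was built, from older headers. */
typedef struct ExtNode
{
    int tag;
    int num_indices;
} ExtNode;

int
main(void)
{
    ServerNode original = {42, 3, 0xbeef};
    ExtNode    copy;

    /*
     * A struct assignment in the extension compiles to a copy of
     * sizeof(ExtNode) bytes, i.e. effectively this memcpy().  The appended
     * field is silently dropped, and any server code that later treats
     * &copy as a full ServerNode would read or write past the end of the
     * stack variable.
     */
    memcpy(&copy, &original, sizeof(ExtNode));

    printf("server sizeof=%zu, extension sizeof=%zu, copied num_indices=%d\n",
           sizeof(ServerNode), sizeof(ExtNode), copy.num_indices);
    return 0;
}

The copy cannot know about fields added after the extension was built,
which is why keeping a pointer to the server-allocated struct, or avoiding
the copy entirely, seems like the safer direction here.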