On Thu, Nov 15, 2018 at 04:17:51PM -0800, Andres Freund wrote:
> I'm about to commit some changes to 12/master that'd possibly make it
> easier to find issues like this.

Are you referring to this, or to a future commit?

commit 763f2edd92095b1ca2f4476da073a28505c13820
    Rejigger materializing and fetching a HeapTuple from a slot.

I was able to reproduce under HEAD with pg_restored data.

I guess you're right that the "memory alloc failure" is related/same thing;
I've seen it intermittently with queries which also sometimes crash (and
sometimes don't).

Note that when it crashes, it seems to take longer to do so than the query
would normally take.  Like we're walking off the end of an array, say.

I've been able to reproduce the crash with a self join of a table (no view, no
expressions, no parallel, directly querying a relkind='r' child).

In that case, enable_bitmapscan=on and jit_tuple_deforming=on are both needed
to crash, and jit_debugging_support=on does not yield a useful bt.

The table is not too special, but was probably ALTERed to add columns a good
number of times by one of our processes.  It has ~1100 columns, including
arrays, and some with null_frac=1.  I'm trying to come up with a test case
involving column types and order.

(gdb) bt
#0  0x00007f81a08b8b98 in ?? ()
#1  0x0000000000000000 in ?? ()

ts=# SET jit=on;SET jit_above_cost=0;explain(analyze off,verbose off) SELECT a.* FROM child.daily_eric_umts_rnc_utrancell_view_201804 a JOIN child.daily_eric_umts_rnc_utrancell_view_201804 b USING(start_time,sect_id) WHERE a.start_time BETWEEN '2018-04-30' AND '2018-05-04' AND b.start_time BETWEEN '2018-04-30' AND '2018-05-04';
SET
SET
                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=527.36..1038.17 rows=1 width=7760)
   Hash Cond: ((a.start_time = b.start_time) AND (a.sect_id = b.sect_id))
   ->  Bitmap Heap Scan on daily_eric_umts_rnc_utrancell_view_201804 a  (cost=9.78..515.59 rows=133 width=7760)
         Recheck Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
         ->  Bitmap Index Scan on daily_eric_umts_rnc_utrancell_view_201804_unique_idx  (cost=0.00..9.74 rows=133 width=0)
               Index Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
   ->  Hash  (cost=515.59..515.59 rows=133 width=12)
         ->  Bitmap Heap Scan on daily_eric_umts_rnc_utrancell_view_201804 b  (cost=9.78..515.59 rows=133 width=12)
               Recheck Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
               ->  Bitmap Index Scan on daily_eric_umts_rnc_utrancell_view_201804_unique_idx  (cost=0.00..9.74 rows=133 width=0)
                     Index Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
 JIT:
   Functions: 19
   Options: Inlining false, Optimization false, Expressions true, Deforming true

BTW find attached patch which I believe corrects some comments.

Justin
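
PS: for anyone wanting to experiment with a test case along the lines described above, here is a rough sketch of a generator script. It is only a starting point and is not known to trigger the crash. The table name jit_deform_test, the column names c0..cN, the pool of column types, the row count, and the index definition are all assumptions made for illustration; only the GUC settings and the shape of the self join come from the report.

```python
import random

# Hypothetical pool of column types; the mail only says the table includes arrays.
COLUMN_TYPES = ["integer", "bigint", "real", "double precision",
                "text", "integer[]", "real[]"]


def build_repro_sql(table="jit_deform_test", ncols=1100, seed=0):
    """Build a SQL script for a wide, mostly-NULL table and a self join on it."""
    rng = random.Random(seed)  # fixed seed so a given column order can be replayed
    stmts = [
        f"CREATE TABLE {table} "
        "(start_time timestamp without time zone, sect_id integer);"
    ]
    # Add the columns one at a time, as a process ALTERing the table would.
    for i in range(ncols):
        stmts.append(
            f"ALTER TABLE {table} ADD COLUMN c{i} {rng.choice(COLUMN_TYPES)};"
        )
    # Fill only the key columns, so every added column is entirely NULL.
    stmts.append(
        f"INSERT INTO {table} (start_time, sect_id) "
        "SELECT '2018-04-30'::timestamp + (i % 5) * interval '1 day', i "
        "FROM generate_series(1, 1000) i;"
    )
    # Assumed index definition; the real index's columns are not shown in the mail.
    stmts.append(f"CREATE UNIQUE INDEX ON {table} (start_time, sect_id);")
    stmts.append(f"ANALYZE {table};")
    # Settings the report says are needed for the crash.
    stmts += [
        "SET jit = on;",
        "SET jit_above_cost = 0;",
        "SET enable_bitmapscan = on;",
        "SET jit_tuple_deforming = on;",
    ]
    stmts.append(
        f"SELECT a.* FROM {table} a JOIN {table} b USING (start_time, sect_id) "
        "WHERE a.start_time BETWEEN '2018-04-30' AND '2018-05-04' "
        "AND b.start_time BETWEEN '2018-04-30' AND '2018-05-04';"
    )
    return "\n".join(stmts)


if __name__ == "__main__":
    print(build_repro_sql())
```

The output can be piped into psql against a scratch database. Varying seed and ncols changes the column types and their order, which is the dimension the test case is meant to explore; the planner may still need nudging to pick a bitmap scan on such a small table.
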
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 59e38d2..ab0c6d0 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -93,7 +93,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 	funcname = llvm_expand_funcname(context, "deform");
 
 	/*
-	 * Check which columns do have to exist, so we don't have to check the
+	 * Check which columns have to exist, so we don't have to check the
 	 * rows natts unnecessarily.
 	 */
 	for (attnum = 0; attnum < desc->natts; attnum++)
@@ -252,7 +252,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 	}
 
 	/*
-	 * Check if's guaranteed the all the desired attributes are available in
+	 * Check if it's guaranteed that all the desired attributes are available in
 	 * tuple. If so, we can start deforming. If not, need to make sure to
 	 * fetch the missing columns.
 	 */
@@ -337,7 +337,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 
 		/*
 		 * If this is the first attribute, slot->tts_nvalid was 0. Therefore
-		 * reset offset to 0 to, it be from a previous execution.
+		 * also reset offset to 0, it may be from a previous execution.
 		 */
 		if (attnum == 0)
 		{
@@ -554,7 +554,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
 		else if (att->attnotnull && attguaranteedalign && known_alignment >= 0)
 		{
 			/*
-			 * If the offset to the column was previously known a NOT NULL &
+			 * If the offset to the column was previously known, a NOT NULL &
 			 * fixed width column guarantees that alignment is just the
 			 * previous alignment plus column width.
 			 */