Hello Jeff,

Sorry for the delay, here are the EXPLAIN ANALYSE results for one single
row in the inner-query:

Nested Loop Semi Join  (cost=10000000993.81..10004731160.70 rows=536206
> width=28) (actual time=93765.182..93765.183 rows=0 loops=1)
>   Output: fu.w2_page_idxs
>   Join Filter: (fu.w2_page_idxs && (ARRAY[fact_pages.idx]))
>   Rows Removed by Join Filter: 53762825
>   Buffers: shared hit=569194 read=2821768
>   I/O Timings: read=56586.955
>   ->  Seq Scan on public.fact_users fu
>  (cost=10000000000.00..10003925857.68 rows=53620568 width=28) (actual
> time=79.139..67423.779 rows=53762825 loops=1)
>         Output: fu.w2_page_idxs
>         Buffers: shared hit=567884 read=2821768
>         I/O Timings: read=56586.955
>   ->  Materialize  (cost=993.81..994.50 rows=1 width=32) (actual
> time=0.000..0.000 rows=1 loops=53762825)
>         Output: (ARRAY[fact_pages.idx])
>         Buffers: shared hit=148
>         ->  Limit  (cost=993.81..994.48 rows=1 width=32) (actual
> time=26.382..26.383 rows=1 loops=1)
>               Output: (ARRAY[fact_pages.idx])
>               Buffers: shared hit=148
>               ->  Bitmap Heap Scan on public.fact_pages
>  (cost=993.81..70645.00 rows=103556 width=32) (actual time=26.378..26.379
> rows=1 loops=1)
>                     Output: ARRAY[fact_pages.idx]
>                     Recheck Cond: (fact_pages.attribute_idxs &&
> '{300000160}'::integer[])
>                     Heap Blocks: exact=1
>                     Buffers: shared hit=148
>                     ->  Bitmap Index Scan on fact_pages_attribute_idxs_int
>  (cost=0.00..967.92 rows=103556 width=0) (actual time=14.865..14.865
> rows=101462 loops=1)
>                           Index Cond: (fact_pages.attribute_idxs &&
> '{300000160}'::integer[])
>                           Buffers: shared hit=147
> Query Identifier: 6779965332684941204
> Planning:
>   Buffers: shared hit=2
> Planning Time: 0.162 ms
> JIT:
>   Functions: 10
>   Options: Inlining true, Optimization true, Expressions true, Deforming
> true
>   Timing: Generation 1.507 ms, Inlining 9.797 ms, Optimization 54.902 ms,
> Emission 14.314 ms, Total 80.521 ms
> Execution Time: 93766.772 ms


Query:

EXPLAIN (
>   ANALYZE,
>   VERBOSE,
>   COSTS,
>   BUFFERS,
>   TIMING
> )
> SELECT
>   fu.w2_page_idxs
> FROM
>   fact_users
>     AS fu
> WHERE
>   EXISTS (
>     SELECT
>     FROM
>       (
>         SELECT
>           ARRAY[idx] AS page_idx
>         FROM
>           fact_pages
>         WHERE
>           attribute_idxs && ARRAY[300000160]
>         FETCH FIRST 1 ROWS ONLY
>       )
>         AS fp
>     WHERE
>       fu.w2_page_idxs && fp.page_idx
>   )
> ;


Without any surprises, the planner is using a sequential scan on the
"fact_users" table which is very large instead of using the GIN index set
on the "w2_page_idxs" column.

Link to the query plan visualiser: https://explain.dalibo.com/plan/1vC

Thank you very much in advance,

Mickael

On Wed, Apr 27, 2022 at 4:54 PM Mickael van der Beek <
mickael.van.der.b...@gmail.com> wrote:

> Hello Jeff,
>
> I have waited a few hours without the query ever finishing which is the
> reason I said "never finishes".
> Especially because the INNER JOIN version finishes within a few minutes
> while being combinatorial and less efficient.
> The query probably only does sequential scans.
>
> You will find the query plan using EXPLAIN here:
> - Visual query plan: https://explain.dalibo.com/plan#plan
> - Raw query plan: https://explain.dalibo.com/plan#raw
>
> Thanks for your help,
>
> Mickael
>
> On Wed, Apr 27, 2022 at 4:28 PM Jeff Janes <jeff.ja...@gmail.com> wrote:
>
>> On Wed, Apr 27, 2022 at 8:19 AM Mickael van der Beek <
>> mickael.van.der.b...@gmail.com> wrote:
>>
>>>
>>> The last query does not finish after waiting for more than 15 minutes.
>>> (The temporary view creation is very fast and required due to the same
>>> query in a CTE greatly reducing performance (by more than 5 min.) due to
>>> the optimisation barrier I'm guessing.)
>>>
>>
>> How much over 15 minutes?  20 minutes doesn't seem that long to wait to
>> get a likely definitive answer.  But at the least show us the EXPLAIN
>> without ANALYZE of it, that should take no milliseconds.
>>
>> And what does it mean for something to take 5 minutes longer than "never
>> finishes"?
>>
>> (Also, putting every or every other token on a separate line does not
>> make it easier to read)
>>
>> Cheer,
>>
>> Jeff
>>
>>>
>
> --
> Mickael van der BeekWeb developer & Security analyst
>
> mickael.van.der.b...@gmail.com
>


-- 
Mickael van der BeekWeb developer & Security analyst

mickael.van.der.b...@gmail.com

Reply via email to