2017-04-05 10:13 GMT+02:00 Daniel Westermann < daniel.westerm...@dbi-services.com>:
> 2017-04-05 9:28 GMT+02:00 Daniel Westermann <daniel.westermann@dbi- > services.com>: > >> >>what is result of EXPLAIN statement for slow and fast cases? >> >> >> >>regards >> >> >> >>Pavel >> >> For work_mem=32MB >> >> explain (analyze,verbose,buffers) select count(user_id) from users where >> user_id not in ( select id from ids); >> QUERY >> PLAN >> ------------------------------------------------------------ >> ------------------------------------------------------------ >> -------------- >> Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual >> time=7945.592..7945.593 rows=1 loops=1) >> Output: count(users.user_id) >> Buffers: shared read=29425 >> -> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 >> width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1) >> Output: users.user_id, users.username >> Filter: (NOT (hashed SubPlan 1)) >> Rows Removed by Filter: 500001 >> Buffers: shared read=29425 >> SubPlan 1 >> -> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 >> width=4) (actual time=11.111..868.382 rows=1000001 loops=1) >> Output: ids.id >> Buffers: shared read=4425 >> Planning time: 187.396 ms >> Execution time: 7948.108 ms >> (14 rows) >> >> Time: 8244.493 ms >> >> For work_mem='16MB' it does not complete with analyze in on hour. For >> explain only: >> >> explain (verbose) select count(user_id) from users where user_id not in >> ( select id from ids); >> QUERY >> PLAN >> ------------------------------------------------------------ >> ---------------------------- >> Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8) >> Output: count(users.user_id) >> -> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 >> width=4) >> Output: users.user_id, users.username >> Filter: (NOT (SubPlan 1)) >> SubPlan 1 >> -> Materialize (cost=0.00..23332.01 rows=1000001 width=4) >> Output: ids.id >> -> Seq Scan on public.ids (cost=0.00..14425.01 >> rows=1000001 width=4) >> Output: ids.id >> (10 rows) >> > > >There is a materialize op more > > > >do you have a index on ids.id? > > Yes: > > \d ids > Table "public.ids" > Column | Type | Modifiers > --------+---------+----------- > id | integer | > Indexes: > "i_ids" UNIQUE, btree (id) > > hmm .. NOT IN is just bad :( The second is slow becase table ids is stored in temp file. and it is repeatedly read from file. In first case, ids table is stored in memory. SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from ids where id = user_id) Regards Pavel