2015-08-19 20:12 GMT+09:00 Simon Riggs <si...@2ndquadrant.com>:
> On 12 June 2015 at 00:29, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
>
>> I see two ways to fix this:
>>
>> (1) enforce the 1GB limit (probably better for back-patching, if
>> that's necessary)
>>
>> (2) make it work with hash tables over 1GB
>>
>> I'm in favor of (2) if there's a good way to do that. It seems a bit
>> stupid not to be able to use a fast hash table because of some
>> artificial limit. Are there any fundamental reasons not to use the
>> MemoryContextAllocHuge fix proposed by KaiGai-san?
>
> If there are no objections, I will apply the patch for (2) to HEAD and
> backpatch to 9.5.

Please don't rush. :-)

It is not difficult to replace palloc() with MemoryContextAllocHuge().
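As a rough sketch (simplified and hypothetical, not the actual patch), the mechanical change in ExecHashTableCreate() of src/backend/executor/nodeHash.c would look something like this:

    /*
     * Before: palloc() rejects any request larger than MaxAllocSize
     * (1GB - 1 byte), so a bucket array over 1GB fails outright.
     */
    hashtable->buckets = (HashJoinTuple *)
        palloc(nbuckets * sizeof(HashJoinTuple));

    /*
     * After: the "huge" variant accepts requests up to MaxAllocHugeSize,
     * so the bucket array is no longer capped at 1GB.  The allocation
     * still lives in the hashtable's batch context.
     */
    hashtable->buckets = (HashJoinTuple *)
        MemoryContextAllocHuge(hashtable->batchCxt,
                               nbuckets * sizeof(HashJoinTuple));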
However, it may lead to another problem once the planner gives us a crazy estimation. Below is my comment on the other thread.

==========

Also, we need to pay attention to the reliability of the planner's scale estimation. Even though the plan below says that the join generates 60521928028 rows, it actually generates 776157676 rows (about 1.3% of the estimate).

tpcds100=# EXPLAIN ANALYZE
  select ws1.ws_order_number, ws1.ws_warehouse_sk wh1, ws2.ws_warehouse_sk wh2
    from web_sales ws1, web_sales ws2
   where ws1.ws_order_number = ws2.ws_order_number
     and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk;
                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=25374644.08..1160509591.61 rows=60521928028 width=24) (actual time=138347.979..491889.343 rows=776157676 loops=1)
   Merge Cond: (ws1.ws_order_number = ws2.ws_order_number)
   Join Filter: (ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
   Rows Removed by Join Filter: 127853313
   ->  Sort  (cost=12687322.04..12867325.16 rows=72001248 width=16) (actual time=73252.300..79017.420 rows=72001237 loops=1)
         Sort Key: ws1.ws_order_number
         Sort Method: quicksort  Memory: 7083296kB
         ->  Seq Scan on web_sales ws1  (cost=0.00..3290612.48 rows=72001248 width=16) (actual time=0.023..39951.201 rows=72001237 loops=1)
   ->  Sort  (cost=12687322.04..12867325.16 rows=72001248 width=16) (actual time=65095.655..128885.811 rows=904010978 loops=1)
         Sort Key: ws2.ws_order_number
         Sort Method: quicksort  Memory: 7083296kB
         ->  Seq Scan on web_sales ws2  (cost=0.00..3290612.48 rows=72001248 width=16) (actual time=0.014..31046.888 rows=72001237 loops=1)
 Planning time: 0.232 ms
 Execution time: 530176.521 ms
(14 rows)

So, even if we allow nodeHash.c to allocate hash buckets larger than 1GB, the initial size should still be determined carefully. Probably, 1GB is a good starting point even if the table is expanded later.
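To make that concrete, here is one possible shape for such a guard (again a hypothetical sketch under my assumptions, not a worked-out patch): clamp the initial bucket array to MaxAllocSize regardless of the planner's estimate, and let only the later expansion path use the huge allocator once the actual tuple count proves the estimate wrong.

    /*
     * Sketch: cap the initial bucket array at 1GB even when the
     * planner's row estimate asks for more.  (The real code also
     * keeps nbuckets a power of two; that detail is omitted here.)
     */
    Size    max_pointers = MaxAllocSize / sizeof(HashJoinTuple);

    if ((Size) nbuckets > max_pointers)
        nbuckets = (int) max_pointers;

    /*
     * Later, if the actual number of tuples justifies it, the
     * bucket-expansion path can grow past 1GB with repalloc_huge()
     * instead of failing at the MaxAllocSize boundary.
     */
    hashtable->buckets = (HashJoinTuple *)
        repalloc_huge(hashtable->buckets,
                      hashtable->nbuckets * sizeof(HashJoinTuple));

--
KaiGai Kohei <kai...@kaigai.gr.jp>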