On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <rushabh.lat...@gmail.com> > wrote: > > Thanks Ashutosh. > > > > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch > > looks good to me. > > This patch needs a rebase. > Done. > > It's not going to work to say this is a patch proposed for commit when > it's still got a TODO comment in it that obviously needs to be > changed. And the formatting of that long comment is pretty weird, > too, and not consistent with other functions in that same file (e.g. > get_remote_estimate, ec_member_matches_foreign, create_cursor). > > The TODO was present in v4 but not in v5 and is not present in v6 attached here.. Formatted comment according estimate_path_cost_size(), convert_prep_stmt_params(). > Aside from that, I think before we commit this, somebody should do > some testing that demonstrates that this is actually a good idea. Not > as part of the test case set for this patch, but just in general. > Merge joins are typically going to be relevant for large tables, but > the examples in the regression tests are necessarily tiny. I'd like > to see some sample data and some sample queries that get appreciably > faster with this code. If we can't find any, we don't need the code. > > I tested the patch on my laptop with two types of queries, a join between two foreign tables on different foreign servers (pointing to the same self server) and a join between one foreign and one local table. The foreign tables and servers are created using sort_pd_setup.sql attached. Foreign tables pointed to table with index useful for join clause. Both the joining tables had 10M rows. The execution time of query was measured for 100 runs and average and standard deviation were calculated (using function query_execution_stats() in script sort_pd.sql) and are presented below. 1. Query between foreign tables SELECT ft1.val, ft2.val FROM ft1 join ft2 on (ft1.val = ft2.val) Plan and timings without patch EXPLAIN (VERBOSE, ANALYSE) :query ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=508510.02..1129945.94 rows=999995 width=8) (actual time=33803.826..82416.342 rows=10000000 loops=1) Output: ft1.val, ft2.val Hash Cond: (ft1.val = ft2.val) -> Foreign Scan on public.ft1 (cost=100.00..344347.31 rows=9999977 width=4) (actual time=0.624..28531.803 rows=10000000 loops=1) Output: ft1.val Remote SQL: SELECT val FROM public.lt -> Hash (cost=344347.31..344347.31 rows=9999977 width=4) (actual time=33258.025..33258.025 rows=10000000 loops=1) Output: ft2.val Buckets: 131072 Batches: 256 Memory Usage: 2400kB -> Foreign Scan on public.ft2 (cost=100.00..344347.31 rows=9999977 width=4) (actual time=22.171..28134.970 rows=10000000 loops=1) Output: ft2.val Remote SQL: SELECT val FROM public.lt Planning time: 33.155 ms Execution time: 82914.607 ms (14 rows) avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time --------------+------------------+--------------+-------------- 78750.95487 | 2911.51825687913 | 74314.886 | 89358.464 Plan and timing with patch EXPLAIN (VERBOSE, ANALYSE) :query ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=200.86..1183070.86 rows=10000000 width=8) (actual time=1.776..73140.219 rows=10000000 loops=1) Output: ft1.val, ft2.val Merge Cond: (ft1.val = ft2.val) -> Foreign Scan on public.ft1 (cost=100.43..504035.43 rows=10000000 width=4) (actual time=0.937..30422.457 rows=10000000 loops=1) Output: ft1.val, ft1.val2 Remote SQL: SELECT val FROM public.lt ORDER BY val ASC -> Materialize (cost=100.43..529035.43 rows=10000000 width=4) (actual time=0.826..33448.822 rows=10000000 loops=1) Output: ft2.val, ft2.val2 -> Foreign Scan on public.ft2 (cost=100.43..504035.43 rows=10000000 width=4) (actual time=0.818..31035.362 rows=10000000 loops=1) Output: ft2.val, ft2.val2 Remote SQL: SELECT val FROM public.lt ORDER BY val ASC Planning time: 163.161 ms Execution time: 73654.106 ms (13 rows) avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time --------------+------------------+--------------+-------------- 71881.15916 | 819.091605498189 | 70197.312 | 74653.314 It can be observed that the with the patch, merge join strategy is used instead of hash join and the execution time reduces by approx 9%. A desired effect is that the deviation in the execution time has reduced heavily (almost by 75%). 2. Join between local and foreign table Without patch the plan and timings are EXPLAIN (VERBOSE, ANALYSE) :query ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=308410.66..1019846.69 rows=9999970 width=8) (actual time=7674.681..47767.136 rows=10000000 loops=1) Output: lt.val, ft1.val Hash Cond: (ft1.val = lt.val) -> Foreign Scan on public.ft1 (cost=100.00..344347.55 rows=9999985 width=4) (actual time=0.506..26679.980 rows=10000000 loops=1) Output: ft1.val Remote SQL: SELECT val FROM public.lt -> Hash (cost=144247.85..144247.85 rows=9999985 width=4) (actual time=7667.598..7667.598 rows=10000000 loops=1) Output: lt.val Buckets: 131072 Batches: 256 Memory Usage: 2400kB -> Seq Scan on public.lt (cost=0.00..144247.85 rows=9999985 width=4) (actual time=0.018..2959.111 rows=10000000 loops=1) Output: lt.val Planning time: 8.668 ms Execution time: 48209.365 ms (13 rows) SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time FROM query_execution_stats(:'query', :num_samples); avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time --------------+------------------+--------------+-------------- 47246.46956 | 2579.42041949119 | 43603.411 | 56096.759 With the patch the plan and timings are EXPLAIN (VERBOSE, ANALYSE) :query ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=155.01..957924.85 rows=9999970 width=8) (actual time=0.592..45125.356 rows=10000000 loops=1) Output: lt.val, ft1.val Merge Cond: (ft1.val = lt.val) -> Foreign Scan on public.ft1 (cost=100.43..504038.91 rows=9999985 width=4) (actual time=0.551..30526.048 rows=10000000 loops=1) Output: ft1.val, ft1.val2 Remote SQL: SELECT val FROM public.lt ORDER BY val ASC -> Index Only Scan using i_lt_val on public.lt (cost=0.43..303939.21 rows=9999985 width=4) (actual time=0.032..6192.406 rows=10000000 loops=1) Output: lt.val Heap Fetches: 10000000 Planning time: 9.043 ms Execution time: 45666.023 ms (11 rows) avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time --------------+------------------+--------------+-------------- 42803.36105 | 166.874491432755 | 42321.314 | 43316.902 Again observe that with the patch, merge join is used instead of hash join and timing reduces by approx 9%. Again the deviation in execution reduces heavily (almost by 75%). There is increase in planning time with the patch owing to firing EXPLAIN on the foreign server. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
-- \set num_rows (1000*1000*10) -- patched server uses merge join from this \set num_rows (1000*1000*10) -- Create local tables (to be pointed by foreign tables) DROP TABLE lt CASCADE; CREATE TABLE lt(val int, val2 int); INSERT INTO lt SELECT i, i FROM (SELECT generate_series(1, :num_rows)) s(i); CREATE INDEX i_lt_val ON lt(val); DROP EXTENSION postgres_fdw CASCADE; create extension postgres_fdw; CREATE SERVER pg1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'postgres'); CREATE SERVER pg2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'postgres'); CREATE USER MAPPING FOR CURRENT_USER SERVER pg1; CREATE USER MAPPING FOR CURRENT_USER SERVER pg2; CREATE FOREIGN TABLE ft1 () INHERITS (lpt) SERVER pg1 OPTIONS (table_name 'lt', use_remote_estimate 'true'); CREATE FOREIGN TABLE ft2 () INHERITS (lpt) SERVER pg2 OPTIONS (table_name 'lt', use_remote_estimate 'true'); DROP FUNCTION query_execution_stats(query text, num_samples int, OUT avg_exe_time float, OUT exec_time_dev float, OUT min_exe_time float, OUT max_exe_time float); CREATE FUNCTION query_execution_stats(query text, num_samples int, OUT avg_exe_time float, OUT std_dev_exe_time float, OUT min_exe_time float, OUT max_exe_time float) RETURNS record LANGUAGE plpgsql AS $$ DECLARE plan json; BEGIN CREATE TEMPORARY TABLE query_exe_times(exe_time float); -- Execute query a few times (5% of user specified runs) to warm the cache FOR i IN 1 .. num_samples/20 LOOP EXECUTE query; END LOOP; FOR i IN 1 .. num_samples LOOP EXECUTE 'EXPLAIN (analyze, format json) ' || query INTO plan; INSERT INTO query_exe_times VALUES ((plan->0->'Execution Time')::text::float); RAISE NOTICE 'completed % samples', i; END LOOP; SELECT avg(exe_time), stddev(exe_time), min(exe_time), max(exe_time) INTO avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time FROM query_exe_times; DROP TABLE query_exe_times; END; $$; ANALYZE ft1; ANALYZE lt;
pg_sort_all_pd_v6.patch
Description: binary/octet-stream
\set num_samples 100 \set query 'SELECT lt.val, ft1.val FROM lt join ft1 on (lt.val = ft1.val)' EXPLAIN (VERBOSE, ANALYSE) :query ; SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time FROM query_execution_stats(:'query', :num_samples);
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers