(2011/12/15 11:30), Etsuro Fujita wrote: > (2011/12/14 15:34), Shigeru Hanada wrote: >> I think this patch could be marked as "Ready for committer" with some >> minor fixes. Please find attached a revised patch (v6.1).
I've tried to make pgsql_fdw work with this feature, and found that few static functions to be needed to exported to implement ANALYZE handler in short-cut style. The "Short-cut style" means the way to generate statistics (pg_class and pg_statistic) for foreign tables without retrieving sample data from foreign server. Attached patch (export_funcs.patch) exports examine_attribute and update_attstats which are necessary to implement ANALYZE handler for pgsql_fdw. In addition to exporting, update_attstats is also renamed to vac_update_attstats to fit with already exported function vac_update_relstats. I also attached archive of WIP pgsql_fdw with ANALYZE support. This version has better estimation than original pgsql_fdw, because it can use selectivity of qualifiers evaluated on local side to estimate number of result rows. To show the effect of ANALYZE clearly, WHERE push-down feature is disabled. Please see pgsqlAnalyzeForeignTable and store_remote_stats in pgsql_fdw.c. I used pgbench_accounts tables with 30000 records, and got reasonable rows estimation for queries below. <on remote side> postgres=# UPDATE pgbench_accounts SET filler = NULL postgres-# WHERE aid % 3 = 0; postgres=# ANALYZE; <on local side> postgres=# ANALYZE pgbench_accounts; -- needs explicit table name postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE filler IS NULL; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Foreign Scan on pgbench_accounts (cost=100.00..40610.00 rows=100030 width=97) Filter: (filler IS NULL) Remote SQL: DECLARE pgsql_fdw_cursor_13 SCROLL CURSOR FOR SELECT aid, bid, abalance, filler FROM public.pgbench_accounts (3 rows) postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE aid < 100; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Foreign Scan on pgbench_accounts (cost=100.00..40610.00 rows=96 width=97) Filter: (aid < 100) Remote SQL: DECLARE pgsql_fdw_cursor_14 SCROLL CURSOR FOR SELECT aid, bid, abalance, filler FROM public.pgbench_accounts (3 rows) postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE aid < 1000; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Foreign Scan on pgbench_accounts (cost=100.00..40610.00 rows=1004 width=97) Filter: (aid < 1000) Remote SQL: DECLARE pgsql_fdw_cursor_15 SCROLL CURSOR FOR SELECT aid, bid, abalance, filler FROM public.pgbench_accounts (3 rows) In implementing ANALYZE handler, hardest part was copying anyarray values from remote to local. If we can make it common in core, it would help FDW authors who want to implement ANALYZE handler without retrieving sample rows from remote server. Regards, -- Shigeru Hanada
commit bb28cb5a69aae3bd9c7fbebc8b9483d23711bec4 Author: Shigeru Hanada <shigeru.han...@gmail.com> Date: Thu Feb 9 16:06:14 2012 +0900 Export functions which are useful for FDW analyze support. Export examine_attribute and update_attstats (with renaming to vac_update_attstats) which are useful (and nealy required) to implement short-cut version of ANALYZE handler in FDWs. diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index 6a22d49..d0a323a 100644 --- a/src/backend/commands/analyze.c +++ b/src/backend/commands/analyze.c @@ -94,8 +94,6 @@ static void compute_index_stats(Relation onerel, double totalrows, AnlIndexData *indexdata, int nindexes, HeapTuple *rows, int numrows, MemoryContext col_context); -static VacAttrStats *examine_attribute(Relation onerel, int attnum, - Node *index_expr); static int acquire_sample_rows(Relation onerel, HeapTuple *rows, int targrows, double *totalrows, double *totaldeadrows, @@ -105,8 +103,6 @@ static int acquire_inherited_sample_rows(Relation onerel, double *totalrows, double *totaldeadrows, BlockNumber *totalpages, int elevel); static int compare_rows(const void *a, const void *b); -static void update_attstats(Oid relid, bool inh, - int natts, VacAttrStats **vacattrstats); static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull); static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull); @@ -215,9 +211,9 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy) } /* - * We can ANALYZE any table except pg_statistic. See update_attstats. In - * addition, we can ANALYZE foreign tables if AnalyzeForeignTable callback - * routines of underlying foreign-data wrappers are implemented. + * We can ANALYZE any table except pg_statistic. See vac_update_attstats. + * In addition, we can ANALYZE foreign tables if AnalyzeForeignTable + * callback routines of underlying foreign-data wrappers are implemented. */ if (RelationGetRelid(onerel) == StatisticRelationId) { @@ -283,7 +279,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, BufferAccessStrategy bstrategy) * Close source relation now, but keep lock so that no one deletes it * before we commit. (If someone did, they'd fail to clean up the entries * we made in pg_statistic. Also, releasing the lock before commit would - * expose us to concurrent-update failures in update_attstats.) + * expose us to concurrent-update failures in vac_update_attstats.) */ relation_close(onerel, NoLock); @@ -551,15 +547,15 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, int elevel, * previous statistics for the target columns. (If there are stats in * pg_statistic for columns we didn't process, we leave them alone.) */ - update_attstats(RelationGetRelid(onerel), inh, - attr_cnt, vacattrstats); + vac_update_attstats(RelationGetRelid(onerel), inh, + attr_cnt, vacattrstats); for (ind = 0; ind < nindexes; ind++) { AnlIndexData *thisdata = &indexdata[ind]; - update_attstats(RelationGetRelid(Irel[ind]), false, - thisdata->attr_cnt, thisdata->vacattrstats); + vac_update_attstats(RelationGetRelid(Irel[ind]), false, + thisdata->attr_cnt, thisdata->vacattrstats); } } @@ -842,7 +838,7 @@ compute_index_stats(Relation onerel, double totalrows, * If index_expr isn't NULL, then we're trying to analyze an expression index, * and index_expr is the expression tree representing the column's data. */ -static VacAttrStats * +VacAttrStats * examine_attribute(Relation onerel, int attnum, Node *index_expr) { Form_pg_attribute attr = onerel->rd_att->attrs[attnum - 1]; @@ -1583,7 +1579,7 @@ acquire_inherited_sample_rows(Relation onerel, HeapTuple *rows, int targrows, /* - * update_attstats() -- update attribute statistics for one relation + * vac_update_attstats() -- update attribute statistics for one relation * * Statistics are stored in several places: the pg_class row for the * relation has stats about the whole relation, and there is a @@ -1604,8 +1600,8 @@ acquire_inherited_sample_rows(Relation onerel, HeapTuple *rows, int targrows, * ANALYZE the same table concurrently. Presently, we lock that out * by taking a self-exclusive lock on the relation in analyze_rel(). */ -static void -update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats) +void +vac_update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats) { Relation sd; int attno; diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 1530970..b165953 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -152,6 +152,10 @@ extern void vac_update_relstats(Relation relation, BlockNumber num_all_visible_pages, bool hasindex, TransactionId frozenxid); +extern void vac_update_attstats(Oid relid, + bool inh, + int natts, + VacAttrStats **vacattrstats); extern void vacuum_set_xid_limits(int freeze_min_age, int freeze_table_age, bool sharedRel, TransactionId *oldestXmin, @@ -177,6 +181,8 @@ extern void do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, int elevel, extern double random_fract(void); extern double init_selection_state(int n); extern double get_next_S(double t, int n, double *stateptr); +extern VacAttrStats *examine_attribute(Relation onerel, int attnum, + Node *index_expr); #endif /* VACUUM_H */
pgsql_fdw.tar.gz
Description: application/gzip
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers