On Tue, Mar 13, 2012 at 09:15:52PM -0400, Bruce Momjian wrote: > On Tue, Mar 13, 2012 at 08:22:51PM -0400, Bruce Momjian wrote: > > On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote: > > > Bruce Momjian <br...@momjian.us> wrote: > > > > > > > What is the target=10 duration? I think 10 is as low as we can > > > > acceptably recommend. Should we recommend they run vacuumdb > > > > twice, once with default_statistics_target = 4, and another with > > > > the default? > > > > > > Here are the results at various settings. > > > > > > 1 : 172198.892 ms > > > 2 : 295536.814 ms > > > 4 : 474319.826 ms > > > 10 : 750458.312 ms > > > 100 : 3433794.609 ms > > > > Thanks, good numbers to know. > > OK, new crazy idea. Kevin has shown that his database can get a single > bucket in 2.8 minutes. What if we have pg_upgrade create a vacuumdb > script that generates increasingly accurate statistics, e.g. it runs for > default_statistics_target values of 1, 10, and default (100). That > would give basic statistics quickly (2.8 minutes), and full statistics > in an hour, for Kevin's database. > > PGOPTIONS='-c default_statistics_target=1' vacuumdb --all --analyze-only > PGOPTIONS='-c default_statistics_target=10' vacuumdb --all > --analyze-only > vacuumdb --all --analyze-only > > The only problem I see is that users who use non-default statistics per > table would not be affected by the increasing default_statistics_target > values. > > The upside is this would work for all releases of Postgres.
OK, I have modified pg_upgrade with the attached patch to do exactly this. I have also attached the script pg_upgrade creates that should be run instead of vacuumdb. Based on Kevin's numbers above, the first vacuumdb will be done in 2.8 minutes (1 target), the 10 target vacuumdb done after 15 minutes, and the 100 target vacuumdb done after 72 minutes (times accumulate). Here is what the output looks like: Generating minimal optimizer statistics (1 target) -------------------------------------------------- vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" vacuumdb: vacuuming database "test" The server is now available with minimal optimizer statistics. Query performance will be optimal once this script completes. Generating medium optimizer statistics (10 targets) --------------------------------------------------- vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" vacuumdb: vacuuming database "test" Generating default (full) optimizer statistics (100 targets?) ------------------------------------------------------------- vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" vacuumdb: vacuuming database "test" Done If we don't want to try migrating the statistics from the old system, this seems like the best approach. Does anyone know how bad the queries will be with only one target? I did see if vacuumdb --analyze-only was somehow being throttled by the vacuum settings, but saw the drive at 100% utilization analying a 36GB table on a 24GB RAM server, so it seems I/O bound. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c new file mode 100644 index cf43384..4481de0 *** a/contrib/pg_upgrade/check.c --- b/contrib/pg_upgrade/check.c *************** static void check_for_reg_data_type_usag *** 23,28 **** --- 23,33 ---- static void check_for_support_lib(ClusterInfo *cluster); static void get_bin_version(ClusterInfo *cluster); + #ifndef WIN32 + #define ECHO_QUOTE "'" + #else + #define ECHO_QUOTE "" + #endif void output_check_banner(bool *live_check) *************** issue_warnings(char *sequence_script_fil *** 193,213 **** void ! output_completion_banner(char *deletion_script_file_name) { /* Did we copy the free space files? */ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804) pg_log(PG_REPORT, ! "Optimizer statistics are not transferred by pg_upgrade so\n" ! "consider running:\n" ! " vacuumdb --all --analyze-only\n" ! "on the newly-upgraded cluster.\n\n"); else pg_log(PG_REPORT, "Optimizer statistics and free space information are not transferred\n" ! "by pg_upgrade so consider running:\n" ! " vacuumdb --all --analyze\n" ! "on the newly-upgraded cluster.\n\n"); pg_log(PG_REPORT, "Running this script will delete the old cluster's data files:\n" --- 198,217 ---- void ! output_completion_banner(char *analyze_script_file_name, ! char *deletion_script_file_name) { /* Did we copy the free space files? */ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804) pg_log(PG_REPORT, ! "Optimizer statistics are not transferred by pg_upgrade so,\n" ! "once you start the new server, consider running:\n" ! " %s\n\n", analyze_script_file_name); else pg_log(PG_REPORT, "Optimizer statistics and free space information are not transferred\n" ! "by pg_upgrade so, once you start the new server, consider running:\n" ! " %s\n\n", analyze_script_file_name); pg_log(PG_REPORT, "Running this script will delete the old cluster's data files:\n" *************** check_new_cluster_is_empty(void) *** 379,384 **** --- 383,477 ---- } + /* + * create_script_for_cluster_analyze() + * + * This incrementally generates better optimizer statistics + */ + void + create_script_for_cluster_analyze(char **analyze_script_file_name) + { + FILE *script = NULL; + + *analyze_script_file_name = pg_malloc(MAXPGPATH); + + prep_status("Creating script to analyze new cluster"); + + snprintf(*analyze_script_file_name, MAXPGPATH, "analyze_new_cluster.%s", + SCRIPT_EXT); + + if ((script = fopen_priv(*analyze_script_file_name, "w")) == NULL) + pg_log(PG_FATAL, "Could not open file \"%s\": %s\n", + *analyze_script_file_name, getErrorText(errno)); + + #ifndef WIN32 + /* add shebang header */ + fprintf(script, "#!/bin/sh\n\n"); + + fprintf(script, "PGOPTIONS='-c default_statistics_target=1'\n"); + /* only need to export once */ + fprintf(script, "export PGOPTIONS\n"); + #else + fprintf(script, "SET PGOPTIONS=-c default_statistics_target=1\n"); + #endif + + fprintf(script, "echo %sGenerating minimal optimizer statistics (1 target)%s\n", + ECHO_QUOTE, ECHO_QUOTE); + fprintf(script, "echo %s--------------------------------------------------%s\n", + ECHO_QUOTE, ECHO_QUOTE); + fprintf(script, "vacuumdb --all --analyze-only\n"); + fprintf(script, "echo\n"); + fprintf(script, "echo %sThe server is now available with minimal optimizer statistics.%s\n", + ECHO_QUOTE, ECHO_QUOTE); + fprintf(script, "echo %sQuery performance will be optimal once this script completes.%s\n", + ECHO_QUOTE, ECHO_QUOTE); + fprintf(script, "echo\n\n"); + + #ifndef WIN32 + fprintf(script, "PGOPTIONS='-c default_statistics_target=10'\n"); + #else + fprintf(script, "SET PGOPTIONS=-c default_statistics_target=10\n"); + #endif + + fprintf(script, "echo %sGenerating medium optimizer statistics (10 targets)%s\n", + ECHO_QUOTE, ECHO_QUOTE); + fprintf(script, "echo %s---------------------------------------------------%s\n", + ECHO_QUOTE, ECHO_QUOTE); + fprintf(script, "vacuumdb --all --analyze-only\n"); + fprintf(script, "echo\n\n"); + + #ifndef WIN32 + fprintf(script, "unset PGOPTIONS\n"); + #else + fprintf(script, "SET PGOPTIONS\n"); + #endif + + fprintf(script, "echo %sGenerating default (full) optimizer statistics (100 targets?)%s\n", + ECHO_QUOTE, ECHO_QUOTE); + fprintf(script, "echo %s-------------------------------------------------------------%s\n", + ECHO_QUOTE, ECHO_QUOTE); + /* Did we copy the free space files? */ + if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804) + fprintf(script, "vacuumdb --all --analyze-only\n"); + else + fprintf(script, "vacuumdb --all --analyze\n"); + + fprintf(script, "echo\n\n"); + fprintf(script, "echo %sDone%s\n", + ECHO_QUOTE, ECHO_QUOTE); + + fclose(script); + + #ifndef WIN32 + if (chmod(*analyze_script_file_name, S_IRWXU) != 0) + pg_log(PG_FATAL, "Could not add execute permission to file \"%s\": %s\n", + *analyze_script_file_name, getErrorText(errno)); + #endif + + check_ok(); + } + + /* * create_script_for_old_cluster_deletion() * diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c new file mode 100644 index b0720c3..fa48ef1 *** a/contrib/pg_upgrade/pg_upgrade.c --- b/contrib/pg_upgrade/pg_upgrade.c *************** int *** 67,72 **** --- 67,73 ---- main(int argc, char **argv) { char *sequence_script_file_name = NULL; + char *analyze_script_file_name = NULL; char *deletion_script_file_name = NULL; bool live_check = false; *************** main(int argc, char **argv) *** 142,147 **** --- 143,149 ---- new_cluster.pgdata, UTILITY_LOG_FILE); check_ok(); + create_script_for_cluster_analyze(&analyze_script_file_name); create_script_for_old_cluster_deletion(&deletion_script_file_name); issue_warnings(sequence_script_file_name); *************** main(int argc, char **argv) *** 149,156 **** pg_log(PG_REPORT, "\nUpgrade Complete\n"); pg_log(PG_REPORT, "----------------\n"); ! output_completion_banner(deletion_script_file_name); pg_free(deletion_script_file_name); pg_free(sequence_script_file_name); --- 151,160 ---- pg_log(PG_REPORT, "\nUpgrade Complete\n"); pg_log(PG_REPORT, "----------------\n"); ! output_completion_banner(analyze_script_file_name, ! deletion_script_file_name); + pg_free(analyze_script_file_name); pg_free(deletion_script_file_name); pg_free(sequence_script_file_name); diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h new file mode 100644 index 46f9169..da37249 *** a/contrib/pg_upgrade/pg_upgrade.h --- b/contrib/pg_upgrade/pg_upgrade.h *************** void check_old_cluster(bool live_check, *** 283,292 **** void check_new_cluster(void); void report_clusters_compatible(void); void issue_warnings(char *sequence_script_file_name); ! void output_completion_banner(char *deletion_script_file_name); void check_cluster_versions(void); void check_cluster_compatibility(bool live_check); void create_script_for_old_cluster_deletion(char **deletion_script_file_name); /* controldata.c */ --- 283,294 ---- void check_new_cluster(void); void report_clusters_compatible(void); void issue_warnings(char *sequence_script_file_name); ! void output_completion_banner(char *analyze_script_file_name, ! char *deletion_script_file_name); void check_cluster_versions(void); void check_cluster_compatibility(bool live_check); void create_script_for_old_cluster_deletion(char **deletion_script_file_name); + void create_script_for_cluster_analyze(char **analyze_script_file_name); /* controldata.c */
analyze_new_cluster.sh
Description: Bourne shell script
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers