Re: [HACKERS] pg_upgrade and statistics

2012-03-16 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 11:46:04AM -0400, Bruce Momjian wrote: > On Thu, Mar 15, 2012 at 11:15:42AM -0400, Andrew Dunstan wrote: > > You're not the only person who could do that. I don't think this is > > all down to you. It should just be understood that if the stats > > format is changed, adjusti

Re: [HACKERS] pg_upgrade and statistics

2012-03-16 Thread Ants Aasma
On Thu, Mar 15, 2012 at 8:48 PM, Alvaro Herrera > What Peter proposed seems to me pretty reasonable, in the sense that it > should be possible to come up with a function that creates some text > representation of whatever is in pg_statistic, and another function to > load that data into the new ca

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Alvaro Herrera
Excerpts from Greg Stark's message of jue mar 15 14:45:16 -0300 2012: > On Thu, Mar 15, 2012 at 3:15 PM, Andrew Dunstan wrote: > > > > You're not the only person who could do that. I don't think this is all down > > to you. It should just be understood that if the stats format is changed, > > adj

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Tom Lane
Peter Eisentraut writes: > On tor, 2012-03-15 at 11:15 -0400, Andrew Dunstan wrote: >> I haven't looked at it, but I'm wondering how hard it is going to be >> in practice? > Take a look at the commit log of pg_statistic.h; it's not a lot. That says nothing as all about the cost of dealing with a

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Peter Eisentraut
On tor, 2012-03-15 at 11:15 -0400, Andrew Dunstan wrote: > I haven't looked at it, but I'm wondering how hard it is going to be > in practice? Take a look at the commit log of pg_statistic.h; it's not a lot. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes t

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Greg Stark
On Thu, Mar 15, 2012 at 3:15 PM, Andrew Dunstan wrote: > > You're not the only person who could do that. I don't think this is all down > to you. It should just be understood that if the stats format is changed, > adjusting pg_upgrade needs to be part of the change. When we modified how > enums wo

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Tom Lane
Andrew Dunstan writes: > On 03/15/2012 11:03 AM, Bruce Momjian wrote: >> On Thu, Mar 15, 2012 at 08:22:24AM +0200, Peter Eisentraut wrote: >>> I think this could be budgeted under keeping pg_dump backward >>> compatible. You have to do that anyway for each catalog change, and so >>> doing somethi

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 10:20:02AM -0500, Kevin Grittner wrote: > Bruce Momjian wrote: > > > I think we have two choices --- either migrate the statistics, or > > adopt my approach to generating incremental statistics quickly. > > Does anyone see any other options? > > Would it make any sense

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 11:15:42AM -0400, Andrew Dunstan wrote: > You're not the only person who could do that. I don't think this is > all down to you. It should just be understood that if the stats > format is changed, adjusting pg_upgrade needs to be part of the > change. When we modified how en

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Kevin Grittner
Bruce Momjian wrote: > I think we have two choices --- either migrate the statistics, or > adopt my approach to generating incremental statistics quickly. > Does anyone see any other options? Would it make any sense to modify the incremental approach to do a first pass of any tables with targe

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Andrew Dunstan
On 03/15/2012 11:03 AM, Bruce Momjian wrote: On Thu, Mar 15, 2012 at 08:22:24AM +0200, Peter Eisentraut wrote: On ons, 2012-03-14 at 17:36 -0400, Bruce Momjian wrote: Well, I have not had to make major adjustments to pg_upgrade since 9.0, meaning the code is almost complete unchanged and does

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 08:22:24AM +0200, Peter Eisentraut wrote: > On ons, 2012-03-14 at 17:36 -0400, Bruce Momjian wrote: > > Well, I have not had to make major adjustments to pg_upgrade since 9.0, > > meaning the code is almost complete unchanged and does not require > > additional testing for e

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Peter Eisentraut
On ons, 2012-03-14 at 17:36 -0400, Bruce Momjian wrote: > Well, I have not had to make major adjustments to pg_upgrade since 9.0, > meaning the code is almost complete unchanged and does not require > additional testing for each major release. If we go down the road of > dumping stats, we will nee

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Wed, Mar 14, 2012 at 08:26:06PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > Does anyone know how bad the queries will be with only one target? > > Bad. That cycle seems like largely a waste of time. About the only > thing it would do for you is ensure that relpages/reltuples are up to

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Tom Lane
Bruce Momjian writes: > Does anyone know how bad the queries will be with only one target? Bad. That cycle seems like largely a waste of time. About the only thing it would do for you is ensure that relpages/reltuples are up to date, which seems like something we could possibly arrange for duri

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 09:15:52PM -0400, Bruce Momjian wrote: > On Tue, Mar 13, 2012 at 08:22:51PM -0400, Bruce Momjian wrote: > > On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote: > > > Bruce Momjian wrote: > > > > > > > What is the target=10 duration? I think 10 is as low as we

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Wed, Mar 14, 2012 at 10:40:41PM +0200, Peter Eisentraut wrote: > On tis, 2012-03-13 at 20:34 -0400, Bruce Momjian wrote: > > I frankly am worried that if we copy over statistics even in ASCII > > that don't match what the server expects, it might lead to a crash, > > which has me back to wanting

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Peter Eisentraut
On tis, 2012-03-13 at 20:34 -0400, Bruce Momjian wrote: > I frankly am worried that if we copy over statistics even in ASCII > that don't match what the server expects, it might lead to a crash, > which has me back to wanting to speed up vacuumdb. Why can't we maintain a conversion routine for sta

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Euler Taveira
On 13-03-2012 21:34, Bruce Momjian wrote: > It might be a solution for cases where we don't modify it. I frankly am > worried that if we copy over statistics even in ASCII that don't match > what the server expects, it might lead to a crash, which has me back to > wanting to speed up vacuumdb. >

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 08:22:51PM -0400, Bruce Momjian wrote: > On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote: > > Bruce Momjian wrote: > > > > > What is the target=10 duration? I think 10 is as low as we can > > > acceptably recommend. Should we recommend they run vacuumdb >

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 08:30:17PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > Another idea is to just copy over pg_statistic like we copy of > > pg_largeobject now, and force autovacuum to run. > > That would be an automatic crash in a 9.1 to 9.2 migration, as well as > any other release

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Tom Lane
Bruce Momjian writes: > Another idea is to just copy over pg_statistic like we copy of > pg_largeobject now, and force autovacuum to run. That would be an automatic crash in a 9.1 to 9.2 migration, as well as any other release where we changed the column layout of pg_statistic.

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote: > Bruce Momjian wrote: > > > What is the target=10 duration? I think 10 is as low as we can > > acceptably recommend. Should we recommend they run vacuumdb > > twice, once with default_statistics_target = 4, and another with > > t

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Andrew Dunstan
On 03/13/2012 06:30 PM, Robert Haas wrote: On Tue, Mar 13, 2012 at 5:42 PM, Bruce Momjian wrote: What is the target=10 duration? I think 10 is as low as we can acceptably recommend. Should we recommend they run vacuumdb twice, once with default_statistics_target = 4, and another with the de

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Daniel Farina
On Tue, Mar 13, 2012 at 3:30 PM, Robert Haas wrote: > On Tue, Mar 13, 2012 at 5:42 PM, Bruce Momjian wrote: >> What is the target=10 duration?  I think 10 is as low as we can >> acceptably recommend.  Should we recommend they run vacuumdb twice, once >> with default_statistics_target = 4, and ano

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Robert Haas wrote: > I'm not sure why we're so glibly rejecting Dan's original > proposal. Sure, adjusting pg_upgrade when we whack around > pg_statistic is work, but who ever said that a workable in-place > upgrade facility would be maintenance-free? We're operating under > a number of restri

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian wrote: > What is the target=10 duration? I think 10 is as low as we can > acceptably recommend. Should we recommend they run vacuumdb > twice, once with default_statistics_target = 4, and another with > the default? Here are the results at various settings. 1 : 172198.892

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Robert Haas
On Tue, Mar 13, 2012 at 5:42 PM, Bruce Momjian wrote: > What is the target=10 duration?  I think 10 is as low as we can > acceptably recommend.  Should we recommend they run vacuumdb twice, once > with default_statistics_target = 4, and another with the default? I'm not sure why we're so glibly r

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 03:29:22PM -0500, Kevin Grittner wrote: > I went even lower than you suggested: > > set default_statistics_target = 4; > > And it was much faster, but still more time than the pg_upgrade run > itself: > > cir=# analyze; > ANALYZE > Time: 474319.826 ms > > A little un

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 09:28:33PM +, Greg Stark wrote: > hmph. One thing that could speed up analyze on raid arrays would be > doing prefetching so more than one spindle can be busy. Sacrificing > statistical accuracy by reading a less random sample on contiguous > blocks of rows would also be

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Greg Stark
On Tue, Mar 13, 2012 at 7:30 PM, Bruce Momjian wrote: > OK, so a single 44GB tables took 2.5 minutes to analyze;  that is not > good.  It would require 11 such tables to reach 500GB (0.5 TB), and > would take 27 minutes.  The report I had was twice as long, but still in > the ballpark of "too long

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian wrote: > OK, good idea. Kevin, can you test this: > > PGOPTIONS='-c default_statistics_target=10' vacuumdb --all > --analyze-only > > Is it faster? Thanks. Well, I just did something similar in psql -- I disabled the delays by: set vacuum_cost_delay = 0; I checked f

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 10:10:02PM +0200, Peter Eisentraut wrote: > On tis, 2012-03-13 at 15:44 -0400, Tom Lane wrote: > > I wonder whether it'd be worth recommending that people do an initial > > ANALYZE with a low stats target, just to get some stats in place, > > and then go back to analyze at w

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian wrote: > OK, so a single 44GB tables took 2.5 minutes to analyze; that is > not good. It would require 11 such tables to reach 500GB (0.5 > TB), and would take 27 minutes. The report I had was twice as > long, but still in the ballpark of "too long". :-( But it's really 600 t

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Peter Eisentraut
On tis, 2012-03-13 at 15:44 -0400, Tom Lane wrote: > I wonder whether it'd be worth recommending that people do an initial > ANALYZE with a low stats target, just to get some stats in place, > and then go back to analyze at whatever their normal setting is. Perhaps going even further, ANALYZE coul

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian wrote: cir=# analyze "CaseHist"; ANALYZE Time: 143450.467 ms > OK, so a single 44GB tables took 2.5 minutes to analyze; that is > not good. It would require 11 such tables to reach 500GB (0.5 > TB), and would take 27 minutes. The report I had was twice as > long

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Tom Lane
Greg Stark writes: > On Tue, Mar 13, 2012 at 1:38 AM, Daniel Farina wrote: >> You probably are going to ask: "why not just run ANALYZE and be done >> with it?" > Uhm yes. If analyze takes a long time then something is broken. It's > only reading a sample which should be pretty much a fixed numbe

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 02:07:14PM -0500, Kevin Grittner wrote: > Bruce Momjian wrote: > > On Tue, Mar 13, 2012 at 01:18:58PM -0500, Kevin Grittner wrote: > > >> cir=# analyze "CaseHist"; > >> ANALYZE > >> Time: 143450.467 ms > >> cir=# select relpages, reltuples from pg_class where relname = >

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
"Kevin Grittner" wrote: > Bruce Momjian wrote: >> That is 2.5 minutes. How large is that database? I dug around a little and found that we had turned on vacuum cost limits on the central databases, because otherwise the web team complained about performance during maintenance windows. On th

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian wrote: > On Tue, Mar 13, 2012 at 01:18:58PM -0500, Kevin Grittner wrote: >> cir=# analyze "CaseHist"; >> ANALYZE >> Time: 143450.467 ms >> cir=# select relpages, reltuples from pg_class where relname = >> 'CaseHist'; >> relpages | reltuples >> --+- >> 3588

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 01:18:58PM -0500, Kevin Grittner wrote: > Greg Stark wrote: > > Daniel Farina wrote: > >> You probably are going to ask: "why not just run ANALYZE and be > >> done with it?" > > > > Uhm yes. If analyze takes a long time then something is broken. > > It's only reading a sa

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 05:46:06PM +, Greg Stark wrote: > On Tue, Mar 13, 2012 at 1:38 AM, Daniel Farina wrote: > > You probably are going to ask: "why not just run ANALYZE and be done > > with it?" > > Uhm yes. If analyze takes a long time then something is broken. It's > only reading a samp

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Greg Stark wrote: > Daniel Farina wrote: >> You probably are going to ask: "why not just run ANALYZE and be >> done with it?" > > Uhm yes. If analyze takes a long time then something is broken. > It's only reading a sample which should be pretty much a fixed > number of pages per table. It shoul

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Greg Stark
On Tue, Mar 13, 2012 at 1:38 AM, Daniel Farina wrote: > You probably are going to ask: "why not just run ANALYZE and be done > with it?" Uhm yes. If analyze takes a long time then something is broken. It's only reading a sample which should be pretty much a fixed number of pages per table. It sho

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 12:08:41PM -0300, Alvaro Herrera wrote: > > > You're wrong. Autovacuum does not consider time, only dead/live tuple > > > counts. The formulas it uses are in the autovacuum docs; some details > > > (such as the fact that it skips tables that do not have stat entries) > > >

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 09:28:58AM -0500, Kevin Grittner wrote: > Bruce Momjian wrote: > > > I just received a sobering blog comment stating that pg_upgrade > > took 5 minutes on a 0.5TB database, but analyze took over an hour: > > Yeah, we have had similar experiences. Even if this can't be

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of mar mar 13 11:49:26 -0300 2012: > > On Tue, Mar 13, 2012 at 11:34:16AM -0300, Alvaro Herrera wrote: > > > > Excerpts from Bruce Momjian's message of mar mar 13 11:14:43 -0300 2012: > > > On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: > > >

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 11:34:16AM -0300, Alvaro Herrera wrote: > > Excerpts from Bruce Momjian's message of mar mar 13 11:14:43 -0300 2012: > > On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: > > > On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian wrote: > > > > To answer your specif

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of mar mar 13 11:14:43 -0300 2012: > On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: > > On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian wrote: > > > To answer your specific question, I think clearing the last analyzed > > > fields should cause

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian wrote: > I just received a sobering blog comment stating that pg_upgrade > took 5 minutes on a 0.5TB database, but analyze took over an hour: Yeah, we have had similar experiences. Even if this can't be done for every release or for every data type, bringing over statistics from

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: > On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian wrote: > > To answer your specific question, I think clearing the last analyzed > > fields should cause autovacuum to run on analyze those tables.  What I > > don't know is whether not c

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 12:12:27AM -0400, Tom Lane wrote: > Bruce Momjian writes: > > Copying the statistics from the old server is on the pg_upgrade TODO > > list. I have avoided it because it will add an additional requirement > > that will make pg_upgrade more fragile in case of major version

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Daniel Farina
On Mon, Mar 12, 2012 at 9:12 PM, Tom Lane wrote: > Bruce Momjian writes: >> Copying the statistics from the old server is on the pg_upgrade TODO >> list.  I have avoided it because it will add an additional requirement >> that will make pg_upgrade more fragile in case of major version changes. >

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Daniel Farina
On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian wrote: > To answer your specific question, I think clearing the last analyzed > fields should cause autovacuum to run on analyze those tables.  What I > don't know is whether not clearing the last vacuum datetime will cause > the table not to be analy

Re: [HACKERS] pg_upgrade and statistics

2012-03-12 Thread Tom Lane
Bruce Momjian writes: > Copying the statistics from the old server is on the pg_upgrade TODO > list. I have avoided it because it will add an additional requirement > that will make pg_upgrade more fragile in case of major version changes. > Does anyone have a sense of how often we change the st

Re: [HACKERS] pg_upgrade and statistics

2012-03-12 Thread Bruce Momjian
On Mon, Mar 12, 2012 at 06:38:30PM -0700, Daniel Farina wrote: > You probably are going to ask: "why not just run ANALYZE and be done > with it?" The reasons are: > > * ANALYZE can take a sufficiently long time on large databases that > the downtime of switching versions is not attractive > >