Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-27 Thread Eric Comeau
> It really has very little impact. It only affects index scans, and even then only if effective_cache_size is less than the size of the table. Essentially, when this kicks in, it models the effect that if you are index scanning a table much larger than the size of your cache, …

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-25 Thread Robert Haas
On Fri, Jul 24, 2009 at 1:13 AM, Marc Cousin wrote: > It really has very little impact. It only affects index scans, and even then only if effective_cache_size is less than the size of the table. Essentially, when this kicks in, it models the effect that if you are index scanning a …
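For readers skimming the thread, a minimal sketch of the effect being described; the table name and sizes here are illustrative, not taken from the thread:

    -- Hypothetical experiment: the planner charges more for a large index scan
    -- once effective_cache_size drops below the size of the table being scanned.
    SHOW effective_cache_size;
    SET effective_cache_size = '8GB';     -- larger than the (hypothetical) table
    EXPLAIN SELECT * FROM file WHERE pathid BETWEEN 1 AND 100000;
    SET effective_cache_size = '128MB';   -- much smaller than the table
    EXPLAIN SELECT * FROM file WHERE pathid BETWEEN 1 AND 100000;
    -- Comparing the two plans' index scan costs shows the model kicking in.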

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-23 Thread Marc Cousin
> It really has very little impact. It only affects index scans, and even then only if effective_cache_size is less than the size of the table. Essentially, when this kicks in, it models the effect that if you are index scanning a table much larger than the size of your cache, you might …

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-23 Thread Robert Haas
On Thu, Jul 16, 2009 at 6:30 PM, Kevin Grittner wrote: > Marc Cousin wrote: >> As mentioned in another mail in the thread (from Richard Huxton), I felt this passage in the documentation was a bit misleading: "effective_cache_size (integer): Sets the planner's assumption about the effe…"

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Kevin Grittner
Marc Cousin wrote: > As mentioned in another mail in the thread (from Richard Huxton), I felt this passage in the documentation was a bit misleading: "effective_cache_size (integer): Sets the planner's assumption about the effective size of the disk cache that is available to a single …"

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Devin Ben-Hur
Marc Cousin wrote: > On Thursday 16 July 2009 22:07:25, Kevin Grittner wrote: >> Marc Cousin wrote: >>> the hot parts of these 2 tables are extremely likely to be in the database or Linux cache (buffer hit rate was 97% in the example provided). Moreover, the first two queries of the insert procedure …
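As an aside, the buffer hit rate Marc cites can be estimated from PostgreSQL's own counters; a sketch using the standard statistics views (this sees shared buffers only, not the Linux page cache):

    -- Approximate cache hit ratio across user tables.
    SELECT sum(heap_blks_hit)::float
           / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS hit_ratio
    FROM pg_statio_user_tables;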

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Marc Cousin
On Thursday 16 July 2009 23:54:54, Kevin Grittner wrote: > Marc Cousin wrote: >> to sum it up, should I keep these values (I hate doing this :))? > Many people need to set random_page_cost and/or seq_page_cost to reflect the overall effect of caching on the active portion of the data …

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Kevin Grittner
Marc Cousin wrote: > to sum it up, should I keep these values (I hate doing this :))? Many people need to set random_page_cost and/or seq_page_cost to reflect the overall effect of caching on the active portion of the data. We set our fully-cached databases to 0.1 for both. Databases w…
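A sketch of how such settings might be made to stick for a single database rather than cluster-wide; the database name echoes the thread's bacula catalog, and the values are simply the ones Kevin quotes:

    -- Persist the reduced page costs for one mostly-cached database.
    ALTER DATABASE bacula SET seq_page_cost = 0.1;
    ALTER DATABASE bacula SET random_page_cost = 0.1;
    -- New sessions inherit the values; verify with SHOW seq_page_cost;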

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Marc Cousin
On Thursday 16 July 2009 22:07:25, Kevin Grittner wrote: > Marc Cousin wrote: >> the hot parts of these 2 tables are extremely likely to be in the database or Linux cache (buffer hit rate was 97% in the example provided). Moreover, the first two queries of the insert procedure fill …

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Kevin Grittner
Marc Cousin wrote: > the hot parts of these 2 tables are extremely likely to be in the database or Linux cache (buffer hit rate was 97% in the example provided). Moreover, the first two queries of the insert procedure fill the cache for us... This would be why the optimizer does the be…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Marc Cousin
On Thursday 16 July 2009 07:20:18, Marc Cousin wrote: > On Thursday 16 July 2009 01:56:37, Devin Ben-Hur wrote: >> Marc Cousin wrote: >>> This mail contains the requested plans: Plan 1, around 1 million records to insert, seq_page_cost 1, random_page_cost 4 … -> Has…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-15 Thread Marc Cousin
On Thursday 16 July 2009 01:56:37, Devin Ben-Hur wrote: > Marc Cousin wrote: >> This mail contains the requested plans: Plan 1, around 1 million records to insert, seq_page_cost 1, random_page_cost 4 … -> Hash (cost=425486.72..425486.72 rows=16746972 width=92) (actual t…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-15 Thread Scott Carey
On 7/15/09 4:56 PM, Devin Ben-Hur wrote: > Marc Cousin wrote: >> This mail contains the requested plans: Plan 1, around 1 million records to insert, seq_page_cost 1, random_page_cost 4 … -> Hash (cost=425486.72..425486.72 rows=16746972 width=92) (actual time=23184.196..231…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-15 Thread Devin Ben-Hur
Marc Cousin wrote: > This mail contains the requested plans: Plan 1, around 1 million records to insert, seq_page_cost 1, random_page_cost 4 … -> Hash (cost=425486.72..425486.72 rows=16746972 width=92) (actual time=23184.196..23184.196 rows=16732049 loops=1) -> Seq Scan on …

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-15 Thread Marc Cousin
On Wednesday 15 July 2009 15:45:01, Alvaro Herrera wrote: > Marc Cousin wrote: >> There are other things I am thinking of: maybe it would be better to have sort space on another (and not DRBD'ed) RAID set? We have quite a cheap setup right now for the database, and I think maybe t…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-15 Thread Alvaro Herrera
Marc Cousin wrote: > There are other things I am thinking of: maybe it would be better to have sort space on another (and not DRBD'ed) RAID set? We have quite a cheap setup right now for the database, and I think maybe this would help scale better. I can get a filesystem in another v…
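If moving sort spill space is the direction being discussed, PostgreSQL 8.3's temp_tablespaces offers one way to do it; a hedged sketch, with an invented mount point:

    -- Put sort/hash spill files on a separate filesystem from the data.
    -- /mnt/fastdisk/pgtemp is a placeholder; it must exist and be owned
    -- by the postgres OS user before CREATE TABLESPACE will accept it.
    CREATE TABLESPACE temp_space LOCATION '/mnt/fastdisk/pgtemp';
    SET temp_tablespaces = 'temp_space';                  -- per session, or:
    ALTER DATABASE bacula SET temp_tablespaces = 'temp_space';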

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-15 Thread Marc Cousin
This mail contains the requested plans. I've produced them with the different configurations, since I had already gone to the effort of setting up the whole thing :) Stats were updated between all runs. Each time is the first run of the query (that's what we have in production with bacula), and I added the executor s…
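The comparison Marc describes can be reproduced with session-level settings; an illustrative sketch in which the query is a stand-in, not the actual bacula insert:

    -- Plan the same join under the two cost regimes from the thread.
    SET seq_page_cost = 1;    SET random_page_cost = 4;     -- stock defaults
    EXPLAIN ANALYZE SELECT * FROM batch b JOIN filename f ON f.name = b.name;
    SET seq_page_cost = 0.01; SET random_page_cost = 0.02;  -- "cached" costs
    EXPLAIN ANALYZE SELECT * FROM batch b JOIN filename f ON f.name = b.name;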

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-14 Thread Marc Cousin
On Tuesday 14 July 2009 10:23:25, Richard Huxton wrote: > Marc Cousin wrote: >> Temporarily I pushed the problem out to somewhat larger batch sizes by changing random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel like a sorcerer's apprentice with this, as I told PostgreSQL that fetc…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-14 Thread Marc Cousin
On Tuesday 14 July 2009 10:15:21, you wrote: > Marc Cousin wrote: >> Your effective_cache_size is really small for the system you seem to have: it's the size of the I/O caching your OS is doing, and it uses no resources itself. And 800MB of that on a system with that amount of data seems …

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-14 Thread Richard Huxton
Marc Cousin wrote: > Temporarily I pushed the problem out to somewhat larger batch sizes by changing random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel like a sorcerer's apprentice with this, as I told PostgreSQL that fetching rows from disk is much cheaper than it really is. These values ar…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-14 Thread Richard Huxton
Marc Cousin wrote: > Your effective_cache_size is really small for the system you seem to have: it's the size of the I/O caching your OS is doing, and it uses no resources itself. And 800MB of that on a system with that amount of data seems a bit unlikely ;-) Using `free` you can see the amount of I/O cach…
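A sketch of the sizing rule being alluded to, with invented numbers; on Linux the "cached" column of `free` approximates the OS I/O cache:

    -- Suppose `free` reported (illustrative figures):
    --            total       used      free   buffers    cached
    -- Mem:    16331564   16128092    203472    123456  12345678
    -- "cached" is roughly 12GB here, so a value in that neighbourhood
    -- is a defensible starting point:
    SET effective_cache_size = '12GB';
    -- or persistently in postgresql.conf:  effective_cache_size = 12GB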

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-13 Thread Marc Cousin
> While this is not your question, I still noticed you seem to be on 8.3; it might be a bit faster to use GROUP BY instead of DISTINCT. It didn't make a big difference; I had already tried that for this query. Anyway, as you said, it's not the query that has the problem :) > Your effective_cac…
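For reference, the rewrite being discussed looks like this (table and column names invented); before 8.4, SELECT DISTINCT always implied a sort, whereas GROUP BY could use a HashAggregate:

    -- DISTINCT on 8.3: Sort + Unique.
    SELECT DISTINCT pathid FROM batch;
    -- Equivalent GROUP BY form, eligible for HashAggregate on 8.3:
    SELECT pathid FROM batch GROUP BY pathid;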

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-13 Thread Andres Freund
Hi Marc, I don't have really extensive comments, but I found two small things... On Monday 13 July 2009 15:40:18, Marc Cousin wrote: > I'm trying to solve big performance issues with PostgreSQL + bacula while inserting very big sets of records. I'm sorry, this email will be a long one, as I'…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-13 Thread Marc Cousin
We regularly run all of dbcheck. This is our real configuration; there really are lots of servers and lots of files (500 million files backed up every month). But thanks for mentioning it. The thing is, we're trying to improve bacula with PostgreSQL in order to make it able to cope with this…

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-13 Thread SystemManagement
Hi, just a remark, as the number of entries seems to be very high: did you ever run bacula's dbcheck program, Option 16? Regards, Reiner. Marc Cousin wrote: > Hi, I'm trying to solve big performance issues with PostgreSQL + bacula while inserting very big sets of records. I'm sorry, thi…

[PERFORM] Very big insert/join performance problem (bacula)

2009-07-13 Thread Marc Cousin
Hi, I'm trying to solve big performance issues with PostgreSQL + bacula while inserting very big sets of records. I'm sorry, this email will be a long one: I've already spent quite a lot of time on the issue, and I don't want to waste your time speculating on things I may already have done, and …
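For context, the insert the whole thread revolves around follows a spool-then-join pattern along these lines; a simplified sketch of bacula's batch insert, not its exact DDL:

    -- Spool the backed-up file list into a temp table, then join it against
    -- the large path and filename tables to produce the final file rows.
    CREATE TEMPORARY TABLE batch (path text, name text, lstat text, md5 text);
    COPY batch FROM '/var/tmp/filelist';   -- placeholder path

    INSERT INTO file (pathid, filenameid, lstat, md5)
    SELECT p.pathid, f.filenameid, b.lstat, b.md5
    FROM batch b
    JOIN path p     ON p.path = b.path
    JOIN filename f ON f.name = b.name;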