Re: [PERFORM] Nested loops overpriced

2007-05-10 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> On Wednesday, 9 May 2007 at 19:40, Tom Lane wrote:
>> Hmmm ... I see at least part of the problem, which is that email_header
>> is joined twice in this query, which means that it's counted twice in
>> figuring the total volume of pages competing for cache space …

Re: [PERFORM] Nested loops overpriced

2007-05-10 Thread Peter Eisentraut
On Wednesday, 9 May 2007 at 19:40, Tom Lane wrote:
> Hmmm ... I see at least part of the problem, which is that email_header
> is joined twice in this query, which means that it's counted twice in
> figuring the total volume of pages competing for cache space.  So the
> thing thinks cache space is ov…

Re: [PERFORM] Nested loops overpriced

2007-05-10 Thread Peter Eisentraut
On Wednesday, 9 May 2007 at 19:40, Tom Lane wrote:
> I remember having dithered about whether
> to try to avoid counting the same physical relation more than once in
> total_table_pages, but this example certainly suggests that we
> shouldn't. Meanwhile, do the estimates get better if you set
> effective_cache_size …

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
>> Are you using any nondefault planner settings?
> random_page_cost = 3
> effective_cache_size = 384MB
>> How big are these tables, anyway?
> email 35 MB
> email_header 421 MB
> mime_part 37 MB
Hmmm ... I see at least part of the problem …
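The settings quoted above can be experimented with per-session before re-running the query; a minimal sketch (the two reported values are from the message, the larger `effective_cache_size` and the placeholder query are illustrative assumptions):

```sql
-- Session-local experiment: no server restart needed,
-- the setting reverts when the session ends.
SHOW random_page_cost;            -- message reports 3
SHOW effective_cache_size;        -- message reports 384MB

-- Try a larger cache estimate, as suggested later in the thread:
SET effective_cache_size = '1GB'; -- hypothetical value, for illustration
EXPLAIN ANALYZE SELECT 1;         -- substitute the real query here
```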

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Peter Eisentraut
On Wednesday, 9 May 2007 at 16:11, Tom Lane wrote:
> Well, there's something funny going on here. You've got for instance
>
> -> Index Scan using email_pkey on email (cost=0.00..3.85 rows=1 width=8)
>    (actual time=0.005..0.005 rows=0 loops=280990)
>    Index Cond: (email.email_id = eh_from…

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Daniel Cristian Cruz
2007/5/9, Gregory Stark <[EMAIL PROTECTED]>:
"Daniel Cristian Cruz" <[EMAIL PROTECTED]> writes:
> -> Nested Loop (cost=0.00..13187.94 rows=93 width=4) (actual time=2.622..125.739 rows=50 loops=1)
>   -> Seq Scan on u (cost=0.00..2838.80 rows=10289 width=4) (actual time=0.012..9.863 rows=10291 loops=1) …

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Gregory Stark
"Daniel Cristian Cruz" <[EMAIL PROTECTED]> writes:
> -> Nested Loop (cost=0.00..13187.94 rows=93 width=4) (actual time=2.622..125.739 rows=50 loops=1)
>   -> Seq Scan on u (cost=0.00..2838.80 rows=10289 width=4) (actual time=0.012..9.863 rows=10291 loops=1)
>   -> Index Scan …

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> On Tuesday, 8 May 2007 at 17:53, Tom Lane wrote:
>> Hmm, I'd have expected it to discount the repeated indexscans a lot more
>> than it seems to be doing for you. As an example in the regression
>> database, note what happens to the inner indexscan cost estimate …

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Daniel Cristian Cruz
I'm having something weird too... Look:

Nested Loop Left Join (cost=93.38..7276.26 rows=93 width=58) (actual time=99.211..4804.525 rows=2108 loops=1)
  -> Hash Join (cost=93.38..3748.18 rows=93 width=4) (actual time=0.686..20.632 rows=45 loops=1)
     Hash Cond: ((u.i)::text = (m.i)::text) …

Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Peter Eisentraut
On Tuesday, 8 May 2007 at 17:53, Tom Lane wrote:
> Hmm, I'd have expected it to discount the repeated indexscans a lot more
> than it seems to be doing for you. As an example in the regression
> database, note what happens to the inner indexscan cost estimate when
> the number of outer tuples grow…

Re: [PERFORM] Nested loops overpriced

2007-05-08 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Note how spectacularly overpriced this plan is.
Hmm, I'd have expected it to discount the repeated indexscans a lot more than it seems to be doing for you. As an example in the regression database, note what happens to the inner indexscan cost estimate …
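The "regression database" experiment Tom describes can be reproduced along these lines. The table `tenk1` and its `unique1` column are part of PostgreSQL's standard regression suite, but the exact queries here are an assumption, since the preview is cut off before the example:

```sql
-- Compare the inner indexscan's estimated cost as the number of outer
-- tuples grows: the planner discounts repeated scans because it expects
-- the index and heap pages to stay cached across iterations.
EXPLAIN SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
WHERE b.unique1 < 10;     -- few outer tuples

EXPLAIN SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
WHERE b.unique1 < 1000;   -- many outer tuples: per-scan cost drops
```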

[PERFORM] Nested loops overpriced

2007-05-08 Thread Peter Eisentraut
This query does some sort of analysis on an email archive:

SELECT eh_subj.header_body AS subject,
       count(distinct eh_from.header_body)
FROM email
  JOIN mime_part USING (email_id)
  JOIN email_header eh_subj USING (email_id…
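The preview truncates the query, so its full text is unknown. Based on the two aliases visible in the thread (eh_subj and eh_from), it presumably joins email_header twice, which is exactly what Tom identifies as the source of the double-counting in total_table_pages. A hypothetical sketch of that shape, not the original query:

```sql
-- Hypothetical reconstruction (the archived preview is truncated).
-- The key point: email_header appears twice, under two aliases, so the
-- planner counts its pages twice when sizing cache competition.
SELECT eh_subj.header_body AS subject,
       count(DISTINCT eh_from.header_body)
FROM email
  JOIN mime_part USING (email_id)
  JOIN email_header eh_subj USING (email_id)  -- first scan of email_header
  JOIN email_header eh_from USING (email_id)  -- second scan, same table
WHERE eh_subj.header_name = 'subject'         -- assumed filters, for illustration
  AND eh_from.header_name = 'from'
GROUP BY eh_subj.header_body;
```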