Re: [PERFORM] is it possible to get the optimizer to use indexes

2003-12-24 Thread Dave Cramer
Doug,

Yes, it does depend on the locale, you can get around this in 7.4 by
building the index with smart operators

Dave
On Thu, 2003-12-18 at 20:38, Doug McNaught wrote:
> Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> 
> >> It appears that the optimizer only uses indexes for = clause?
> >
> > The optimizer will used indexes for LIKE clauses, so long as the
> > clause is a prefix search, eg:
> >
> > SELECT * FROM test WHERE a LIKE 'prf%';
> 
> Doesn't this still depend on your locale?
> 
> -Doug
> 
> 
> ---(end of broadcast)---
> TIP 3: if posting/reading through Usenet, please send an appropriate
>   subscribe-nomail command to [EMAIL PROTECTED] so that your
>   message can get through to the mailing list cleanly
> 
> 


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] mnogosearch under 7.4 ...

2003-12-24 Thread Marc G. Fournier

G'day all ...

Dave asked me today about 'slow downs' on the search engines, so am
looking at the various queries generated by enabling
log_statement/log_duration, to get a feel for is something is "off" ...
and the following seems a bit weird ...

QueryA and QueryB are the same query, but against two different tables in
the databases ... QueryA takes ~4x longer to run then QueryB, but both
EXPLAINs look similar ... in fact, looking at the EXPLAIN ANALYZE output,
I would expect that QueryB would be the slower of the two ... but, the
actual vs estimated times for ndict5/ndict4 seem off (ndict4 is estimated
high, ndict5 is estimated low) ...

QueryA:

186_archives=# explain analyze SELECT ndict5.url_id,ndict5.intag
 FROM ndict5, url
WHERE ndict5.word_id=1343124681
  AND url.rec_id=ndict5.url_id
  AND ((url.url || '') LIKE 
'http://archives.postgresql.org/%%');
  QUERY PLAN
--
 Nested Loop  (cost=0.00..69799.69 rows=44 width=8) (actual time=113.067..26477.672 
rows=14112 loops=1)
   ->  Index Scan using n5_word on ndict5  (cost=0.00..34321.89 rows=8708 width=8) 
(actual time=27.349..25031.666 rows=15501 loops=1)
 Index Cond: (word_id = 1343124681)
   ->  Index Scan using url_rec_id on url  (cost=0.00..4.06 rows=1 width=4) (actual 
time=0.061..0.068 rows=1 loops=15501)
 Index Cond: (url.rec_id = "outer".url_id)
 Filter: ((url || ''::text) ~~ 'http://archives.postgresql.org/%%'::text)
 Total runtime: 26550.566 ms
(7 rows)

QueryB:

186_archives=# explain analyze SELECT ndict4.url_id,ndict4.intag
 FROM ndict4, url
WHERE ndict4.word_id=-2038735111
  AND url.rec_id=ndict4.url_id
  AND ((url.url || '') LIKE 
'http://archives.postgresql.org/%%');

 QUERY PLAN

 Nested Loop  (cost=0.00..99120.97 rows=62 width=8) (actual time=26.330..6630.581 
rows=2694 loops=1)
   ->  Index Scan using n4_word on ndict4  (cost=0.00..48829.52 rows=12344 width=8) 
(actual time=7.954..6373.098 rows=2900 loops=1)
 Index Cond: (word_id = -2038735111)
   ->  Index Scan using url_rec_id on url  (cost=0.00..4.06 rows=1 width=4) (actual 
time=0.059..0.066 rows=1 loops=2900)
 Index Cond: (url.rec_id = "outer".url_id)
 Filter: ((url || ''::text) ~~ 'http://archives.postgresql.org/%%'::text)
 Total runtime: 6643.462 ms
(7 rows)





Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM]

2003-12-24 Thread Michael Rothschild
Title: Message




I consider using PostgreSQL for a project we have in 
our company and, to get a better picture of the product, I started 
scanning its source code and internal documentation.
Based on what I saw (and maybe I didn't see enough) 
it seems that the optimizer will always decide to repeatedly scan the 
whole row set returned by sub selects in the context of an IN 
clause sequentially, as opposed to what I would expect it to do (which is 
to create some index or hash structure to improve 
performance).
For example, if I have the following 
query:Select * from a where x in (select y from 
b where z=7) 
Then I would expect an index or hash structure to be created 
for b.y when it is first scanned and brought into the cache but I couldn't see 
it happening in the source.
As I said, I only inferred it from reading the 
source - not from actual experiments - so I may be 
wrong.
1. Am I 
wrong?
2. If I'm right, is 
there any plan to change it (after all, in the context of an IN clause, an index 
on the returned row set is all that is needed - the row set itself does not seem 
to matter).
 
Thank 
you,
 
Michael 
Rothschild


Re: [PERFORM] postgresql performance on linux port

2003-12-24 Thread Michael Guerin
Hi Tom,

   I don't believe I did run Analyze, I was under the assumption that the
statistics would have been up to date when the indexes were created.
Thanks for the quick response.

-mike


Tom Lane wrote:

> Michael Guerin <[EMAIL PROTECTED]> writes:
> >I just restored a database running on a solaris box to a linux box
> > and queries take forever to execute.
>
> Did you remember to run ANALYZE?  Have you applied the same
> configuration settings that you were using before?
>
> regards, tom lane


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM]

2003-12-24 Thread Christopher Kings-Lynne
> For example, if I have the following query:
> Select * from a where x in (select y from b where z=7)
> Then I would expect an index or hash structure to be created for b.y
> when it is first scanned and brought into the cache but I couldn't see
> it happening in the source.
> As I said, I only inferred it from reading the source - not from actual
> experiments - so I may be wrong.
> 1. Am I wrong?

You are wrong - this is old behaviour and one of the major speed
improvements of PostgreSQL 7.4 is that IN subqueries now use a hash index
and hence they are much faster.

Chris



---(end of broadcast)---
TIP 8: explain analyze is your friend