Re: [HACKERS] slow IN() clause for many cases

2005-12-02 Thread Martijn van Oosterhout
On Fri, Dec 02, 2005 at 08:18:44AM +, Simon Riggs wrote: > It can, of course, but there must be value in that optimization. > > If you consider how IN () would be transformed into > =ANY(ARRAY(subselect)) you'll see that the subselect values would be > treated as constants that could result in

Re: [HACKERS] slow IN() clause for many cases

2005-12-02 Thread Simon Riggs
On Wed, 2005-11-30 at 07:18 +0100, Martijn van Oosterhout wrote: > And finally, why can't: > > > > > Select * From Sales where month IN ( > > > > select month from time_dimension where FinYear = 2005 and Quarter = 3) > > Be written as: > > Select sales.* From Sales, time_dimension > where mon

Re: [HACKERS] slow IN() clause for many cases

2005-11-29 Thread Tom Lane
Martijn van Oosterhout writes: > Do these constructs have the same semantics w.r.t. NULL? I think so, though it'd be good to read the spec closely. > Currently arrays can't have nulls That's so last week ;-) regards, tom lane ---(end of broadcas

Re: [HACKERS] slow IN() clause for many cases

2005-11-29 Thread Greg Stark
Simon Riggs <[EMAIL PROTECTED]> writes: > IMHO the only way to do joins that access partitions is to do the > constraint exclusion at run time, but I can see thats a longer > conversation than I can start right now. My experience in Oracle was that you can see three different types of partition

Re: [HACKERS] slow IN() clause for many cases

2005-11-29 Thread Martijn van Oosterhout
On Tue, Nov 29, 2005 at 10:53:38PM +, Simon Riggs wrote: > On Tue, 2005-11-29 at 17:21 -0500, Tom Lane wrote: > > regression=# explain select * from tenk1 where unique1 = any (array(select > > f1 from int4_tbl)); > So we could teach the planner to transform: > > IN (subselect) > > into >

Re: [HACKERS] slow IN() clause for many cases

2005-11-29 Thread Simon Riggs
On Tue, 2005-11-29 at 17:21 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > Do you think we'll be able to generate a single ScalarArrayOpExpr from a > > small subselect and pass it through as an indexable expression? > > If you don't mind spelling it with the ARRAY(sub-select)

Re: [HACKERS] slow IN() clause for many cases

2005-11-29 Thread Joe Conway
Tom Lane wrote: Simon Riggs <[EMAIL PROTECTED]> writes: Do you think we'll be able to generate a single ScalarArrayOpExpr from a small subselect and pass it through as an indexable expression? If you don't mind spelling it with the ARRAY(sub-select) syntax, which I think is a Postgres-ism (th

Re: [HACKERS] slow IN() clause for many cases

2005-11-29 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > Do you think we'll be able to generate a single ScalarArrayOpExpr from a > small subselect and pass it through as an indexable expression? If you don't mind spelling it with the ARRAY(sub-select) syntax, which I think is a Postgres-ism (though it's possibl

Re: [HACKERS] slow IN() clause for many cases

2005-11-29 Thread Simon Riggs
On Mon, 2005-10-17 at 12:49 +0100, Simon Riggs wrote: > On Fri, 2005-10-14 at 19:09 -0400, Tom Lane wrote: > > I wrote: > > > I'm thinking that IN should be > > > converted to a ScalarArrayOpExpr, ie > > > > > x = ANY (ARRAY[val1,val2,val3,val4,...]) > > > > Actually, there is one little thing

Re: [HACKERS] slow IN() clause for many cases

2005-10-17 Thread Simon Riggs
On Fri, 2005-10-14 at 19:09 -0400, Tom Lane wrote: > I wrote: > > I'm thinking that IN should be > > converted to a ScalarArrayOpExpr, ie > > > x = ANY (ARRAY[val1,val2,val3,val4,...]) > > Actually, there is one little thing in the way of doing this: it'll > fail if any of the IN-list element

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > Martijn van Oosterhout writes: >> It's called LIMIT and has been supported for a long time. > And if I *don't* want to limit the number of rows I retriev? You still need to do something proactive to inform the system that you want a fast-start plan. What

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Greg Stark
Martijn van Oosterhout writes: > > (I think there really ought to be a bit in the protocol that the client > > sends > > with the query to indicate which is needed. That would be cleaner than > > Oracle's /*+ FIRST_ROW */ and /*+ ALL_ROWS */ hints.) > > It's called LIMIT and has been supported

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Martijn van Oosterhout
On Sun, Oct 16, 2005 at 05:09:57PM -0400, Greg Stark wrote: > Tom Lane <[EMAIL PROTECTED]> writes: > > Certainly, if you do not supply a LIMIT, there is no justification > > at all for expecting the planner to prefer fast-start over > > minimum-total-cost. > > Well figuring out when to prefer one

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > Martijn van Oosterhout writes: > > On Sun, Oct 16, 2005 at 12:03:33PM -0400, Greg Stark wrote: > >> That's true. That's why I was wondering more about cases where the client > >> end > >> was going to read all the records until it found the record it's look

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > The example above raises another idea though. Would it be possible for the > optimizer to recognize when a clause is so expansive that it would be faster > to read the complement than the actual clause as written? Being able to compute the complement, much

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Tom Lane
Martijn van Oosterhout writes: > On Sun, Oct 16, 2005 at 12:03:33PM -0400, Greg Stark wrote: >> That's true. That's why I was wondering more about cases where the client end >> was going to read all the records until it found the record it's looking for >> or found enough records for its purposes.

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Martijn van Oosterhout
On Sun, Oct 16, 2005 at 12:03:33PM -0400, Greg Stark wrote: > Tom Lane <[EMAIL PROTECTED]> writes: > > Another point here is that LIMIT without any ORDER BY isn't an amazingly > > useful case. Neither the old implementation of OR indexscans nor the > > new can produce ordered output, which means y

Re: [HACKERS] slow IN() clause for many cases

2005-10-16 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > If the fraction of records matching the IN-list is so large as to make > that an issue, I'd think the planner would prefer a seqscan anyway. > Besides which, it's a bit silly to worry about whether a plan is > fast-start without taking into account the amount

Re: [HACKERS] slow IN() clause for many cases

2005-10-15 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> Why? They certainly wouldn't be any slower than they are now. > Well currently they get rewritten to use OR which does a single index scan Not in 8.1 it doesn't; that code is long gone. 2005-04-24 21:30 tgl

Re: [HACKERS] slow IN() clause for many cases

2005-10-15 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > Greg Stark <[EMAIL PROTECTED]> writes: > > I would fear queries like > > > SELECT * from tab WHERE x IN (1,2,3) LIMIT 1 > > > Which ought to be instantaneous would become potentially slow. > > Why? They certainly wouldn't be any slower than they are now.

Re: [HACKERS] slow IN() clause for many cases

2005-10-15 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > I would fear queries like > SELECT * from tab WHERE x IN (1,2,3) LIMIT 1 > Which ought to be instantaneous would become potentially slow. Why? They certainly wouldn't be any slower than they are now. regards, tom lane -

Re: [HACKERS] slow IN() clause for many cases

2005-10-15 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > It strikes me that now that we have the bitmap indexscan mechanism, > it wouldn't be hard to do better. I'm thinking that IN should be > converted to a ScalarArrayOpExpr, ie > > x = ANY (ARRAY[val1,val2,val3,val4,...]) > > and then we could treat bot

Re: [HACKERS] slow IN() clause for many cases

2005-10-15 Thread Tom Lane
[EMAIL PROTECTED] writes: > But doesn't "x IN (NULL)" already fail to ever match, similar to "x > = NULL"? (Assuming that compatibility switch isn't enabled) The case I'm worried about is "x IN (1,2,NULL)". This *can* match. Also, it's supposed to return NULL (not FALSE) if it doesn't match. So

Re: [HACKERS] slow IN() clause for many cases

2005-10-15 Thread mark
On Fri, Oct 14, 2005 at 07:09:17PM -0400, Tom Lane wrote: > I wrote: > > I'm thinking that IN should be > > converted to a ScalarArrayOpExpr, ie > > x = ANY (ARRAY[val1,val2,val3,val4,...]) > Actually, there is one little thing in the way of doing this: it'll > fail if any of the IN-list elemen

Re: [HACKERS] slow IN() clause for many cases

2005-10-14 Thread David Fetter
On Fri, Oct 14, 2005 at 07:09:17PM -0400, Tom Lane wrote: > I wrote: > > I'm thinking that IN should be converted to a ScalarArrayOpExpr, > > ie > > > x = ANY (ARRAY[val1,val2,val3,val4,...]) > > Actually, there is one little thing in the way of doing this: it'll > fail if any of the IN-list

Re: [HACKERS] slow IN() clause for many cases

2005-10-14 Thread Tom Lane
I wrote: > I'm thinking that IN should be > converted to a ScalarArrayOpExpr, ie > x = ANY (ARRAY[val1,val2,val3,val4,...]) Actually, there is one little thing in the way of doing this: it'll fail if any of the IN-list elements are NULL, because we have not got support for arrays with null

Re: [HACKERS] slow IN() clause for many cases

2005-10-14 Thread Tom Lane
Andrew - Supernews <[EMAIL PROTECTED]> writes: > With everything in cache, selecting 1000 random rows from a 200k row table, > I get: > > for IN (list): planning time 47ms, execution time 27ms > array/nestloop: planning time 8ms, execution time 34ms The reason for the slow planning time is that I

Re: [HACKERS] slow IN() clause for many cases

2005-10-14 Thread Andrew - Supernews
On 2005-10-12, Andrew - Supernews <[EMAIL PROTECTED]> wrote: > On 2005-10-12, Tom Lane <[EMAIL PROTECTED]> wrote: >> Andrew - Supernews <[EMAIL PROTECTED]> writes: >>> As the number of items in the IN clause increases, the planning time grows >>> rather radically. >> >> I was looking at this yester

Re: [HACKERS] slow IN() clause for many cases

2005-10-12 Thread Andrew - Supernews
On 2005-10-12, Tom Lane <[EMAIL PROTECTED]> wrote: > Andrew - Supernews <[EMAIL PROTECTED]> writes: >> As the number of items in the IN clause increases, the planning time grows >> rather radically. > > I was looking at this yesterday. There is some O(N^2) behavior in > create_bitmap_subplan, stem

Re: [HACKERS] slow IN() clause for many cases

2005-10-12 Thread Jonah H. Harris
other method for >20 numbers (actual number depends on costs) ?Maybe that should be added to TODO ?-Original Message-From: [EMAIL PROTECTED][mailto: [EMAIL PROTECTED]] On Behalf Of Andrew - SupernewsSent: Wednesday, October 12, 2005 1:41 PMTo: pgsql-hackers@postgresql.orgSubject: Re: [

Re: [HACKERS] slow IN() clause for many cases

2005-10-12 Thread Ilia Kantor
DO ? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew - Supernews Sent: Wednesday, October 12, 2005 1:41 PM To: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] slow IN() clause for many cases On 2005-10-11, "Ilia Kantor" <[EMAIL PROTECTED

Re: [HACKERS] slow IN() clause for many cases

2005-10-12 Thread Tom Lane
Andrew - Supernews <[EMAIL PROTECTED]> writes: > As the number of items in the IN clause increases, the planning time grows > rather radically. I was looking at this yesterday. There is some O(N^2) behavior in create_bitmap_subplan, stemming from trying to remove duplicated qual conditions. That

Re: [HACKERS] slow IN() clause for many cases

2005-10-12 Thread Andrew - Supernews
On 2005-10-11, "Ilia Kantor" <[EMAIL PROTECTED]> wrote: > When in clause becomes large enough (>20-30 cases), > It is much better to use "join" way of processing.. or even a different way of writing the IN clause. This one is one I've used after considerable research: select * from table where

Re: [HACKERS] slow IN() clause for many cases

2005-10-11 Thread Jonah H. Harris
true dat :) On 10/12/05, Tom Lane <[EMAIL PROTECTED]> wrote: "Ilia Kantor" <[EMAIL PROTECTED]> writes:> Bitmap Heap Scan on objects_hier  (cost=60.29..179.57 rows=80 width=600)> (actual time=0.835..1.115 rows=138 loops=1) vs>  Merge Join  (cost=62.33..576.80 rows=1117 width=600) (actual> time=0.54

Re: [HACKERS] slow IN() clause for many cases

2005-10-11 Thread Tom Lane
"Ilia Kantor" <[EMAIL PROTECTED]> writes: > Bitmap Heap Scan on objects_hier (cost=60.29..179.57 rows=80 width=600) > (actual time=0.835..1.115 rows=138 loops=1) vs > Merge Join (cost=62.33..576.80 rows=1117 width=600) (actual > time=0.542..2.898 rows=138 loops=1) Hmm, sure looks from here li

Re: [HACKERS] slow IN() clause for many cases

2005-10-11 Thread Ilia Kantor
>> It is bitmap-OR on multiple index(PK) lookups. > Describing it doesn't help. We need an *actual* EXPLAIN ANALYZE. Sure, why not.. 6ms for Bitmap Heap Scan on objects_hier (cost=60.29..179.57 rows=80 width=600) (actual time=0.835..1.115 rows=138 loops=1) Recheck Cond: ((id = 1) OR (id =

Re: [HACKERS] slow IN() clause for many cases

2005-10-11 Thread Josh Berkus
Ilia, > It is bitmap-OR on multiple index(PK) lookups. Describing it doesn't help. We need an *actual* EXPLAIN ANALYZE. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] slow IN() clause for many cases

2005-10-11 Thread Ilia Kantor
>Please post an explain analyze on your query with a 20-30 item IN clause so that we can see what plan is being generated. It is bitmap-OR on multiple index(PK) lookups. ---(end of broadcast)--- TIP 4: Have you searched our list archives?

Re: [HACKERS] slow IN() clause for many cases

2005-10-11 Thread Jonah H. Harris
Please post an explain analyze on your query with a 20-30 item IN clause so that we can see what plan is being generated. On 10/11/05, Ilia Kantor <[EMAIL PROTECTED]> wrote: When in clause becomes large enough (>20-30 cases),It is much better to use "join" way of processing..I mean,"SELECT * FROM t

Re: [HACKERS] slow IN() clause for many cases

2005-10-11 Thread Ilia Kantor
When in clause becomes large enough (>20-30 cases), It is much better to use "join" way of processing.. I mean, "SELECT * FROM table WHERE field IN (1,2...30)" will be slower than "SELECT * FROM table JOIN (SRF returning 1...30) USING(field)" I'm not quite sure, where the difference starts, but