[SQL] Question about slow Select when using 'IN'.

2002-12-02 Thread Mike Winter
Hi all, I hope someone can help me out.

I'm doing single-table select statements on a large table and I could use
some help in speeding it up.

My query is of the form:
SELECT col, count(col) FROM tab WHERE id IN (3,
4,7,2, ...) GROUP BY COL ORDER BY count

for a very large number of rows.

I have an index on id, so the explain looks like:

Aggregate  (cost=12.12..12.14 rows=1 width=5)
  ->  Group  (cost=12.12..12.13 rows=4 width=5)
->  Sort  (cost=12.12..12.12 rows=4 width=5)
  ->  Index Scan using col_id_idx2, col_id_idx2, col_id_idx2,
col_id_idx2 on tab  (cost=0.00..12.08 rows=4 width=5)

So, it does a separate index scan for each row in the IN statement, which
takes forever.

How do I force the query parser to emulate the behaviour displayed by this
query:

SELECT col, count(col) FROM tab WHERE (0 = id % 5) GROUP BY COL ORDER BY
count

Aggregate  (cost=3.75..3.86 rows=2 width=5)
  ->  Group  (cost=3.75..3.81 rows=21 width=5)
->  Sort  (cost=3.75..3.75 rows=21 width=5)
  ->  Index Scan using col_id_idx2 on tab
(cost=0.00..3.29 rows=21 width=5)

Which only does one index scan for an equivelant number of records.

Thanks for any help.  Please cc to my e-mail.




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[SQL] Large Query Question. (Slow Select while using 'IN')

2002-12-03 Thread Mike Winter
Hi all, I hope someone can help me out.

I'm doing single-table select statements on a large table and I could use
some help in speeding it up.

My query is of the form:
SELECT col, count(col) FROM tab WHERE id IN (3,
4,7,2, ...) GROUP BY COL ORDER BY count

for a very large number of rows.

I have an index on id, so the explain looks like:

Aggregate  (cost=12.12..12.14 rows=1 width=5)
  ->  Group  (cost=12.12..12.13 rows=4 width=5)
->  Sort  (cost=12.12..12.12 rows=4 width=5)
col_id_idx2 on tab  (cost=0.00..12.08 rows=4 width=5)

So, it does a separate index scan for each row in the IN statement, which
takes forever.

How do I force the query parser to emulate the behaviour displayed by this
query:

SELECT col, count(col) FROM tab WHERE (0 = id % 5) GROUP BY COL ORDER BY
count

Aggregate  (cost=3.75..3.86 rows=2 width=5)
  ->  Group  (cost=3.75..3.81 rows=21 width=5)
->  Sort  (cost=3.75..3.75 rows=21 width=5)
  ->  Index Scan using col_id_idx2 on tab
(cost=0.00..3.29 rows=21 width=5)

Which only does one index scan for an equivelant number of records.

Thanks for any help.  Please cc to my e-mail.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



[SQL] "SELECT IN" Still Broken in 7.4b

2003-08-20 Thread Mike Winter
I'm sure many on this list are sick of hearing about this problem, but it
was on the fix list for 7.4, but doesn't appear to have been changed.

You can see one of the many threads on the problem at:
http://archives.postgresql.org/pgsql-sql/2003-05/msg00352.php

Basically, queries of the form SELECT  FROM  WHERE  IN
() take forever for high numbers of rows in the IN clause.
We've done timing on 7.3 and 7.4b and there is no speed improvement on
these queries.

Does anyone know what the status of this bug is?
-- 
___
Front Logic Inc.  Tel: 306.653.2725 x14
226 Pacific Ave   or 1.800.521.4510 x14
Saskatoon, SK Fax: 306.653.0972
S7K 1N9  Canada   Cell: 306.717.2550
http://www.frontlogic.com   [EMAIL PROTECTED]




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] "SELECT IN" Still Broken in 7.4b

2003-08-20 Thread Mike Winter
On Wed, 20 Aug 2003, Stephan Szabo wrote:

>
> On Wed, 20 Aug 2003, Mike Winter wrote:
>
> > I'm sure many on this list are sick of hearing about this problem, but it
> > was on the fix list for 7.4, but doesn't appear to have been changed.
>
> IN (subselect) was changed for 7.4 (although I'm not sure of the list
> mentions the difference). I don't know of any major changes to IN
> (valuelist) though.

Thanks, Stephan.  I was really hoping that the IN(valuelist) was going to
be changed at the same time, because it really is unusable for anything
over a couple of thousand values.
-- 
___
Front Logic Inc.  Tel: 306.653.2725 x14
226 Pacific Ave   or 1.800.521.4510 x14
Saskatoon, SK Fax: 306.653.0972
S7K 1N9  Canada   Cell: 306.717.2550
http://www.frontlogic.com   [EMAIL PROTECTED]



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [SQL] "SELECT IN" Still Broken in 7.4b

2003-08-20 Thread Mike Winter
On Wed, 20 Aug 2003, Rod Taylor wrote:

> Ensure your IN list is unique.  You might find better times by through
> an indexed temp table.

That is what I ended up doing, but it's not a very elegant solution.
MySQL does queries of this type orders of magnitudes faster than Postgres
on large value lists, although I have no specific algorithmic solutions to
offer for how to make it faster.  I don't believe making the IN lists
unique has any affect on performance.

-- 
___
Front Logic Inc.  Tel: 306.653.2725 x14
226 Pacific Ave   or 1.800.521.4510 x14
Saskatoon, SK Fax: 306.653.0972
S7K 1N9  Canada   Cell: 306.717.2550
http://www.frontlogic.com   [EMAIL PROTECTED]



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]