I realize that this is somewhat off topic, but I
have seen far better improvments in query speeds
when using a multifield index instead of individual 
ones. So, for the given select below:

SELECT km.km, su.su 
FROM km, su 
WHERE km.id = su.id 
AND km.rel = su.rel
AND km.item = su.item 
AND su ~ '^ethanol';

multifield indexes should be created:

CREATE INDEX k_km on km(id, rel, item);
CREATE INDEX k_su on su(id, rel, item);

and the query should be rewritten as:

SELECT km.km, su.su 
FROM km, su 
WHERE 
(km.id,km.rel,km.item) = (su.id,su.rel,su.item)
AND su ~ '^ethanol';

Perhaps the distribution of your data is more 
"normal" using multikey indexes (I hope).  Note
that if you have a multifield index, and you still
submit your original query, the planner/optimizer
doesn't appear smart enough to use the multikey 
index except for the first join condition, so to
realize all of the benefits, you have to use the

WHERE
(t1.field1,t1.field2) = (t2.field1,t2.field2)

construction instead of:

WHERE
t1.field1 = t2.field1 AND
t1.field2 = t2.field2

On another note, looking through the backend 
sources reveals that, at one point, partial indices
were once allowed:

CREATE INDEX k_km on km(item) WHERE item <> 1;

but the grammer has been removed and is now 
illegal (although the code to handle it still
exists). Perhaps someday this feature will be 
restored.

Hope the above helps some,

Mike Mascari
([EMAIL PROTECTED])

--- "Gene Selkov, Jr." <[EMAIL PROTECTED]> wrote:
> I am wondering whether it's normal to see dropping a
> btree improve the
> query which could never complete enough that it
> completes in a blink?
> 
> I realize the data I have here represent the worst
> possible input to
> btree, and I'm probably better off without any index
> at all, but I
> guess it is something that the optimizer or the
> access method itself
> should be able to decide.
> 
> I am joining two tables, "km" and "su" on an int2
> attribute, "item".
> Just take a look at the value histograms for item:
> 
> emp=> SELECT item, count (*) AS count FROM km GROUP
> BY item;
> item|count
> ----+-----
>    1|31262
>    2|  110
>    3|    3
>    4|    1
> (4 rows)
> 
> emp=> SELECT item, count (*) AS count FROM su GROUP
> BY item;
> item|count
> ----+-----
>    1|94108
>    2| 1697
>    3|  773
>    4|  482
>    5|  237
>    6|  146
>    7|  105
>    8|   68
>    9|   41
>   10|   29
>   11|   22
>   12|   15
>   13|   13
>   14|    8
>   15|    7
>   16|    6
>   17|    5
>   18|    4
>   19|    4
>   20|    4
>   21|    4
>   22|    3
>   23|    3
>   24|    3
>   25|    1
>   26|    1
>   27|    1
>   28|    1
>   29|    1
>   30|    1
>   31|    1
> 
> As a default rule, I used to create the btree
> indices for all integer
> types, regardless of their values. Not anymore. It
> took me quite a
> while to figure that the following query did not
> work because of the
> faulty btree index on "item" (other joined
> attributes are char()):
> 
> SELECT km.km, su.su 
>   FROM km, su 
>   WHERE km.id = su.id 
>     AND km.rel = su.rel
>     AND km.item = su.item 
>     AND su ~ '^ethanol';
> 
> Can the btree or any other AM be smart enough and
> bail out from CREATE
> INDEX saying, "your data isn't worth indexing"?
> 
> --Gene


=====

__________________________________________________
Do You Yahoo!?
Bid and sell for free at http://auctions.yahoo.com

************

Reply via email to