> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Milosz Krajewski
> Sent: Monday, July 29, 2002 16:50
> To: [EMAIL PROTECTED]
> Subject: [SQL] How to optimize SQL query ?
>
>
> How to optimize query or just force postgre to do it my way ?
>
> Example:
> table continets (
> id numeric, ..., active numeric
> );
>
> table countries (
> id numeric, id_continent numeric, ..., active numeric
> );
>
> table cities (
> id numeric, id_country numeric, ..., active numeric
> );
>
> relations:
> cities.id_county are in countries.id
> countries.id_continent are on continents.id
>
> Query:
> table temp_cities (
> id_city numeric;
> );
>
>
> temp_cities is temp table which holds few (~20) id of cities, to
> show them.
>
> so:
> select * from cities
> where cities.id in (select id_city from temp_cities);
AVOID: indexes (which you should have defined on primary keys [implicitely
defined by PostgreSQL] and foreign keys [must be defined explicitely]) are
not used
> or:
> select * from cities, temp_cities tmp
> where cities.id = tmp.id_city;
BETTER ;-)
> works fine.
>
> But the problem starts here:
>
> select * from cities, coutries, continets
> where
> (cities.id in (select id_city from temp_cities)) and
> (cities.id_county = countries.id) and
> (countries.id_continent = continents.id) and
> (cities.active = 1) and (coutries.active = 1) and
> (continents.active = 1)
>
> (active means is row active or archive, many of them are active,
> but I have to check it)
>
> Posgre is planning it like this:
> joins cities with coutries
> joins countries with continents
> selects active
> filtering with cities.id (with temp_cities)
>
> If I could force it to filter cities.id first
> (I can do this with Oracle by changing
> "select id_city from temp_cities" to
> "select id_city from temp_cities group by id_city")
> it will work much (1000x) faster.
>
> Can I force postgre do it my way ?
Use the explicit JOIN syntax and join each table one after another in the
order you feel is the more adequate for your query. PostgreSQL will respect
this order.
>From one I understand, you should write it this way:
SELECT
*
FROM
continents
INNER JOIN
countries
ON ( continents.id = country.id_continent )
INNER JOIN
cities
ON ( countries.id = cities.id_country )
INNER JOIN
temp_cities
ON ( cities.id = temp_cities.id )
WHERE
( continents.active = 1 )
AND ( countries.active = 1 )
AND ( cities.active = 1 )
The reason to do so are:
1. Joining first on the tables that contain the less rows contributes to
keep the cartesian product between the joins as low as possible
2. Thus if a continent - respectively country - is not active, it will be
casted out from the join immediately and thus reduce the cartesian product
for the next join(s)
3. Joining on 'temp-cities' allows the usage of the hopefully defined index
I achieved ratio from 10000 to 1 respecting this strategy on a scenario
fairly closed to yours ;-)
Ce.D
> --
> [ Milosz "Krashan" Krajewski ][ mail: [EMAIL PROTECTED], UIN: 1319535 ]
> [ inet: Vilge, Vilgefortz ][ www: http://www.ds2.pg.gda.pl/~krash ]
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
>
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org