Interestingly, I tried the new version with and without enable_seqscan on, and the version without indexes performs better because, I think, it returns more rows than an index lookup would enhance.

Thanks again for your help. This is certainly an improvement over my original version.

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Dec 2, 2004, at 6:42 AM, Pierre-Frédéric Caillaud wrote:

Let's re-take your query from the start. At each step you should explain analyze the query to check if it runs smoothly.

1. You want the messages which have no actions. Rather than a subselect, I'd use a LEFT JOIN :

untested syntax :
SELECT m.id FROM message m LEFT JOIN message_action ma ON m.id=ma.messages_id WHERE ma.messages_id IS NULL;


On my machine, I have a zones table with 3000 rows and a cities table with 2 million rows, each place having a zone_id :

EXPLAIN ANALYZE SELECT z.zone_id FROM geo.zones z LEFT JOIN geo.cities c ON c.zone_id=z.zone_id WHERE c.id IS NULL;
Merge Left Join (cost=0.00..142063.06 rows=3663 width=4) (actual time=8726.203..8726.203 rows=0 loops=1)
Merge Cond: ("outer".zone_id = "inner".zone_id)
Filter: ("inner".id IS NULL)
-> Index Scan using zones_pkey on zones z (cost=0.00..99.10 rows=3663 width=4) (actual time=15.027..43.987 rows=3663 loops=1)
-> Index Scan using cities_zones_idx on cities c (cost=0.00..116030.55 rows=2073935 width=8) (actual time=25.164..5823.496 rows=2073935 loops=1)
Total runtime: 8726.327 ms
(6 lignes)


        8 seconds, this gives you an idea with that many records.
        You should check your indexes are used !

Now you have the messages which have no actions, you must get the user email domains :

        SELECT split_part( u.email, '@', 2 ) as domain
                FROM users u, message m
                LEFT JOIN message_action ma ON m.id=ma.messages_id
                WHERE u.id=m.user_id
                        AND ma.messages_id IS NULL;

        Can you time this query ? Are the indexes used ?
        Now, let's remove the duplicates :

        SELECT split_part( u.email, '@', 2 ) as domain
                FROM users u, message m
                LEFT JOIN message_action ma ON m.id=ma.messages_id
                WHERE u.id=m.user_id
                        AND ma.messages_id IS NULL
                GROUP By domain;

        GROUP BY is faster than DISTINCT (in some cases).

How does it go ?

---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings

Reply via email to