Nis Jorgensen wrote:
> Paul Rauch wrote:
>   
>> Is there a way to reduce execution time by counting objects?
>>
>> it's even split into two queries now, with only 2474487 entries ;)
>>
>> in django I use "model.objects.count()"
>>
>> SELECT COUNT(*) FROM "rainbowtables_hashes":34.552
>>
>> SELECT COUNT(*) FROM "rainbowtables_hashes":31.024
>>   
>>     
>
> I guess you are using PostgreSQL. This is a known "issue" with that
> database - count(*) on an entire table (or a large subset) is slow,
> taking time proportional to the number of rows scanned.
>
>   
[snip]
> Hmm, I just realized a generic way of making this work in PostgreSQL
> (if you are happy to do your own DDL). Basically, use an updatable
> view which maintains the additional table behind the scenes. I will
> write that up in more detail later.
>   
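
As an aside: if an approximate count is good enough, the planner's own
statistics can be read directly - cheap, but only as accurate as the
last VACUUM/ANALYZE. A sketch, using the table name from Paul's post:

-- reltuples is the planner's row estimate, maintained by VACUUM/ANALYZE
SELECT reltuples::bigint AS approx_count
FROM pg_class WHERE relname = 'rainbowtables_hashes';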

If anyone is interested, I did a little testing on this. It does seem
to work in some cases, though not with the query plan I expected.

My setup:

CREATE TABLE entity_base (
    id SERIAL PRIMARY KEY,
    f1 varchar(255), f2 varchar(255), f3 varchar(255), f4 varchar(255),
    f5 varchar(255), f6 varchar(255), f7 varchar(255)
);
CREATE TABLE entity_idx (
    id integer PRIMARY KEY REFERENCES entity_base (id) ON DELETE CASCADE
);

CREATE VIEW entity AS (SELECT * FROM entity_base NATURAL JOIN entity_idx );

-- Create one row of data

INSERT INTO entity_base (f1,f2,f3,f4,f5,f6,f7)
    VALUES ('Long random string', 'must be random to avoid compression', ...);

-- Each statement below doubles the number of rows

INSERT INTO entity_base (f1,f2,f3,f4,f5,f6,f7)
    SELECT f1,f2,f3,f4,f5,f6,f7 FROM entity_base;
INSERT INTO entity_base (f1,f2,f3,f4,f5,f6,f7)
    SELECT f1,f2,f3,f4,f5,f6,f7 FROM entity_base;
-- ...
-- 18 times in total (2^18 = 262144 rows, matching the plans below)
-- ...
INSERT INTO entity_base (f1,f2,f3,f4,f5,f6,f7)
    SELECT f1,f2,f3,f4,f5,f6,f7 FROM entity_base;

-- populate the "index table"

INSERT INTO entity_idx SELECT id FROM entity_base;
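
(Not shown in the transcript above, but worth doing after the bulk
load: refresh the planner statistics, since the plans below depend
on them.)

VACUUM ANALYZE entity_base;
VACUUM ANALYZE entity_idx;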

> EXPLAIN ANALYZE SELECT count(*) FROM entity_base;
                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=183116.80..183116.81 rows=1 width=0) (actual time=16005.803..16005.803 rows=1 loops=1)
   ->  Seq Scan on entity_base  (cost=0.00..182461.44 rows=262144 width=0) (actual time=58.861..15893.200 rows=262144 loops=1)
 Total runtime: 16005.852 ms

> EXPLAIN ANALYZE SELECT count(*) FROM entity;
                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=193688.18..193688.19 rows=1 width=0) (actual time=2289.985..2289.986 rows=1 loops=1)
   ->  Merge Join  (cost=0.00..193032.82 rows=262144 width=0) (actual time=0.148..2178.405 rows=262144 loops=1)
         Merge Cond: ("outer".id = "inner".id)
         ->  Index Scan using entity_base_pkey on entity_base  (cost=0.00..184324.58 rows=262144 width=4) (actual time=0.074..1253.894 rows=262144 loops=1)
         ->  Index Scan using entity_idx_pkey on entity_idx  (cost=0.00..4839.58 rows=262144 width=4) (actual time=0.067..291.596 rows=262144 loops=1)
 Total runtime: 2290.056 ms

My hope was that the query could be satisfied by a seq scan on
entity_idx; that rewrite is probably beyond what postgresql's planner
can do. Apparently, however, the count can be satisfied by walking the
two indexes - which at first seems strange: I would have thought at
least one of the tables would need to be scanned to make sure the rows
have not been deleted. (As far as I know, a postgresql index scan
still visits the heap to check each tuple's visibility, so deleted
rows are in fact accounted for.)
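
For comparison - I did not run this - the cheap plan I was hoping for
can be seen by counting the narrow table directly:

EXPLAIN ANALYZE SELECT count(*) FROM entity_idx;

This should produce a plain seq scan over far fewer pages than
entity_base, since entity_idx holds only the id column.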


What is missing above is creating INSERT/UPDATE/DELETE rules for the
"entity" view. I hacked together the ones below - they seem to work,
but there may be subtleties of how rules work that I have not
understood.

CREATE RULE entity_delete AS ON DELETE TO entity DO INSTEAD
    DELETE FROM entity_base WHERE id = OLD.id;

-- Note: match on OLD.id, not NEW.id, so the right row is found even
-- if an update changes the key.
CREATE RULE entity_update AS ON UPDATE TO entity DO INSTEAD
    UPDATE entity_base
       SET f1 = NEW.f1, f2 = NEW.f2, f3 = NEW.f3, f4 = NEW.f4,
           f5 = NEW.f5, f6 = NEW.f6, f7 = NEW.f7
     WHERE id = OLD.id;

CREATE RULE entity_insert AS ON INSERT TO entity DO INSTEAD (
    INSERT INTO entity_base (f1,f2,f3,f4,f5,f6,f7)
        VALUES (NEW.f1, NEW.f2, NEW.f3, NEW.f4, NEW.f5, NEW.f6, NEW.f7);
    INSERT INTO entity_idx VALUES (currval('entity_base_id_seq'::regclass))
);
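
A quick smoke test of the rules (hypothetical values - the point is
just that each rule fires through the view):

-- Insert through the view: the rule writes to both tables.
INSERT INTO entity (f1,f2,f3,f4,f5,f6,f7)
    VALUES ('a','b','c','d','e','f','g');

-- Update and delete through the view; the cascade removes the idx row.
UPDATE entity SET f1 = 'a2' WHERE id = currval('entity_base_id_seq');
DELETE FROM entity WHERE id = currval('entity_base_id_seq');

-- Both tables should agree on the count again afterwards.
SELECT (SELECT count(*) FROM entity_base) AS base_count,
       (SELECT count(*) FROM entity_idx) AS idx_count;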


/Nis
