Oh - if you do this then make sure that you have the primary key index on overview too.
Alex On Jan 14, 2008 12:53 AM, Alex Turner <[EMAIL PROTECTED]> wrote: > If you combine it with Tom Lane's suggestion - it will go even better, > something like: > > select * from t_documentcontent where _id in (select _id from overview > where _id>x order by _id limit 50); > > Alex > > > On Jan 13, 2008 11:59 PM, pepone. onrez <[EMAIL PROTECTED]> wrote: > > > > > Thanks Alex > > > > I test your solution and is realy more faster. > > > > Nested Loop (cost=1743.31..2044.58 rows=50 width=908) (actual time= > > 101.695..106.178 rows=50 loops=1) > > -> HashAggregate (cost=1743.31..1743.31 rows=50 width=108) (actual > > time=101.509..101.567 rows=50 loops=1) > > -> Subquery Scan "IN_subquery" (cost=1741.60..1743.19 rows=50 > > width=108) (actual time=101.327..101.456 rows=50 loops=1) > > -> Limit (cost=1741.60..1742.69 rows=50 width=108) > > (actual time=101.313..101.383 rows=50 loops=1) > > -> Seq Scan on overview > > (cost=0.00..3283.07rows=150807 width=108) (actual time= > > 0.036..72.249 rows=80050 loops=1) > > -> Index Scan using i_documentcontent_id on t_documentcontent (cost= > > 0.00..6.01 rows=1 width=908) (actual time=0.083..0.085 rows=1 loops=50) > > Index Cond: ((t_documentcontent._id)::text = > > ("outer"._id)::text) > > Total runtime: 106.323 ms > > > > I now need to see what trigers i need to add, and test the insertions. > > Thanks again > > > > > > On Jan 14, 2008 5:54 AM, Alex Turner < [EMAIL PROTECTED]> wrote: > > > > > Here is a table I threw together to demonstrate the approximate speed > > > of a materialized view in this case: > > > > > > trend=# explain analyze select property_id from overview order by > > > property_id limit 50 offset 50000; > > > QUERY > > > PLAN > > > > > > ---------------------------------------------------------------------------------------------------------------------------- > > > Limit (cost=19112.75..19112.88 rows=50 width=8) (actual time= > > > 446.048..446.125 rows=50 loops=1) > > > -> Sort (cost=18987.75..19400.49 rows=165094 width=8) (actual > > > time=384.788..424.433 rows=50050 loops=1) > > > Sort Key: property_id > > > -> Seq Scan on overview (cost=0.00..2501.94 rows=165094 > > > width=8) (actual time= 0.012..88.691 rows=173409 loops=1) > > > Total runtime: 447.578 ms > > > (5 rows) > > > > > > trend=# select count(*) from overview; > > > count > > > -------- > > > 173409 > > > (1 row) > > > > > > trend=# > > > > > > It's not great - but it's better than 47 seconds (The machine I'm > > > running it on is far from big iron, so these results should be fairly > > > typical for any modern x86 box - also this materialized view is almost > > > certainly in RAM, and therefore IO speed is irrelevant). > > > > > > Tom lane has already suggested another approach, whereby you order > > > your results, then select the next 10 from the set where the id is greater > > > than the greatest of the last one: > > > > > > select id from overview order by id limit 50; > > > > > > x = get row['id'] // for row 1 > > > do something > > > x=get row['id'] // for row 2 > > > do something > > > ... > > > x=get row['id'] // for row 50 > > > > > > select id from overview where id>x order by id limit 50. > > > > > > The order by is relevant and infact imperative, because you must order > > > your results somehow, otherwise your pagination will produce different > > > results each time you try it as database updates will affect the order the > > > rows come back by default without an order by clause. > > > > > > Let me say that again to be clear: The order rows come back if you > > > don't specify an order by can change! so pulling rows without an order by > > > is a REALLY bad idea. This will break your pagination if a database > > > update > > > happens between someone viewing a page and hitting next to view the next > > > page. > > > > > > Alex > > > > > > > > > On Jan 13, 2008 11:43 PM, Alex Turner <[EMAIL PROTECTED] > wrote: > > > > > > > If you have to access the data this way (with no where clause at all > > > > - which sometimes you do) then I have already provided a solution that > > > > will > > > > work reasonably well. If you create what is essentially a materialized > > > > view > > > > of just the id field, the sequence scan will return much fewer pages > > > > than > > > > when you do it on the main table. Then you join it to the indexed main > > > > table, and page in just the rows you need. Voila - much faster result. > > > > Of > > > > course we haven't really talked about how that will affect insert speed > > > > and > > > > delete speed if you trigger then up, but you haven't really talked > > > > about any > > > > requirements there. > > > > > > > > Alex > > > > > > > > > > > > On Jan 13, 2008 11:27 PM, pepone. onrez <[EMAIL PROTECTED] > > > > > wrote: > > > > > > > > > Sorry Alex i forget mention that i have setscan of in my last > > > > > test. > > > > > > > > > > now I have set seqscan on and indexscan on and added order by > > > > > _id > > > > > > > > > > The table has an index in the _id field > > > > > > > > > > CREATE INDEX i_documentcontent_document > > > > > ON t_documentcontent > > > > > USING btree > > > > > (_document); > > > > > > > > > > The database was rencently vacum analyze , but not vacun full > > > > > > > > > > here is the explain of 2 diferent queries , when i put a large > > > > > OFFSET > > > > > > > > > > EXPLAIN ANALYZE SELECT > > > > > t_documentcontent._id AS _id > > > > > FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 50000 > > > > > > > > > > "Limit (cost=137068.24..137068.36 rows=50 width=58) (actual time= > > > > > 41119.702..41119.792 rows=50 loops=1)" > > > > > " -> Sort (cost=136943.24..137320.26 rows=150807 width=58) > > > > > (actual time=41064.802..41100.424 rows=50050 loops=1)" > > > > > " Sort Key: _id" > > > > > " -> Seq Scan on t_documentcontent (cost= > > > > > 0.00..110772.07rows=150807 width=58) (actual time= > > > > > 106.679..33267.194 rows=150807 loops=1)" > > > > > "Total runtime: 41120.015 ms" > > > > > > > > > > EXPLAIN ANALYZE SELECT > > > > > t_documentcontent._id AS _id > > > > > FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 10000 > > > > > > > > > > "Limit (cost=39839.37..40038.56 rows=50 width=58) (actual time= > > > > > 1172.969..1194.228 rows=50 loops=1)" > > > > > " -> Index Scan using i_documentcontent_id on t_documentcontent > > > > > (cost=0.00..600805.54 rows=150807 width=58) (actual time= > > > > > 0.077..1189.688 rows=10050 loops=1)" > > > > > "Total runtime: 1194.316 ms" > > > > > > > > > > Tom > > > > > i using uuid for the _id field that is the primary key add a > > > > > WHERE id > ? don 't apply > > > > > the cursor aproach is also not suitable for same of my queries > > > > > > > > > > I use this query for paginate contents of a filesysstem with lots > > > > > of documents avoid offset is not posible always > > > > > > > > > > > > > > > > > >