Re: [HACKERS] index-only scans

Cédric Villemain Thu, 11 Aug 2011 14:39:54 -0700

2011/8/11 Robert Haas <robertmh...@gmail.com>:
> Please find attached a patch implementing a basic version of
> index-only scans.  This patch is the work of my colleague Ibrar Ahmed
> and myself, and also incorporates some code from previous patches
> posted by Heikki Linnakanagas.


Great!.

>
> I'm able to demonstrate a significant performance benefit from this
> patch, even in a situation where it doesn't save any disk I/O, due to
> reduced thrashing of shared_buffers.  Configuration settings:
>
> max_connections = 100
> shared_buffers = 400MB
> maintenance_work_mem = 256MB
> synchronous_commit = off
> checkpoint_segments = 100
> checkpoint_timeout = 30min
> checkpoint_completion_target = 0.9
> checkpoint_warning = 90s
> seq_page_cost = 1.0
> random_page_cost = 1.0
> effective_cache_size = 3GB
>
> Test setup:
>
> pgbench -i -s 50
> create table sample_data as select (random()*5000000)::int as aid,
> repeat('a', 1000) as filler from generate_series(1,100000);
>
> Test queries:
>
> select sum(aid) from sample_data a1 where exists (select * from
> pgbench_accounts a where a.aid = a1.aid and a.aid != 1234567);
> select sum(aid) from sample_data a1 where exists (select * from
> pgbench_accounts a where a.aid = a1.aid and a.bid != 1234567);
>
> On my laptop, the first query executes in about 555 ms, while the
> second one takes about 1125 ms.  Inspection via pg_buffercache reveals
> that the second one thrashes shared_buffers something fierce, while
> the first one does not.  You can actually get the time for the first
> query down to about 450 ms if you can persuade PostgreSQL to cache the
> entire sample_data table - which is difficult, due the
> BufferAccessStrategy stuff - and as soon as you run the second query,
> it blows the table out of cache, so practically speaking you're not
> going to get that faster time very often.  I expect that you could get
> an even larger benefit from this type of query if you could avoid
> actual disk I/O, rather than just buffer cache thrashing, but I
> haven't come up with a suitable test cases for that yet (volunteers?).
>
> There are several things about this patch that probably need further
> thought and work, or at least some discussion.
>
> 1. The way that nodeIndexscan.c builds up the faux heap tuple is
> perhaps susceptible to improvement.  I thought about building a
> virtual tuple, but then what do I do with an OID column, if I have
> one?  Or maybe this should be done some other way altogether.

Can this faux heap tuple be appended by the data from another index
once it has been created ? ( if the query involves those 2 index)

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] index-only scans

Reply via email to