On Mon, Dec 9, 2013 at 6:47 PM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
> On 12/09/2013 11:35 PM, Jim Nasby wrote:
>>
>> On 12/8/13 1:49 PM, Heikki Linnakangas wrote:
>>>
>>> On 12/08/2013 08:14 PM, Greg Stark wrote:
>>>>
>>>> The whole accounts table is 1.2GB and contains 10 million rows. As
>>>> expected with rows_per_block set to 1 it reads 240MB of that
>>>> containing nearly 2 million rows (and takes nearly 20s -- doing a full
>>>> table scan for select count(*) only takes about 5s):
>>>
>>>
>>> One simple thing we could do, without or in addition to changing the
>>> algorithm, is to issue posix_fadvise() calls for the blocks we're
>>> going to read. It should at least be possible to match the speed of a
>>> plain sequential scan that way.
>>
>>
>> Hrm... maybe it wouldn't be very hard to use async IO here either? I'm
>> thinking it wouldn't be very hard to do the stage 2 work in the callback
>> routine...
>
>
> Yeah, other than the fact we have no infrastructure to do asynchronous I/O
> anywhere in the backend. If we had that, then we could easily use it here. I
> doubt it would be much better than posix_fadvising the blocks, though.


Without patches to the kernel, it is much better.

posix_fadvise interferes with read-ahead, so posix_fadvise on, say,
bitmap heap scans (or similarly sorted analyze block samples) run at 1
IO / block, ie horrible, whereas aio can do read coalescence and
read-ahead when the kernel thinks it'll be profitable, significantly
increasing IOPS. I've seen everything from a 2x to 10x difference.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to