Thank you for all your comments.

I think the condition of this optimization is whether the small table can fit 
into memory. If not, then it doesn't work since two tables still need to be 
written to disk. But if yes, we can save all I/O costs in the hash join 
process. 

Thanks,
Li Jie

----- Original Message ----- 
From: "Robert Haas" <robertmh...@gmail.com>
To: "Simon Riggs" <si...@2ndquadrant.com>
Cc: "Jie Li" <jay23j...@gmail.com>; "pgsql-hackers" 
<pgsql-hackers@postgresql.org>
Sent: Wednesday, December 29, 2010 8:59 PM
Subject: Re: [HACKERS] small table left outer join big table


On Wed, Dec 29, 2010 at 7:34 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On Wed, 2010-12-29 at 07:17 -0500, Robert Haas wrote:
>> >
>> > Here I have a puzzle, why not choose the small table to build hash table? 
>> > It
>> > can avoid multiple batches thus save significant I/O cost, isn't it?
>>
>> Yeah, you'd think. Can you post a full reproducible test case?
>
> It's not a bug, that's the way it currently works. We don't need a test
> case for that.
>
> I agree that the optimisation would be a useful one.
>
> It allows you to ask the query "Show me sales for each of my stores"
> efficiently, rather than being forced to request the inner join query
> "Show me the sales for each of my stores for which there have been
> sales", which is a much less useful query.

Oh, you're right.  I missed the fact that it's a left join.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to