On Wed, Aug 12, 2020 at 8:06 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> Ayub M <hia...@gmail.com> writes:
> > This is PostgreSQL 11 on AWS, there is a mview query in this OLAP
> database,
> > the tables involved are huge - 50-100m records on average records
> hundreds
> > of columns in most cases. The query runs for a while and then errors out
> > saying "No space left on device". I could see it generating around 500gb
> of
> > temp file data. At times it goes thru and at times it fails - probably
> due
> > to other queries running at the same time and causing failure.
>
> Are you sure that these queries are actually producing the answers you
> want?  It sounds suspiciously like you are computing underconstrained
> joins.
> --> Yes, it is as per the business requirement.
>


> > The joins are happening on around 10 tables and all are joining on the PK
> > and FK columns. I see partition pruning happening but the hash joins are
> > killing the query.
> > Is there any way to avoid hash joins?
>
> TBH, you are asking the wrong question.  A merge join would take about as
> much temporary space, and a nestloop join over so much data would probably
> not finish in an amount of time you're willing to wait.  Indexes are NOT
> a magic solution here.  What you need to be thinking about is how to not
> need to process so much data.
>
> If you really need to have this proven to you, you can try "set
> enable_hashjoin = off", but I don't think you'll find that better.
> --> You are right, neither merge join nor nested loop are resolving the
> issue.
>


>                         regards, tom lane
>


-- 
Regards,
Ayub

Reply via email to