On Wed, Aug 12, 2020 at 8:06 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > Ayub M <hia...@gmail.com> writes: > > This is PostgreSQL 11 on AWS, there is a mview query in this OLAP > database, > > the tables involved are huge - 50-100m records on average records > hundreds > > of columns in most cases. The query runs for a while and then errors out > > saying "No space left on device". I could see it generating around 500gb > of > > temp file data. At times it goes thru and at times it fails - probably > due > > to other queries running at the same time and causing failure. > > Are you sure that these queries are actually producing the answers you > want? It sounds suspiciously like you are computing underconstrained > joins. > --> Yes, it is as per the business requirement. >
> > The joins are happening on around 10 tables and all are joining on the PK > > and FK columns. I see partition pruning happening but the hash joins are > > killing the query. > > Is there any way to avoid hash joins? > > TBH, you are asking the wrong question. A merge join would take about as > much temporary space, and a nestloop join over so much data would probably > not finish in an amount of time you're willing to wait. Indexes are NOT > a magic solution here. What you need to be thinking about is how to not > need to process so much data. > > If you really need to have this proven to you, you can try "set > enable_hashjoin = off", but I don't think you'll find that better. > --> You are right, neither merge join nor nested loop are resolving the > issue. > > regards, tom lane > -- Regards, Ayub