[GitHub] [incubator-doris] morrySnow commented on pull request #8695: [enhancement] update broadcast join cost algorithm

GitBox Mon, 28 Mar 2022 06:26:28 -0700


morrySnow commented on pull request #8695:
URL: https://github.com/apache/incubator-doris/pull/8695#issuecomment-1080649337



   > > > Why add a separate memory control to limit the broadcast memory, instead of
   > > > using mem limit uniformly?
   > > 
   > > 
   > > There are two reasons:
   > > 
   > > 1. Broadcast is not always faster than shuffle. The cost of building a hash
   > >    table over the FULL TABLE is not negligible when the broadcast table is large.
   > > 2. In BE, we allocate the hash table in the buffer pool, and it is not limited
   > >    by mem limit.
   > 
   > 1. Adding a new memory parameter will make it more difficult for users
   >    to understand and debug.
   >    I understand that broadcast is faster than shuffle in most cases. When
   >    shuffle is faster than broadcast, it is not directly related to the size of the
   >    hash table, but to the gap between the data sizes of the left and
   >    right tables.
   >    In that case, users can add a manual hint to specify the join method.
   
   Creating the hash table is expensive when the hash table has to expand. If we
   need an accurate cost model, it can't include only network overhead.
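
A rough sketch of that point in Python: a cost model that charges for hash table construction in addition to network transfer. The names (`TableStats`, `broadcast_cost`, `shuffle_cost`, `build_weight`) and the formulas are illustrative assumptions, not the planner code in this PR. It assumes broadcast sends the right table to every instance and each instance builds a hash table over the full right table, while shuffle moves both tables once and inserts each right-side row into one partition.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    rows: int        # estimated row count
    size_bytes: int  # estimated data size in bytes


def broadcast_cost(right, instances, build_weight=1.0):
    # Assumption: the right table is sent to every instance, and every
    # instance builds a hash table over the FULL right table.
    network = right.size_bytes * instances
    build = right.rows * instances
    return network + build_weight * build


def shuffle_cost(left, right, build_weight=1.0):
    # Assumption: both tables cross the network once, and each right-side
    # row is inserted into exactly one partition's hash table.
    network = left.size_bytes + right.size_bytes
    build = right.rows
    return network + build_weight * build


def choose_join(left, right, instances, build_weight=1.0):
    # Pick the distribution mode with the lower estimated cost.
    b = broadcast_cost(right, instances, build_weight)
    s = shuffle_cost(left, right, build_weight)
    return "broadcast" if b <= s else "shuffle"


left = TableStats(rows=1_000_000, size_bytes=10_000_000)
right = TableStats(rows=50_000, size_bytes=500_000)

# Network cost only: broadcast looks cheaper.
print(choose_join(left, right, instances=10, build_weight=0.0))
# Charging for hash table construction flips the decision.
print(choose_join(left, right, instances=10, build_weight=20.0))
```

With these made-up numbers, the network-only estimate picks broadcast, while a per-row build cost of 20 units picks shuffle. The weight itself is hypothetical and would have to be calibrated.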
   
   > 2. From what I see, the MemPool currently used by HashJoinNode allocates
   >    the memory of the HashTable, and the BufferPool is only used in the HashTable
   >    of the Partitioned Agg.
   > 
   > If the remaining 1G is to reserve memory for a query except for hash join,
   > we should try to estimate the memory consumption of all nodes in a fragment,
   > and complete it by collecting statistics.
   
   I will recheck it, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


