my-vegetable-has-exploded commented on PR #12754:
URL: https://github.com/apache/datafusion/pull/12754#issuecomment-2429057021

   > It's still _input_size^2_ complexity, however the _N^2_ step (step 3) is 
just looping through a bit array, it's way more efficient than do _N^2_ join 
condition evaluation if it's using NLJ, to make this implementation efficient.
   
   If use btreemap to maintain it, the complexity will be $N*logN + 
OutputSize$. And the bitmap[i..n] maybe sparse in many scenes, use bitmap with 
bloom filter(bloomfilter[i]=0 means bitmap[k*i..k*(i+1)] are all 0) can skip 
lots of useless value, and bitmap is very cache-friendly.
   
   > Is it possible to extract the second sorting outside `IEJoin` executor, 
and let the existing sort executor to do this? This step then can be more 
easily parallelized, or does `IEJoin` executor have a better way to parallelize 
its job?
   
   I don't know how to use extract the second sorting outside `IEJoin` executor 
though.  And currently parallelize mechanism is split the left table into n 
blocks and right table into m blocks (left table and right table already sort 
by condition1).  Then we have n*m part data to compute,  and we can check 
whether l[i] block and r[j] block can produce any result pair satisfying 
condition1 in O(1) complexity, if not result pair can be produced, we can just 
skip the block pair l[i] and r[j].  If you have a better idea, I'd love to hear 
about it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to