Re: Need help with HashAggregateExec, TungstenAggregationIterator and UnsafeFixedWidthAggregationMap

Russell Spitzer Fri, 07 Sep 2018 13:06:02 -0700

That's my understanding :) doExecute is for the non-codegen path, while
doProduce and doConsume are for generating code.
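
To make the distinction concrete, here is a toy model in plain Java. It is only a
sketch and not Spark's code: the interfaces Operator, RowConsumer and Codegen, the
operators Range and FilterEven, and the helper methods are hypothetical names
made up for illustration. Only the method names doExecute, doProduce and
doConsume are borrowed from the thread, and their signatures here are
simplified assumptions. The interpreted path pulls rows through iterators,
while the codegen path only emits source text for one fused loop.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ExecutionPathsSketch {

    /** Hypothetical operator with an interpreted path only (not Spark's SparkPlan). */
    interface Operator {
        Iterator<Integer> doExecute();
    }

    /** Hypothetical hook: emits the code that handles one row held in variable `row`. */
    interface RowConsumer {
        String doConsume(String row);
    }

    /** Hypothetical operator that can also emit source code for itself. */
    interface Codegen extends Operator, RowConsumer {
        String doProduce(RowConsumer parent);
    }

    /** Leaf operator: the integers 0 until n. */
    static final class Range implements Codegen {
        private final int n;
        Range(int n) { this.n = n; }

        public Iterator<Integer> doExecute() {
            List<Integer> rows = new ArrayList<>();
            for (int i = 0; i < n; i++) rows.add(i);
            return rows.iterator();
        }
        public String doProduce(RowConsumer parent) {
            // Emit the driving loop and let the parent fill in the loop body.
            return "for (int i = 0; i < " + n + "; i++) { " + parent.doConsume("i") + " }";
        }
        public String doConsume(String row) {
            throw new UnsupportedOperationException("a leaf has no child to consume from");
        }
    }

    /** Keeps even values only. */
    static final class FilterEven implements Codegen {
        private final Codegen child;
        private RowConsumer parent; // remembered while code is being produced
        FilterEven(Codegen child) { this.child = child; }

        public Iterator<Integer> doExecute() {
            List<Integer> kept = new ArrayList<>();
            Iterator<Integer> rows = child.doExecute();
            while (rows.hasNext()) {
                int v = rows.next();
                if (v % 2 == 0) kept.add(v);
            }
            return kept.iterator();
        }
        public String doProduce(RowConsumer parent) {
            this.parent = parent;
            return child.doProduce(this); // rows originate in the child
        }
        public String doConsume(String row) {
            return "if (" + row + " % 2 == 0) { " + parent.doConsume(row) + " }";
        }
    }

    /** Interpreted path: run the operator tree through doExecute. */
    public static List<Integer> interpreted(int n) {
        List<Integer> out = new ArrayList<>();
        new FilterEven(new Range(n)).doExecute().forEachRemaining(out::add);
        return out;
    }

    /** Codegen path: no rows flow here, only source text is assembled. */
    public static String generatedSource(int n) {
        return new FilterEven(new Range(n)).doProduce(row -> "out.add(" + row + ");");
    }

    public static void main(String[] args) {
        System.out.println(interpreted(6));
        System.out.println(generatedSource(6));
    }
}
```

In this toy, the lambda passed to doProduce plays the role of the stage root
that collects the rows. In Spark itself the generated source would then be
compiled and run, which the sketch leaves out.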


On Fri, Sep 7, 2018 at 2:59 PM Jacek Laskowski <ja...@japila.pl> wrote:

> Hi Devs,
>
> Sorry for bothering you with my questions (and concerns), but I really
> need to understand this piece of code (= my personal challenge :))
>
> Is it true that SparkPlan.doExecute (to "execute" a physical operator)
> is only used when whole-stage code gen is disabled (which it is not by
> default)? May I call this execution path traditional (even "old-fashioned")?
>
> Is it true that these days CodegenSupport.doProduce and
> CodegenSupport.doConsume (and others) are used for "executing" a physical
> operator (i.e. to generate the Java source code), since whole-stage code
> generation is enabled by default and is currently the proper execution path?
>
> p.s. SparkPlan.doExecute is still used by WholeStageCodegenExec (and
> InputAdapter) to trigger whole-stage code gen, but that's all the code that
> doExecute runs in that case, isn't it?
>
> Regards,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> Mastering Spark SQL https://bit.ly/mastering-spark-sql
> Spark Structured Streaming https://bit.ly/spark-structured-streaming
> Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Fri, Sep 7, 2018 at 7:24 PM Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi Spark Devs,
>>
>> I really need your help understanding the relationship
>> between HashAggregateExec, TungstenAggregationIterator and
>> UnsafeFixedWidthAggregationMap.
>>
>> While exploring UnsafeFixedWidthAggregationMap and how it's used I've
>> noticed that it's for HashAggregateExec and TungstenAggregationIterator
>> exclusively. And given that TungstenAggregationIterator is used exclusively
>> in HashAggregateExec and the use of UnsafeFixedWidthAggregationMap in both
>> seems to be almost the same (if not the same), I've got a question I cannot
>> seem to answer myself.
>>
>> Since HashAggregateExec supports Whole-Stage Codegen,
>> HashAggregateExec.doExecute won't be used at all; doConsume and
>> doProduce are used instead (unless codegen is disabled). Is that correct?
>>
>> If so, TungstenAggregationIterator is not used at all, but
>> UnsafeFixedWidthAggregationMap is used directly instead (in the Java code
>> that uses createHashMap or finishAggregate). Is that correct?
>>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://about.me/JacekLaskowski
>> Mastering Spark SQL https://bit.ly/mastering-spark-sql
>> Spark Structured Streaming https://bit.ly/spark-structured-streaming
>> Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
>> Follow me at https://twitter.com/jaceklaskowski
>>
>
