Thanks Russ! That helps a lot. On the other hand, it makes reviewing the codebase of Spark SQL slightly harder, since Java code generation is so much about string concatenation :(
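For what it's worth, the string-concatenation style is easy to illustrate with a toy sketch. This is decidedly NOT Spark's actual API -- `ToyCodegen`, `Op`, `FilterOp` and `EmitOp` are all made up -- but it mimics the shape of SparkPlan.doConsume: each operator splices its child's code into a Java-source string, which is why the generated pipeline is hard to review in one place.

```scala
// Toy sketch of string-based code generation -- NOT Spark's API.
// Each operator's consume() splices its child's code into a string,
// loosely mimicking the shape of SparkPlan.doConsume.
object ToyCodegen {
  trait Op { def consume(input: String): String }

  // Hypothetical filter operator: wraps the downstream code in an `if`.
  final case class FilterOp(condition: String, child: Op) extends Op {
    def consume(input: String): String =
      s"if ($condition) {\n  ${child.consume(input)}\n}"
  }

  // Hypothetical sink operator: "emits" the current row.
  case object EmitOp extends Op {
    def consume(input: String): String = s"append($input);"
  }
}
```

For example, `ToyCodegen.FilterOp("row > 0", ToyCodegen.EmitOp).consume("row")` builds the string `if (row > 0) {\n  append(row);\n}` -- plain Java source assembled by concatenation, just like the real doConsume chain, only vastly simplified.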
p.s. Should all the code in doExecute be considered and marked @deprecated?

Pozdrawiam,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski


On Fri, Sep 7, 2018 at 10:05 PM Russell Spitzer <russell.spit...@gmail.com> wrote:

> That's my understanding :) doExecute is for non-codegen, while doProduce
> and doConsume are for generating code.
>
> On Fri, Sep 7, 2018 at 2:59 PM Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi Devs,
>>
>> Sorry for bothering you with my questions (and concerns), but I really
>> need to understand this piece of code (= my personal challenge :))
>>
>> Is it true that SparkPlan.doExecute (to "execute" a physical operator)
>> is only used when whole-stage code generation is disabled (which it is
>> not, by default)? May I call this execution path traditional (even
>> "old-fashioned")?
>>
>> Is it true that these days SparkPlan.doProduce and SparkPlan.doConsume
>> (and others) are used for "executing" a physical operator (i.e. to
>> generate the Java source code), since whole-stage code generation is
>> enabled by default and is currently the proper execution path?
>>
>> p.s. This SparkPlan.doExecute is used to trigger whole-stage code gen
>> by WholeStageCodegenExec (and InputAdapter), but that's all the code that
>> is to be executed by doExecute, isn't it?
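The default-on/default-off distinction discussed above can be checked from the Spark shell. A minimal sketch, assuming a Spark 2.x spark-shell with its pre-built `spark` session (not runnable standalone; the query itself is just a throwaway example):

```scala
// spark-shell sketch (Spark 2.x): relies on the shell's `spark` session.
// With whole-stage codegen on (the default), the physical plan contains
// WholeStageCodegen stages, i.e. the doProduce/doConsume path.
val q = spark.range(10).selectExpr("id * 2 AS doubled")
q.explain()  // codegen'd stages are marked with an asterisk in the plan

// Turn whole-stage codegen off to force the "old-fashioned" doExecute path.
spark.conf.set("spark.sql.codegen.wholeStage", false)
q.explain()  // the same plan, now without WholeStageCodegen stages
```

The config key `spark.sql.codegen.wholeStage` is the switch referred to in the thread as "unless codegen is disabled".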
>>
>> Pozdrawiam,
>> Jacek Laskowski
>>
>>
>> On Fri, Sep 7, 2018 at 7:24 PM Jacek Laskowski <ja...@japila.pl> wrote:
>>
>>> Hi Spark Devs,
>>>
>>> I really need your help understanding the relationship
>>> between HashAggregateExec, TungstenAggregationIterator and
>>> UnsafeFixedWidthAggregationMap.
>>>
>>> While exploring UnsafeFixedWidthAggregationMap and how it's used, I've
>>> noticed that it's for HashAggregateExec and TungstenAggregationIterator
>>> exclusively. And given that TungstenAggregationIterator is used
>>> exclusively in HashAggregateExec, and the use of
>>> UnsafeFixedWidthAggregationMap in both seems to be almost the same (if
>>> not the same), I've got a question I cannot seem to answer myself.
>>>
>>> Since HashAggregateExec supports whole-stage codegen,
>>> HashAggregateExec.doExecute won't be used at all, but doConsume and
>>> doProduce will (unless codegen is disabled). Is that correct?
>>>
>>> If so, TungstenAggregationIterator is not used at all, but
>>> UnsafeFixedWidthAggregationMap is used directly instead (in the
>>> generated Java code that uses createHashMap or finishAggregate). Is
>>> that correct?
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
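The aggregation question above can also be checked empirically. Another hedged spark-shell sketch (Spark 2.x, relies on the shell's `spark` session; the query is an arbitrary example); it uses the `org.apache.spark.sql.execution.debug` package to print the generated Java source:

```scala
// spark-shell sketch: inspect the Java code generated for a hash aggregation.
import org.apache.spark.sql.execution.debug._
import org.apache.spark.sql.functions.col

val agg = spark.range(100).groupBy(col("id") % 10).count()

// With whole-stage codegen on, the printed Java source calls
// createHashMap() / finishAggregate(...) on the HashAggregateExec operator
// (which work against UnsafeFixedWidthAggregationMap directly);
// TungstenAggregationIterator does not appear in the generated code.
agg.debugCodegen()

// With codegen off, doExecute runs and TungstenAggregationIterator is used.
spark.conf.set("spark.sql.codegen.wholeStage", false)
agg.explain()
```

If the generated source indeed only references createHashMap/finishAggregate, that supports the reading that TungstenAggregationIterator is bypassed on the codegen path.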