Sorry, I am confused now... My UDF gets executed for each row anyway
(because I am applying it to a column and want it to run on each row).
Is the difference that with the "ConvertToLocalRelation" optimization it
gets executed for each row on the driver, during the optimization stage?
On Fri, Jun 8
I see. Thanks for the clarification. It's not a big issue, but I am
surprised my UDF can be executed in the planning phase. If my UDF is doing
something expensive, it could get weird.
On Fri, Jun 8, 2018 at 3:44 PM, Reynold Xin wrote:
But from the user's perspective, optimization is not run, right? So it is
still lazy.
On Fri, Jun 8, 2018 at 12:35 PM Li Jin wrote:
Hi All,
Sorry for the long email title. I am a bit surprised to find that the
current optimizer rule "ConvertToLocalRelation" causes expressions to be
eagerly evaluated in the planning phase. This can be demonstrated with the
following code:
scala> val myUDF = udf((x: String) => { println("UDF evaled")
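For readers who want to try this themselves, here is a minimal standalone sketch of the kind of check described above. It is not the original snippet (which is cut off in this archive); the names EagerUdfSketch and tracedUDF are made up for illustration, and it assumes a Spark version in which "ConvertToLocalRelation" applies to a projection over a local relation. Behavior may differ across versions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// Hypothetical example object; assumes spark-sql is on the classpath.
object EagerUdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("eager-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative UDF with a visible side effect, so any evaluation shows up on stdout.
    val tracedUDF = udf((x: String) => { println(s"UDF called with: $x"); x.toUpperCase })

    // A DataFrame built from a local Seq is backed by a LocalRelation.
    val df = Seq("a", "b").toDF("s").select(tracedUDF($"s").as("upper"))

    println("DataFrame defined; no action has been called.")

    // Forcing the optimized plan runs the optimizer rules on the driver.
    // No collect() or show() is involved at this point.
    println(df.queryExecution.optimizedPlan)

    spark.stop()
  }
}
```

If the "UDF called with" lines show up before the optimized plan is printed, the UDF was evaluated on the driver during planning rather than by an action. Nothing here calls an action, so from the user's side the DataFrame still looks lazy unless the plan is forced explicitly.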
The vote passes. Thanks to all who helped with the release!
I'll follow up later with a release announcement once everything is published.
+1 (* = binding):
- Marcelo Vanzin *
- Reynold Xin *
- Sean Owen *
- Denny Lee
- Dongjoon Hyun
- Ricardo Almeida
- Hyukjin Kwon
- John Zhuge
- Mark Hamstra *