Re: Udf Performance and Object Creation

2015-08-14 Thread Stephan Ewen
Yes, map() is like a convenience function around mapPartition(). On Fri, Aug 14, 2015 at 6:09 PM, Flavio Pompermaier wrote: > Hi Stephan thanks for the reply! > Now it's more clear..if I understood correctly map and mapPartition are > the same iff I have only one slot per task manager, right? >
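To illustrate the relationship Stephan describes, here is a minimal sketch in plain Java with no Flink dependency. It shows how a per-record function can be wrapped into a partition-level one. The names `PartitionFunction` and `asPartitionFunction` are hypothetical and exist only for this illustration; this is not how Flink implements map() internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class MapViaMapPartition {

    // Hypothetical stand-in for a partition-level function:
    // invoked once per partition with all of that partition's records.
    interface PartitionFunction<I, O> {
        void mapPartition(Iterable<I> values, List<O> out);
    }

    // Wraps a per-record function into a partition-level function
    // by looping over the records and emitting one result per record.
    static <I, O> PartitionFunction<I, O> asPartitionFunction(Function<I, O> perRecord) {
        return (values, out) -> {
            for (I value : values) {
                out.add(perRecord.apply(value));
            }
        };
    }

    public static void main(String[] args) {
        PartitionFunction<Integer, Integer> doubler =
                MapViaMapPartition.<Integer, Integer>asPartitionFunction(x -> x * 2);
        List<Integer> out = new ArrayList<>();
        doubler.mapPartition(Arrays.asList(1, 2, 3), out);
        System.out.println(out); // prints [2, 4, 6]
    }
}
```

In this sketch the wrapper function object is created once and then called for every record of the partition, which is the sense in which the two styles are close to each other.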

Re: Udf Performance and Object Creation

2015-08-14 Thread Flavio Pompermaier
Hi Stephan, thanks for the reply! Now it's clearer. If I understood correctly, map and mapPartition are the same iff I have only one slot per task manager, right? I was convinced I had posted those questions in this thread as the 3rd or 4th message, hadn't I? On 14 Aug 2015 17:57, "Stephan Ewen" wro

Re: Udf Performance and Object Creation

2015-08-14 Thread Fabian Hueske
Oh, sorry, Flavio! I didn't see Hawin's questions :-( Thanks, Stephan, for picking this up! 2015-08-14 17:43 GMT+02:00 Flavio Pompermaier : > Any insight about these 2 questions..? > On 12 Aug 2015 17:38, "Flavio Pompermaier" wrote: > >> This is something I've never understood in depth: isn't a mapper cr

Re: Udf Performance and Object Creation

2015-08-14 Thread Stephan Ewen
Hi! (1) A mapper is created once per parallel task. So if you create a program that runs a map() transformation with a parallelism of n, you will have n mapper instances in the cluster. Some may be on the same TaskManager, if the TaskManager has multiple slots. (2) I would really like that. But i
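Point (1) can be pictured with a small plain-Java simulation, again without any Flink dependency. It runs the "parallel tasks" sequentially and counts how many mapper objects get constructed. `CountingMapper` and `run` are hypothetical names for this sketch, and each inner list stands in for the records handled by one parallel task.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class MapperInstances {

    // Counts how many mapper objects have been constructed.
    static final AtomicInteger INSTANCES = new AtomicInteger();

    // Hypothetical mapper: one object handles many records.
    static class CountingMapper {
        int recordsSeen = 0;

        CountingMapper() {
            INSTANCES.incrementAndGet();
        }

        String map(String value) {
            recordsSeen++;
            return value.toUpperCase();
        }
    }

    // Simulates one task per partition; returns the number of mapper instances created.
    static int run(List<List<String>> partitions) {
        INSTANCES.set(0);
        for (List<String> partition : partitions) {
            CountingMapper mapper = new CountingMapper(); // created once per task
            for (String record : partition) {
                mapper.map(record);                       // reused for every record
            }
        }
        return INSTANCES.get();
    }

    public static void main(String[] args) {
        // Six records split across a parallelism of 2.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d", "e", "f"));
        System.out.println(run(partitions)); // prints 2
    }
}
```

The count depends on the parallelism, not on the number of records, and not on how the tasks happen to be distributed over TaskManager slots.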

Re: Udf Performance and Object Creation

2015-08-14 Thread Fabian Hueske
I think Timo answered both questions (quoting Michael: "Hey Timo, yes that is what I needed to know. Thanks"). Maybe one more comment. The motivation of the examples is not the best performance but to showcase Flink's APIs and concepts. Best, Fabian 2015-08-14 17:43 GMT+02:00 Flavio Pompermaier

Re: Udf Performance and Object Creation

2015-08-14 Thread Flavio Pompermaier
Any insight about these 2 questions..? On 12 Aug 2015 17:38, "Flavio Pompermaier" wrote: > This is something I've never understood in depth: isn't a mapper created > for each record? If it's created only once per task manager then it's not so > different from mapPartition. What am I missing here? >